K-Function (Ripley’s): Definition

Overview of Ripley’s K-Function

The K-function (also called Ripley’s K-function) is a way to analyze completely mapped event location data (i.e. the locations of all events in a study area).

The function is generally used to find out whether the phenomenon of interest (e.g. trees) is clustered, dispersed, or randomly distributed throughout the area of interest. Other uses include:

  • Estimating parameters. A parameter is the actual figure from the whole population.
  • Fitting a model to the data (e.g. a point process model that describes clustering).
  • Performing a hypothesis test about the pattern (e.g. testing whether the points are randomly distributed).
  • Summarizing a pattern of points.

In general, the K-function is a summary of how observed distances between similar events are clustered or dispersed. If the events are clustered, it tells you the distance where the clustering occurs. It was originally developed by Ripley in 1976.

Comparing the observed value of the function with the value expected for a random pattern gives you the following information:

  • If the observed value is higher than the expected value, the events tend to cluster.
  • If the observed value is lower than the expected value, the events are dispersed.

Formal Definition of Ripley’s K-Function

The K-function is denoted as:

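$K(t) = \lambda^{-1} E$,

Where:

  • $\lambda$ = the density of events (the number of events per unit area),
  • $E$ = the expected number of additional events within distance t of a randomly chosen event.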
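
As a worked example of this expression: for a completely random pattern (a homogeneous Poisson process, brought in here as the usual benchmark rather than something stated above) and ignoring edge effects, the expected number of additional events within distance t of an event is the density times the area of a circle of radius t. That gives:

```latex
E = \lambda \pi t^{2}
\quad\Longrightarrow\quad
K(t) = \lambda^{-1} E = \pi t^{2}
```

This is the "expected" value that an observed value of the function is typically compared against.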

Formally, the K-Function is defined as:
[Equation image not reproduced in this text: Ripley's K-function]

Where:

  • R = region of interest,
  • s = distance (written as t in the expression above),
  • $d_{ij}$ = distance between the ith and jth events,
  • $I_s(d_{ij})$ = an indicator function. If $d_{ij} \le s$, it equals 1. It equals zero for all other values.
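
One widely used estimator that matches the symbols listed above is sketched below. It is not necessarily the exact expression from the original figure: the number of events n, the area |R| of the region, and the density estimate $\hat{\lambda} = n / |R|$ are notation assumed here, and no edge correction is applied.

```latex
\hat{K}(s)
  = \hat{\lambda}^{-1} \, \frac{1}{n} \sum_{i=1}^{n} \sum_{j \neq i} I_s(d_{ij})
  = \frac{|R|}{n^{2}} \sum_{i=1}^{n} \sum_{j \neq i} I_s(d_{ij}),
\qquad
\hat{\lambda} = \frac{n}{|R|}
```

The inner double sum divided by n is the average number of other events within distance s of an event, which plays the role of E in the shorter expression.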

Multiplied by the density λ, the function gives the expected number of additional events within distance s of a randomly chosen event.
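
To make the definition concrete, here is a minimal Python sketch of a naive estimate of the function and of its value under a completely random pattern. The function names `ripley_k_naive` and `csr_k`, the sample coordinates, and the 10 × 10 study region are all hypothetical, chosen only for illustration. The sketch applies no edge correction, so with real data it would tend to underestimate the function near the boundary of the region.

```python
import math


def ripley_k_naive(points, s, area):
    """Naive estimate of K(s): area / n^2 times the number of ordered
    pairs (i, j), i != j, whose Euclidean distance is at most s.
    No edge correction is applied (an assumption of this sketch)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    pairs_within = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            # Indicator function: counts 1 when d_ij <= s, otherwise 0.
            if i != j and math.hypot(xi - xj, yi - yj) <= s:
                pairs_within += 1
    return area * pairs_within / (n * n)


def csr_k(s):
    """Expected K(s) for a completely random pattern: pi * s^2."""
    return math.pi * s * s


# Hypothetical data in a 10 x 10 region (area 100).
area = 100.0
clustered = [(5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
dispersed = [(2.5, 2.5), (7.5, 2.5), (2.5, 7.5), (7.5, 7.5)]

k_clustered = ripley_k_naive(clustered, 0.5, area)  # far above csr_k(0.5)
k_dispersed = ripley_k_naive(dispersed, 1.0, area)  # below csr_k(1.0)

print(k_clustered, csr_k(0.5))
print(k_dispersed, csr_k(1.0))
```

In the tightly packed set, the estimate is far above the random benchmark, which points to clustering at that distance. In the evenly spaced set, no pairs fall within the distance, so the estimate is below the benchmark, which points to dispersion.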

Limitation

Although useful, the function has one very limiting assumption: events are assumed to lie on a continuous Euclidean plane, with the distances between them measured as straight lines. This could be problematic if you’re analyzing data that isn’t on a Euclidean plane.

As an example, let’s say you’re analyzing traffic flow for a busy New York intersection. The points here might be cars, and within the intersection the straight-line (Euclidean) distances between them are meaningful. However, let’s say your study extended to several city blocks. A car doesn’t travel straight to the next block through buildings; it takes the only route available, along streets. Those distances are non-Euclidean (i.e. they belong to a type of geometry called taxicab geometry).

Figure: Euclidean route vs. taxicab route. The K-function is problematic for analyzing non-Euclidean distances, like the taxicab route (shown in black). The orange line is the Euclidean distance (i.e. the straight path from A to B).
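
The gap between the two kinds of distance can be illustrated with a short Python sketch. It assumes a rectangular street grid aligned with the coordinate axes; the function names and the coordinates of A and B are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def taxicab_distance(a, b):
    """Distance along an axis-aligned street grid (taxicab geometry)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


# Hypothetical locations, measured in city blocks.
A = (0, 0)
B = (3, 4)

print(euclidean_distance(A, B))  # 5.0
print(taxicab_distance(A, B))    # 7
```

Two cars that look close in straight-line terms can be considerably farther apart along the streets, so a count of events "within distance s" depends on which distance is used.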

References

Dixon, P. (2002). Ripley’s K-Function. In Encyclopedia of Environmetrics. Volume 3, pp 1796-1803. Retrieved November 28 from: https://www3.nd.edu/~mhaenggi/ee87021/Dixon-K-Function.pdf
Gelfand, A. et al. (2010). Handbook of Spatial Statistics.
Popovich, V. et al. (2015). Information Fusion and Geographic Information Systems (IF&GIS’ 2015): Deep Virtualization for Mobile GIS. Springer.
Ripley, B.D. (1976). The second-order analysis of stationary point processes. Journal of Applied Probability, 13, 255–266.

