Autocorrelation > Moran’s I

## What is Moran’s I?

Moran’s I is a correlation coefficient that measures the overall spatial autocorrelation of your data set. In other words, it measures how one object is similar to others surrounding it. If objects are attracted (or repelled) by each other, it means that the observations are not independent. This violates a basic assumption of statistics — independence of data. In other words, the presence of autocorrelation renders most statistical tests invalid, so its important to test for it. Moran’s I is one way to test for autocorrelation.

Spatial autocorrelation is multi-directional and multi-dimensional, making it useful for finding patterns in complicated data sets. It is similar to correlation coefficients, it has a value from -1 to 1. However, while other coefficients measure perfect correlation to no correlation, Moran’s is slightly different (due to the more complex, spatial calculations):

- -1 is perfect clustering of dissimilar values (you can also think of this as perfect dispersion).
- 0 is no autocorrelation (perfect randomness.)
- +1 indicates perfect clustering of similar values (it’s the opposite of dispersion).

## Perfect Dispersion

In the above image, the black and white squares have a definite pattern and are perfectly dispersed. The Moran’s value would equal -1.

## Perfect Randomness

If the squares were truly randomly dispersed, the Moran’s value would be 0.

## Perfect Clustering

This image shows that the white squares and black squares occupy one half of the area, either side of the center. This is perfect clustering of similar values, which gives a Moran value of +1.

## Evaluating the Index

Moran’s I is unlike most other correlation coefficients in that you **can’t take the index at face value.** It is an inferential statistic, and you have to determine statistical significance before you can read the result. This is done with a simple hypothesis test, calculating a z-score and its associated p-value.

- The null hypothesis for the test is that the data is randomly disbursed.
- The alternate hypothesis is that the data is more spatially clustered than you would expect by chance alone. Two possible scenarios are:
- A positive z-value: data is spatially clustered in some way.
- A negative z-value: data is clustered in a competitive way. For example, high values may be repelling high values or negative values may be repelling negative values.

## Calculations

Calculations for Moran’s I are based on a weighted matrix, with units i and j. Similarities between units is calculated as the product of the differences between y_{i} and y_{j} with the overall mean.

- Similarity = (y
_{i}– ̄y)(y_{j}– ̄y), where ̄y = Σ^{n}_{i=1}y_{i}/n - The Moran’s statistic is calculated using the basic form, which is divided by the sample variance:s
^{2}= (Σ(y_{i}– ̄y)^{2})/n).

If you prefer an online interactive environment to learn R and statistics, this free R Tutorial by Datacamp is a great way to get started. If you're are somewhat comfortable with R and are interested in going deeper into Statistics, try this Statistics with R track.

Comments are now closed for this post. Need help or want to post a correction? Please post a comment on our Facebook page and I'll do my best to help!
Can you calculate the value of Moran;s index without using a computer program like ArcGIS? Please give me an example that calculates manually Moran’s index and the test of hypothesis. OK?

I’m not sure why you would want to calculate it manually. Depending on how many values you’re plugging into the formula (which you can find at the end of the article), you’ll be looking at hours and hours of arithmetic. I would start by reading Moran’s original article: “Notes on Continuous Stochastic Phenomena”. Biometrika. 37 (1): 17–23. You can read the entire paper for free on JSTOR.