## What is Aggregation Bias?

In ecological studies, **aggregation bias is the expected difference between effects for the group and effects for the individual**, *if* there is no confounding. If there *is* confounding, then the difference between group and individual effects is a combination of confounding and aggregation bias. Aggregation bias leads to the “**ecological fallacy**”: the conclusion that what is true for the group must be true for the sub-group or individual. It’s called *aggregation* bias because you’re using **aggregated data** and extrapolating it inappropriately.

For example, you might have data showing that inner-city students tend to perform poorly on standardized tests. **That doesn’t mean any one individual will perform poorly**. Likewise, you might show that one particular state has a lower-than-average per-capita income. You can’t say for sure that every county in that state has a lower-than-average income. And you definitely can’t say that every person in the state has a low income.
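The state-and-county point can be checked with a few lines of Python. This is only a minimal sketch: the county names, the income figures, and the national benchmark are all made up for illustration, and the counties are assumed to have equal populations so that a simple mean stands in for the state figure.

```python
# Hypothetical per-capita incomes (thousands of dollars) for the counties of one state.
# All numbers are invented; equal county populations are assumed.
county_income = {
    "County A": 38,
    "County B": 41,
    "County C": 67,
    "County D": 35,
    "County E": 44,
}
national_average = 50  # hypothetical benchmark

# Aggregated (state-level) figure.
state_average = sum(county_income.values()) / len(county_income)

# Disaggregated view: which counties sit above the benchmark?
above = [name for name, income in county_income.items() if income > national_average]

print(f"State average: {state_average}")
print(f"State is below the national average: {state_average < national_average}")
print(f"Counties above the national average: {above}")
```

Here the state as a whole falls below the benchmark, yet one county sits well above it, so the group-level statement does not carry over to every sub-group.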

Luloff and Greenwood (1980) found that increased aggregation causes unpredictable results:

- The coefficient of determination, R², sometimes falls, sometimes increases, and sometimes remains constant.
- Coefficients switch signs and magnitudes. In one case, the directional switch retained significance.
- Statistical significance is lost in some cases.
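To see how a coefficient can switch sign and R² can change under aggregation, here is a small sketch in Python. The data are invented for illustration and are not taken from Luloff and Greenwood; the helper `ols_slope_and_r2` is a hypothetical name for a plain least-squares fit of y on x.

```python
def ols_slope_and_r2(points):
    """Return (slope, R^2) of a simple least-squares fit of y on x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / sxx, sxy ** 2 / (sxx * syy)

# Hypothetical individual-level (x, y) observations in three groups.
groups = {
    "A": [(1, 5), (2, 4), (3, 3)],
    "B": [(4, 8), (5, 7), (6, 6)],
    "C": [(7, 11), (8, 10), (9, 9)],
}

# Individual level, fitted separately inside each group.
within = {name: ols_slope_and_r2(pts) for name, pts in groups.items()}

# Individual level, all observations pooled together.
pooled = ols_slope_and_r2([p for pts in groups.values() for p in pts])

# Aggregated level: one (mean x, mean y) point per group.
group_means = [
    (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
    for pts in groups.values()
]
aggregated = ols_slope_and_r2(group_means)

for name, (slope, r2) in within.items():
    print(f"Within group {name}: slope={slope:.2f}, R^2={r2:.2f}")
print(f"Pooled individuals: slope={pooled[0]:.2f}, R^2={pooled[1]:.2f}")
print(f"Group means:        slope={aggregated[0]:.2f}, R^2={aggregated[1]:.2f}")
```

In this toy data set the slope is negative inside every group but positive across the group means, and R² rises when the pooled individual data are replaced by group means. With other data the change could go the other way, which is the unpredictability the list above describes.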

## Example from Research

Perhaps the most famous example of an ecological fallacy is Durkheim’s 1897 study, which inferred that Protestants were more likely to commit suicide, based on data showing that countries with larger Protestant populations had higher suicide rates than countries with larger Catholic populations. The study failed to take confounding variables into account, such as the fact that Protestant countries differed in many ways from Catholic countries. In addition, Durkheim didn’t look at religious groups within countries when determining suicide rates; he just took data from countries as a whole.

**References:**

Durkheim, E. (1897). Le suicide. Paris: F. Alcan. English translation by J. A. Spalding (1951). Toronto, Canada: Free Press/Collier-MacMillan.

Luloff, A. E., & Greenwood, P. H. (1980). Definitions of Community: An Illustration of Aggregation Bias. Station Bulletin 516. New Hampshire Agricultural Experiment Station. Durham, NH: University of New Hampshire.

Abhishek, V., Hosanagar, K., & Fader, P. S. (2015). Aggregation Bias in Sponsored Search Data: The Curse and the Cure. Marketing Science 34(1):59-77. http://dx.doi.org/10.1287/mksc.2014.0884
