Aggregation Bias & Ecological Fallacy Definition
In ecological studies, aggregation bias occurs when a researcher incorrectly assumes that trends in aggregated data also apply to individual data points. When data are aggregated, or merged, the aggregation can hide relationships that exist between variables at the individual level. More formally, aggregation bias is the expected difference between the effect for the group and the effect for the individual when there is no confounding (confounding is the introduction of unexpected variables into your research that you didn’t account for).
Ecological Fallacy Definition
If there is confounding in an ecological study, then the difference between group and individual effects is a combination of confounding and aggregation bias. Aggregation bias leads to the “ecological fallacy”: the conclusion that what is true for the group must be true for the sub-group or individual. It’s called aggregation bias because you’re using aggregated data and extrapolating it inappropriately.
For example, you might have data showing that inner-city students tend to perform poorly on standardized tests. That doesn’t mean any one inner-city student will perform poorly. Likewise, you might show that one particular state has a lower-than-average per-capita income. You can’t say for sure that every county in that state has a lower-than-average income. And you definitely can’t say that every person in the state has a low income.
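The gap between a group-level trend and what happens to individuals can be seen in a small numerical sketch. The data below are made up for illustration (the group labels, the values, and the helper `ols_slope` are all assumptions, not taken from any study): across the three group averages, y rises with x, yet inside every group y falls as x rises.

```python
from statistics import fmean


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical toy data: three groups of three individuals, as (x, y) pairs.
groups = {
    "A": [(1, 5), (2, 4), (3, 3)],
    "B": [(4, 8), (5, 7), (6, 6)],
    "C": [(7, 11), (8, 10), (9, 9)],
}

# Aggregated view: one (mean x, mean y) point per group.
mean_x = [fmean([x for x, _ in pts]) for pts in groups.values()]
mean_y = [fmean([y for _, y in pts]) for pts in groups.values()]
group_slope = ols_slope(mean_x, mean_y)

# Individual view: the slope inside each group.
within_slopes = {
    name: ols_slope([x for x, _ in pts], [y for _, y in pts])
    for name, pts in groups.items()
}

print("slope across group means:", group_slope)
print("slopes within groups:", within_slopes)
```

In this contrived case the slope across group means is positive while every within-group slope is negative, so a conclusion drawn from the group averages would point in the wrong direction for the individuals in each group.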
Aggregation problems can bias results from experiments and surveys. They can also lead to incorrect probability distribution selection and distort the results of hypothesis testing and regression analysis.
Luloff and Greenwood (1980) found that increased aggregation causes unpredictable results:
- The coefficient of determination, R², sometimes falls, sometimes rises, and sometimes remains constant.
- Coefficients switch signs and magnitudes. In one case, the directional switch retained significance. Statistical significance is lost in some cases.
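As a rough illustration of how R² can move in either direction, the sketch below takes one set of eight made-up individuals and averages them into groups in two different ways. The data, the two groupings, and the helper names `slope_and_r2` and `aggregate` are hypothetical and are not drawn from Luloff and Greenwood's bulletin; for a one-predictor regression, R² is the squared correlation, which is what the helper computes.

```python
from statistics import fmean


def slope_and_r2(points):
    """Return (least-squares slope, R^2) for a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return sxy / sxx, sxy ** 2 / (sxx * syy)


def aggregate(points, grouping):
    """Replace each group of individuals (given by index) with its mean point."""
    return [
        (fmean([points[i][0] for i in idx]), fmean([points[i][1] for i in idx]))
        for idx in grouping
    ]


# Hypothetical individual-level data.
individuals = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5), (7, 8), (8, 7)]

# Two hypothetical ways of defining the groups (indices into `individuals`).
grouping_a = [(0, 1), (2, 3), (4, 5), (6, 7)]
grouping_b = [(0, 2), (1, 3), (4, 6), (5, 7)]

results = {
    "individual": slope_and_r2(individuals),
    "grouping_a": slope_and_r2(aggregate(individuals, grouping_a)),
    "grouping_b": slope_and_r2(aggregate(individuals, grouping_b)),
}

for level, (slope, r2) in results.items():
    print(f"{level:>10}: slope={slope:.3f}  R^2={r2:.3f}")
```

With these toy numbers, the first grouping pushes R² above the individual-level value and the second pulls it below, even though the underlying individuals are identical. The result depends entirely on how the groups are defined.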
Aggregation Bias Example
Perhaps the most famous example of an ecological fallacy is Durkheim’s 1897 study, which inferred that Protestants were more likely to commit suicide, based on data showing that countries with larger Protestant populations had higher suicide rates than countries with larger Catholic populations. The study failed to take confounding variables into account, such as the fact that Protestant countries differed in many ways from Catholic countries. In addition, Durkheim didn’t look at religious groups within countries when determining suicide rates; he just took data from countries as a whole.
The best way to avoid aggregation bias is to use individual data points instead of aggregated data points so that the true relationship between variables is clear.
Luloff, A.E. et al. (1980). Definitions of Community: An Illustration of Aggregation Bias. Station Bulletin 516. New Hampshire Agricultural Experiment Station. Durham, NH: University of New Hampshire.
Durkheim, E. (1897). Le suicide. Paris: F. Alcan. English translation by J. A. Spaulding (1951). Toronto, Canada: Free Press/Collier-Macmillan.