Simpson’s Paradox: Overview
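Simpson’s Paradox is named after Edward H. Simpson, the statistician who described it in 1951. It’s a great example of how statistics can mislead: a trend that shows up in several separate groups of data can vanish, or even reverse, when the groups are combined. In other words, averages can be silly and misleading. Sometimes they can be just plain baffling.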
A bit of a ridiculous example of a meaningless average, to set the scene:
Votes cast in a recent election: 120,045.
Number of voting precincts: 109.
Number of voters who own dogs: 19,876.
The “average” of the three figures above is correct math: (120,045 + 109 + 19,876) / 3 ≈ 46,677. But the average doesn’t actually mean anything. It makes no sense to take an average of voters, precincts, and the number of voters who own dogs. In real life, the paradox is usually more subtle. After all, no one is really going to average voters and pets.
Simpson’s Paradox: Real Life Example
A real-life case of the paradox came to light in 1973, when admission rates were investigated at the graduate schools of the University of California, Berkeley. The trigger was an apparent gender gap in admissions, which raised fears of a sex-discrimination lawsuit.
The results of the investigation were surprising. When each school was looked at separately (law, medicine, engineering etc.), women were, in most cases, admitted at a higher rate than men! However, the overall figures suggested that men were admitted at a much higher rate than women (about 44% versus 35%). Talk about confusing.
This misleading average is a classic example. But how can it be possible? The answer is that women applied in large numbers to schools with low admission rates, like law and medicine. These schools admitted fewer than 10 percent of applicants, which pulled down the overall percentage of women accepted. Men, on the other hand, tended to apply in larger numbers to schools with high admission rates, like engineering, where admission rates are about 50%, which pushed up the overall percentage of men accepted. Even though the statistic was widely reported and caused an outrage, the overall average compared two groups that had applied to very different mixes of schools, so as a measure of bias it made no sense at all.
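To see the reversal in actual numbers, here is a minimal Python sketch. The applicant counts are invented for illustration and are not the Berkeley data. They are chosen only to match the pattern described above: a school admitting under 10 percent, a school admitting about 50 percent, and women applying mostly to the former. The names data, per_school_rates and overall_rates are made up for this example.

```python
# Hypothetical applicant counts, invented for illustration only
# (NOT the real Berkeley figures).
# Layout: school -> group -> (applicants, admitted)
data = {
    "engineering": {"men": (800, 400), "women": (100, 55)},
    "law": {"men": (200, 10), "women": (900, 72)},
}


def per_school_rates(data):
    """Admission rate for each group within each school."""
    return {
        school: {
            group: admitted / applicants
            for group, (applicants, admitted) in groups.items()
        }
        for school, groups in data.items()
    }


def overall_rates(data):
    """Admission rate for each group after pooling all schools together."""
    totals = {}
    for groups in data.values():
        for group, (applicants, admitted) in groups.items():
            prev_applicants, prev_admitted = totals.get(group, (0, 0))
            totals[group] = (prev_applicants + applicants, prev_admitted + admitted)
    return {
        group: admitted / applicants
        for group, (applicants, admitted) in totals.items()
    }


for school, rates in per_school_rates(data).items():
    print(f"{school}: men {rates['men']:.1%}, women {rates['women']:.1%}")

pooled = overall_rates(data)
print(f"overall: men {pooled['men']:.1%}, women {pooled['women']:.1%}")
# engineering: men 50.0%, women 55.0%
# law: men 5.0%, women 8.0%
# overall: men 41.0%, women 12.7%
```

In this made-up data, women are admitted at a higher rate in both schools, yet the pooled rate is far higher for men, simply because 900 of the 1,000 women applied to the school that admits almost no one.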