**Berkson’s paradox** (also known as *Berkson’s fallacy* or *Berkson’s bias*) is the counter-intuitive result that two events which are actually independent can appear to be correlated because of the way a sample was selected.

Take two events, A and B, which are completely independent (for example, lung cancer and diabetes). If a study selects its subjects based on the presence of A (lung cancer) or B (diabetes), the **presence of diabetes will appear to change the likelihood of lung cancer.** Intuitively, this makes no sense, but the data seem to back this counter-intuitive notion up, showing a connection that does not exist in the general population.

Berkson wrote about the paradox in 1946. His original paper showed that two diseases with no real relationship can be what he called “**spuriously associated**” in hospital-based case-control studies. However, the idea wasn’t widely accepted until 1979, when David Sackett of McMaster University provided strong evidence that Berkson’s paradox does, in fact, exist.

## What Causes Berkson’s Paradox?

The apparent link between event A and event B arises because **cases where neither occurs are excluded** from the sample.
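As a minimal sketch of this mechanism, the Python below works out exact conditional probabilities for two independent events once the "neither" cases are dropped. The prevalences (10% and 20%) are made-up values, and `selected_conditionals` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction


def selected_conditionals(p_a, p_b):
    """Return P(A | B) and P(A | not B) after dropping cases with neither A nor B.

    Assumes A and B are independent in the full population.
    """
    both = p_a * p_b
    only_a = p_a * (1 - p_b)
    only_b = (1 - p_a) * p_b
    # The "neither" cell is excluded by the selection, so it never enters the sample.
    p_a_given_b = both / (both + only_b)
    # Among selected cases without B, only the "A alone" cell remains.
    p_a_given_not_b = only_a / only_a
    return p_a_given_b, p_a_given_not_b


p_a, p_b = Fraction(1, 10), Fraction(1, 5)  # hypothetical prevalences
with_b, without_b = selected_conditionals(p_a, p_b)
print(f"P(A | B, selected)     = {float(with_b):.2f}")     # 0.10
print(f"P(A | not B, selected) = {float(without_b):.2f}")  # 1.00
```

In the full population, knowing B tells you nothing about A. Inside the selected sample, a case without B must have A (otherwise it would not have been selected), so the two events look negatively associated.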

In a classic case (from Everitt’s *Medical Statistics from A to Z: A Guide for Clinicians and Medical Students*), data from autopsies were analyzed. Fewer cases than expected of cancer and tuberculosis were found together, implying that tuberculosis is somehow protective against cancer. However, the real reason the rates appear lower is that not all autopsies are included in the study; people with both cancer and tuberculosis may, for one reason or another, have lower rates of autopsy.
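To see how unequal inclusion rates alone can produce this kind of pattern, here is a rough sketch. All of the numbers are invented for illustration (they are not from Everitt or from any autopsy data), and `selected_odds_ratio` is a hypothetical helper. It computes the odds ratio between two independent conditions among the people who end up in the sample.

```python
def selected_odds_ratio(p_a, p_b, select):
    """Odds ratio between A and B among sampled people.

    p_a, p_b: population prevalences of two independent conditions.
    select: dict mapping (has_a, has_b) to the probability that such a
            person ends up in the study sample.
    """
    weight = {}
    for has_a in (True, False):
        for has_b in (True, False):
            # Population share of this cell under independence.
            pop = (p_a if has_a else 1 - p_a) * (p_b if has_b else 1 - p_b)
            # Share of this cell that actually reaches the sample.
            weight[(has_a, has_b)] = pop * select[(has_a, has_b)]
    return (weight[(True, True)] * weight[(False, False)]) / (
        weight[(True, False)] * weight[(False, True)]
    )


# Hypothetical inclusion probabilities: everyone equally likely to be sampled...
equal = {(a, b): 0.5 for a in (True, False) for b in (True, False)}
# ...versus people with both conditions being sampled half as often.
skewed = dict(equal)
skewed[(True, True)] = 0.25

print(selected_odds_ratio(0.2, 0.1, equal))   # close to 1: no apparent association
print(selected_odds_ratio(0.2, 0.1, skewed))  # close to 0.5: A looks "protective" against B
```

An odds ratio below 1 in the second case comes entirely from who was sampled, not from any real relationship between the two conditions.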

## Simple Example

To understand this, consider a particular children’s hospital during an influenza scare. The hospital’s numbers will appear to support the counter-intuitive idea that having influenza offers some protection against appendicitis.

- 10 percent of the general population has influenza.
- In the hospital, full of sick children, the rate is of course higher; suppose 30 percent of the children were admitted for influenza.
- Now suppose 10 percent of the children were admitted for appendicitis.

There will be some overlap; we assume a child with appendicitis is just as likely to get flu as any other child (the general-population rate of 10%), and a child with flu can still have appendicitis. The share of hospital patients with both appendicitis and influenza would therefore be 10% of 10% (0.10 × 0.10 = 0.01), or 1% of hospital patients.

If you **choose one hospital child randomly**, that child has a 30% chance of having influenza and a 10% chance of having appendicitis. That is to say, 10 out of 100 children will have appendicitis, and 30 out of 100 will have influenza.

Now let’s calculate a new percentage: a non-influenza child’s chance of having appendicitis.

You are choosing only from the 70 children who do not have influenza.

You know the following:

- The 30 influenza patients you have set aside include the children with both appendicitis and influenza. In our example, that’s just one child.
- Out of the 100 children, there were 10 total appendicitis patients, so there will be 9 among the 70 non-influenza patients we’re picking from now.

So we can calculate the new percentage: a non-influenza child has a 9 / 70 ≈ 12.9% chance of having appendicitis. That’s higher than the 10% rate of appendicitis among all children in the hospital, and much higher than the 1 / 30 ≈ 3.3% rate among the influenza children.
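The arithmetic of this example can be checked with a few lines of Python. This is only a sketch using the counts given above; the variable names are chosen for this illustration.

```python
total = 100         # children in the hospital
flu = 30            # admitted with influenza
appendicitis = 10   # admitted with appendicitis
both = 1            # 10% of the appendicitis patients also have flu

no_flu = total - flu                       # 70 children without influenza
appendicitis_no_flu = appendicitis - both  # 9 of them have appendicitis

rate_all = appendicitis / total            # appendicitis rate among all children
rate_flu = both / flu                      # rate among influenza children
rate_no_flu = appendicitis_no_flu / no_flu # rate among non-influenza children

print(f"all children:      {rate_all:.1%}")     # 10.0%
print(f"with influenza:    {rate_flu:.1%}")     # 3.3%
print(f"without influenza: {rate_no_flu:.1%}")  # 12.9%
```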

So even though these two events are entirely independent, **the in-hospital statistics make it look like having influenza is some small insurance against appendicitis.**

## References

Berkson, Joseph (June 1946). “Limitations of the Application of Fourfold Table Analysis to Hospital Data”. Biometrics Bulletin. 2 (3): 47–53.

Ellenberg, Jordan. Why Are Handsome Men Such Jerks? Slate Magazine. Retrieved from http://www.slate.com/blogs/how_not_to_be_wrong/2014/06/03/berkson_s_fallacy_why_are_handsome_men_such_jerks.html on April 3, 2016.

Everitt, B. (2006). Medical Statistics from A to Z: A Guide for Clinicians and Medical Students. Cambridge University Press.

Sackett, D. (1979). Bias in Analytic Research. Journal of Chronic Diseases 32(1-2): 51.

Snoep, Morabia, Hernández-Díaz, Hernán, Vandenbroucke. Commentary: A structural approach to Berkson’s fallacy and a guide to a history of opinions about it. Int J Epidemiol. 2014 Apr; 43(2): 515–521. Retrieved from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3997377/ on April 2, 2018.

Woodfine & Redelmeier (2015). Berkson’s Paradox in Medical Care. Journal of Internal Medicine. 278(4): 424–426. Retrieved from https://onlinelibrary.wiley.com/doi/full/10.1111/joim.12363 on April 3, 2016.
