False Discovery Rate
- What is the False Discovery Rate?
- FDR Formula
- FDR in hypothesis testing
- FDR in medical testing
- Adjusting the false discovery rate
Closely related to the FDR is the family-wise error rate (FWER). The FWER is the probability of making at least one false conclusion (i.e. at least one Type I error) across a set of tests. The Bonferroni correction controls the FWER, guarding against even a single false positive. However, this correction may be too strict for some fields and may lead to missed findings (Mailman School of Public Health, n.d.). The FDR approach is used as an alternative to the Bonferroni correction: it keeps the proportion of false positives among the rejected hypotheses low, instead of guarding against making any false positive conclusion at all. The result is usually increased statistical power, at the cost of allowing more Type I errors than the Bonferroni correction would.
The false discovery rate formula (Akey, n.d.) is:
FDR = E(V/R | R > 0) P(R > 0)
- V = Number of Type I errors (i.e. false positives)
- R = Number of rejected hypotheses
In a more basic form, the formula says that the FDR is the expected proportion of false positives among all of the rejected hypotheses. The two parts of the formula are:
- E(V/R | R > 0): the expected proportion of false positives, given (the | symbol) that you have at least one rejected hypothesis,
- P(R > 0): the probability of getting at least one rejected hypothesis. Multiplying by this term means that V/R counts as zero when nothing is rejected.
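To make V and R concrete, here is a minimal Python sketch that counts them for a single set of tests and returns the ratio V/R. The function name and the eight test outcomes are hypothetical and chosen only for illustration. In real data you do not know which null hypotheses are true, so V cannot be observed directly. The FDR is the average of this ratio over many repetitions of the experiment.

```python
def false_discovery_proportion(null_is_true, rejected):
    """Return (V, R, V/R) for one set of tests; V/R is taken as 0 when R == 0."""
    # V: rejected hypotheses whose null was actually true (false positives)
    v = sum(1 for is_null, rej in zip(null_is_true, rejected) if is_null and rej)
    # R: all rejected hypotheses
    r = sum(1 for rej in rejected if rej)
    fdp = v / r if r > 0 else 0.0
    return v, r, fdp


# Hypothetical outcome of 8 tests: True in null_is_true means "no real effect".
null_is_true = [True, True, True, True, True, False, False, False]
rejected = [True, False, False, False, False, True, True, False]

v, r, fdp = false_discovery_proportion(null_is_true, rejected)
print(v, r, round(fdp, 3))  # 1 3 0.333
```

Here one of the three rejected hypotheses is a false positive, so the proportion for this particular set of tests is 1/3.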
According to University College London’s David Colquhoun, “It is well known that high false discovery rates occur when many outcomes of a single intervention are tested.” Thanks to the power of computing, we can now run millions of hypothesis tests in a single study, which can result in hundreds of thousands of false positives.
For a more humorous (and perhaps more understandable) look at the problems of repeated hypothesis testing and high false discovery rates, take a look at XKCD’s “Jelly Bean Problem.” The comic shows a scientist finding a link between acne and jelly beans when the hypothesis was tested at a 5% significance level. Although there is no link between jelly beans and acne, a significant result was found (in this case, that one color of jelly bean caused acne) by testing multiple times. With 20 colors of jelly beans each tested at the 5% level, you would expect on average one color to be incorrectly fingered as the acne culprit. The implication for false discovery in hypothesis testing is that if you repeat a test enough times, you’re going to find an effect, but that effect may not actually exist.
The probability of getting at least one false positive result when running just 20 tests is a whopping 64.2%. This figure is obtained by first calculating the probability of having no false discoveries at a 5% significance level for 20 independent tests:
(1 - 0.05)^20 = 0.95^20 ≈ 0.358
If the probability of having no false conclusions is 35.8%, then the probability of at least one false conclusion (i.e. a green jelly bean that causes acne) is 1 - 0.358 = 64.2%.
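The 35.8% and 64.2% figures can be reproduced in a few lines of Python. This is a small sketch that assumes the 20 tests are independent and that no real effect exists in any of them; the function name is made up for this example.

```python
def prob_at_least_one_false_positive(alpha, n_tests):
    """Chance of one or more false positives in n_tests independent tests."""
    # Probability that every single test avoids a false positive
    p_none = (1 - alpha) ** n_tests
    return 1 - p_none


p_none = (1 - 0.05) ** 20
p_any = prob_at_least_one_false_positive(0.05, 20)
print(f"{p_none:.3f}")  # 0.358
print(f"{p_any:.3f}")   # 0.642
```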
In medical testing, the false discovery rate is the probability that a “positive” test result is wrong: the test says you have the disease, but you don’t actually have it. It’s the complement of the Positive Predictive Value (PPV), which tells you the probability of a positive test result being accurate. For example, if the PPV is 60%, then the false discovery rate is 40%. As an example, take a medical test that accurately identifies 90% of real cases of a disease. The false discovery rate is the ratio of the number of false positive results to the total number of positive test results. Out of 10,000 people given the test, there are 450 true positive results and 190 false positive results, for a total of 640 positive results. Of these results, 190/640 are false positives, so the false discovery rate is about 30%.
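The arithmetic in this example can be checked with a short Python sketch. The counts of 450 true positives and 190 false positives come from the example above; the variable names are just for illustration.

```python
true_positives = 450   # positive result, disease present
false_positives = 190  # positive result, disease absent

total_positives = true_positives + false_positives

# False discovery rate: share of positive results that are wrong
fdr = false_positives / total_positives
# Positive predictive value: share of positive results that are right
ppv = true_positives / total_positives

print(total_positives)  # 640
print(f"{fdr:.1%}")     # 29.7%
print(f"{ppv:.1%}")     # 70.3%
```

The two values always add up to 100%, which is why the FDR is the complement of the PPV.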
If you run enough tests, you will almost always get some false positives. One of the goals of multiple testing procedures is to control the FDR: the proportion of these erroneous results among the significant ones. For example, you might decide that an FDR of more than 5% is unacceptable. Note, though, that although a 5% significance level sounds reasonable, if you’re doing a lot of tests (especially common in medical research), you’ll also get a large number of false positives; for 1000 tests where no real effect exists, you could expect to get 50 false positives by chance alone. This is called the multiple testing problem, and the FDR approach is one way to control the proportion of false positives.
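The figure of 50 false positives is just 1000 × 0.05, and a quick simulation gives a feel for it. The sketch below assumes that every null hypothesis is true, so each p-value is drawn uniformly between 0 and 1. The function name, the seed, and the 200 repetitions are arbitrary choices for this illustration.

```python
import random


def count_false_positives(n_tests, alpha, rng):
    """Count tests that come out 'significant' when no real effect exists."""
    # Under a true null hypothesis the p-value is uniform on [0, 1]
    return sum(1 for _ in range(n_tests) if rng.random() < alpha)


rng = random.Random(0)  # fixed seed so the run is repeatable
counts = [count_false_positives(1000, 0.05, rng) for _ in range(200)]
average = sum(counts) / len(counts)

print(f"average false positives per 1000 tests: {average:.1f}")
```

Individual runs vary around 50, but the average over many runs should settle close to it.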
The FDR approach adjusts the p-values for a series of tests. A p-value gives you the probability of seeing a result at least as extreme as yours on a single test if there is no real effect; if you’re running a large number of tests from small samples (which are common in fields like genomics and proteomics), you should use q-values instead.
- A p-value threshold of 5% means that 5% of all tests where there is no real effect will result in false positives.
- A q-value threshold of 5% means that about 5% of significant results are expected to be false positives.
The procedure to control the FDR, using q-values, is called the Benjamini-Hochberg procedure, named after Benjamini and Hochberg (1995), who first described it.
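As a rough illustration, the step-up rule from Benjamini and Hochberg (1995) can be sketched in Python as follows: sort the p-values, find the largest rank k whose p-value is at most (k/m) × q, and reject the k hypotheses with the smallest p-values. The function name and the ten p-values are hypothetical, and this is a minimal sketch rather than a replacement for a tested statistics library.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_k = rank
    # Reject the hypotheses with the largest_k smallest p-values
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from 10 tests
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

bh_rejected = benjamini_hochberg(p_values, q=0.05)
bonferroni_rejected = [p <= 0.05 / len(p_values) for p in p_values]

print(sum(bh_rejected))          # 2
print(sum(bonferroni_rejected))  # 1
```

With these made-up p-values, the Benjamini-Hochberg rule rejects two hypotheses, while the Bonferroni threshold of 0.05/10 = 0.005 rejects only one.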
When not to correct
Although controlling for Type I errors sounds ideal (why not just set the threshold really low and be done with it?), Type I and Type II errors have an inverse relationship; when one goes down, the other goes up, and vice versa. By decreasing the false positives, you increase the number of false negatives (that’s where there is a real effect, but you fail to detect it). In many cases, an increase in false negatives may not be an issue. But if false negatives are costly or vitally important for future research, you may not want to correct for false positives at all (McDonald, 2014). For example, let’s say you’re researching a new AIDS vaccine. An uncorrected significant result, even one that might turn out to be a false positive, may be a hint that you’re on the right track, and it may indicate potential for future research on the vaccine. But if you overcorrect, you may miss out on those possibilities.
Akey, J. (n.d.). Lecture 10: Multiple Testing. Article posted on the University of Washington website. Retrieved October 29, 2017 from: http://www.gs.washington.edu/academics/courses/akey/56008/lecture/lecture10.pdf
Benjamini, Y. & Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological), Vol. 57, No. 1, pp. 289-300.
Colquhoun, D. (2014). An investigation of the false discovery rate and the misinterpretation of p-values. Published 19 November 2014.
Mailman School of Public Health (n.d.). Article posted on the Columbia University website. Retrieved October 29, 2017 from: https://www.mailman.columbia.edu/research/population-health-methods/false-discovery-rate
McDonald, J.H. (2014). Handbook of Biological Statistics (3rd ed.). Sparky House Publishing, Baltimore, Maryland. Retrieved October 29, 2017 from: http://www.biostathandbook.com/multiplecomparisons.html
XKCD. “Jelly Bean Problem.”