Statistics How To

False Discovery Rate


What is the False Discovery Rate?

False discovery rates are found in medical testing and in hypothesis testing. The two are closely related. A medical testing example is easier to visualize than a generic “hypothesis test”, so you might find it useful to follow the example below of how a false discovery rate is calculated for a medical test.

The false discovery rate is the expected proportion of Type I errors (false positives) among the rejected hypotheses.

False Discovery Rates in Medical Testing

In medical testing, the false discovery rate is the proportion of “positive” test results that come from people who don’t actually have the disease. It’s the complement of the Positive Predictive Value (PPV), which tells you the probability that a positive test result is accurate. For example, if the PPV was 60% then the false discovery rate would be 40%.

The image below shows a medical test that correctly identifies 90% of real cases of a disease. The false discovery rate is the ratio of the number of false positive results to the total number of positive test results. Out of 10,000 people given the test, there are 450 true positive results (box at top right) and 190 false positive results (box at bottom right), for a total of 640 positive results. Of these results, 190/640 are false positives, so the false discovery rate is about 30% (29.7%).
[Image: false discovery rate]
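The arithmetic in the medical test example can be checked with a few lines of Python. This is only a sketch: the two counts come from the example above, and the variable names are chosen here for illustration.

```python
# Counts taken from the medical test example above.
true_positives = 450
false_positives = 190

# Everyone who received a "positive" result, whether correct or not.
total_positives = true_positives + false_positives

# False discovery rate: share of positive results that are wrong.
false_discovery_rate = false_positives / total_positives

# Positive predictive value: share of positive results that are right.
positive_predictive_value = true_positives / total_positives

print(f"Total positive results: {total_positives}")
print(f"False discovery rate:   {false_discovery_rate:.1%}")
print(f"PPV:                    {positive_predictive_value:.1%}")
```

The two percentages add up to 100%, which is what it means for the false discovery rate to be the complement of the PPV.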

False Discovery Rates in Hypothesis Testing

According to University College London’s David Colquhoun, “It is well known that high false discovery rates occur when many outcomes of a single intervention are tested.” For a more humorous (and perhaps more understandable) look at the problems of repeated hypothesis testing and high false discovery rates, take a look at XKCD’s “Jelly Bean Problem.”
[Image: XKCD’s “Jelly Bean Problem” comic]


The comic shows scientists who first find no link between jelly beans and acne, and who then test 20 individual colors of jelly bean, each at a 5% significance level. At that level, even if there is no real effect, a single test will show a “significant” difference (in this case, jelly beans causing acne) about 5% of the time. With 20 colors tested, you would expect about one color (20 × 0.05 = 1) to be incorrectly fingered as the acne culprit. The implication for false discovery in hypothesis testing is that if you repeat a test enough times, you’re going to find an effect…but that effect may not actually exist. In fact, the probability of getting at least one false positive result when running 20 independent tests is a whopping 64.2%. This figure is obtained by first calculating the probability of having no false discoveries at a 5% significance level for 20 tests:

$$P(X = 0) = \binom{20}{0}(0.05)^{0}(0.95)^{20} = 0.95^{20} \approx 0.358$$

The probability that 20 trials will not have any false conclusions (using the binomial formula).



If the probability of having no false conclusions is 35.8%, then the probability of at least one false conclusion (e.g. a green jelly bean that appears to cause acne) is 100% − 35.8% = 64.2%.
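The same calculation can be reproduced in Python. This is a minimal sketch that assumes the 20 tests are independent and that no jelly bean color has a real effect; the variable names are illustrative only.

```python
from math import comb

alpha = 0.05   # significance level used for each test
n_tests = 20   # number of jelly bean colors tested

# Binomial probability of exactly zero false positives in n_tests trials.
p_no_false_positive = comb(n_tests, 0) * alpha**0 * (1 - alpha)**n_tests

# Probability of at least one false positive is the complement.
p_at_least_one = 1 - p_no_false_positive

# Expected number of false positives across all tests.
expected_false_positives = n_tests * alpha

print(f"P(no false positives):       {p_no_false_positive:.1%}")
print(f"P(at least one false pos.):  {p_at_least_one:.1%}")
print(f"Expected false positives:    {expected_false_positives:.1f}")
```

Changing `n_tests` shows how quickly the chance of at least one false positive grows as more tests are run at the same significance level.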

More technically, the probability of making at least one false conclusion (a Type I error) across a family of tests is called the family-wise error rate (FWER). The false discovery rate is the expected proportion of Type I errors among the rejected hypotheses.
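To make the difference between the two quantities concrete, here is a small sketch for one family of tests. The five results are made up purely for illustration, and the function names are hypothetical. In a real study you would not know which null hypotheses are true.

```python
def false_discovery_proportion(rejected, null_is_true):
    """Share of rejected hypotheses whose null hypothesis was actually true."""
    false_rejections = sum(r and t for r, t in zip(rejected, null_is_true))
    total_rejections = sum(rejected)
    # With no rejections there are no discoveries, so report 0.
    return false_rejections / total_rejections if total_rejections else 0.0


def any_type_i_error(rejected, null_is_true):
    """True if at least one true null hypothesis was rejected."""
    return any(r and t for r, t in zip(rejected, null_is_true))


# Hypothetical family of 5 tests (illustrative data only).
rejected     = [True,  True, False, True,  False]  # test came out "significant"
null_is_true = [False, True, True,  False, False]  # there is really no effect

fdp = false_discovery_proportion(rejected, null_is_true)
has_error = any_type_i_error(rejected, null_is_true)

print(f"False discovery proportion in this family: {fdp:.2f}")
print(f"At least one Type I error in this family:  {has_error}")
```

Averaged over many repetitions of the whole family of tests, the first quantity gives the false discovery rate, while the fraction of repetitions in which the second is true gives the family-wise error rate.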

References:
Colquhoun, D. (2014). An investigation of the false discovery rate and the misinterpretation of p-values. Royal Society Open Science. Published 19 November 2014.
XKCD. “Jelly Bean Problem.”

False Discovery Rate was last modified: October 12th, 2017 by Andale