Post Hoc Tests > Benjamini-Hochberg Procedure

## What is the Benjamini-Hochberg Procedure?

The Benjamini-Hochberg Procedure is a powerful tool for controlling the false discovery rate: the expected proportion of rejected null hypotheses that are actually false positives.

Controlling this rate accounts for the fact that small p-values (less than 5%) sometimes happen by chance, which could lead you to incorrectly reject true null hypotheses. In other words, the B-H Procedure helps you to avoid Type I errors (false positives).

A p-value of 5% means that there's only a 5% chance that you would get your observed result *if* the null hypothesis were true. In other words, if you get a p-value of 5%, the observed result is unlikely under the null hypothesis, so the null is usually rejected. But it's only a probability: many times, true null hypotheses are thrown out just because of the randomness of results.

**A concrete example:** Let's say you have a group of 100 patients who you know are free of a certain disease. Your null hypothesis is that the patients are free of disease, and your alternative is that they *do* have the disease. If you ran 100 statistical tests at the 5% alpha level, **roughly 5% of the results would be false positives.**
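A quick simulation makes this concrete. Under a true null hypothesis, p-values are uniformly distributed on [0, 1], so about 5% of them fall below an alpha of 0.05 purely by chance (the trial count and seed here are arbitrary choices for illustration):

```python
import random

random.seed(42)

# Simulate many tests where the null is true: draw p-values uniformly
# from [0, 1] and count how often one dips below alpha = 0.05 by chance.
trials = 10_000
alpha = 0.05
false_positives = sum(1 for _ in range(trials) if random.random() < alpha)
print(false_positives / trials)  # roughly 0.05
```

The proportion hovers around 5% no matter how many tests you run, which is exactly the problem the B-H procedure addresses.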

There’s not a lot you can do to avoid this: **when you run statistical tests, a fraction will always be false positives.** However, running the B-H procedure will decrease the number of false positives.

## How to Run the Benjamini–Hochberg Procedure

1. Put the individual p-values in ascending order.
2. Assign ranks to the ordered p-values: the smallest has a rank of 1, the second smallest has a rank of 2, and so on.
3. Calculate each individual p-value's Benjamini-Hochberg critical value, using the formula (i/m)Q, where:
   - i = the individual p-value's rank,
   - m = the total number of tests,
   - Q = the false discovery rate (a percentage, chosen by you).
4. Compare your original p-values to the B-H critical values from Step 3, and find the largest p-value that is smaller than its critical value. That p-value, and every p-value smaller than it, is significant.
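The steps above can be sketched in Python. The p-values below are hypothetical, and Q = 0.25 matches the worked example that follows:

```python
def benjamini_hochberg(p_values, Q):
    """Return the indices (into p_values) deemed significant at FDR level Q."""
    m = len(p_values)
    # Steps 1-2: sort p-values in ascending order, remembering original positions
    order = sorted(range(m), key=lambda k: p_values[k])
    # Steps 3-4: find the largest rank i with p_(i) <= (i/m) * Q
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= (rank / m) * Q:
            cutoff_rank = rank
    # That p-value and everything smaller than it are significant
    return sorted(order[:cutoff_rank])

# Hypothetical p-values for illustration
p = [0.005, 0.009, 0.040, 0.042, 0.09, 0.35, 0.60, 0.81]
sig = benjamini_hochberg(p, Q=0.25)
print(sig)  # [0, 1, 2, 3, 4]
```

Note that 0.09 is flagged as significant even though it exceeds 0.05: it falls below a larger qualifying p-value, which is exactly the behavior discussed in the example below.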

As an example, the following list of data shows a **partial list of results from 25 tests** with their p-values in column 2. The list of p-values was ordered (Step 1) and then ranked (Step 2) in column 3. Column 4 shows the calculation of the critical value with a false discovery rate of 25% (Step 3). For instance, column 4 for item 1 is calculated as (1/25) * .25 = 0.01:
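The column-4 calculation is easy to verify. Here are the first few B-H critical values for m = 25 tests and Q = 25%:

```python
# B-H critical values (i/m) * Q for the first five ranks,
# with m = 25 tests and a false discovery rate Q of 25%.
m, Q = 25, 0.25
critical = [round((i / m) * Q, 3) for i in range(1, 6)]
print(critical)  # [0.01, 0.02, 0.03, 0.04, 0.05]
```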

The bolded p-value (for Children) is the highest p-value that is also smaller than its critical value: .042 < .050. **All** values above it (i.e. those with lower p-values) are highlighted and considered significant, even when those p-values exceed their own critical values. For example, Obesity and Other Health are, individually, not significant when you compare their p-values to the final column (e.g. .039 > .03). However, with the B-H correction they are considered significant; in other words, you would reject the null hypothesis for those tests.

If you prefer an online interactive environment to learn R and statistics, this *free R Tutorial by Datacamp* is a great way to get started. If you're somewhat comfortable with R and are interested in going deeper into statistics, try *this Statistics with R track*.


“For example, Obesity and Other Health are individually, not significant. However, with the B-H correction, they are considered significant”

This is backwards. B-H should either lower the number of significant results or not change it.

Hi, David,

Thanks for your comment. I have never heard of the B-H being defined in those terms. When you say it should “lower the number of significant results”…this statement is in comparison to what? Benjamini and Hochberg (1995) stated that “controlling the probability of one or more Type I errors is too severe an approach,” as it results in practically zero significant findings. Doing “nothing” is also unacceptable, because it results in too many spurious significant findings. The B-H is a “middle ground.”

See https://www.unc.edu/courses/2007spring/biol/145/001/docs/lectures/Nov12.html for another example. “…p(1) and p(2) failed to exceed their thresholds. They are still deemed significant.”

How is the value (i/m)Q calculated?

Aishwarya, See step 3 under “How to Run.” Was there anything there you needed clarifying? Regards, S.

Why “Obesity and Other Health are individually, not significant”? Their p-values are both less than 0.05, so they are significant without B-H corrections, right?

They are not significant when you compare them to the adjusted p-value in the final column. I added a clarification, I hope that helps :)

Hello,

How about when the false discovery rate is not defined as a percentage, but as a number? For example, the false positives should not exceed 25. How do you calculate Q (the FDR)?

Thanks

FDR is a proportion, so if it’s a number I would look at the total in your experiment. For your example, it should be “25 out of something.”

Hi again, thank you for taking the time to answer my question. I had a suspicion that it could be like that. So is it 25 out of the total number of tests? Or 25 out of the total number of significant tests that we would have gotten if we had not applied this false discovery rate correction in the first place? If it is the latter, how can we know the number of significant tests beforehand just by seeing the raw p-values? Sorry for asking tricky things.

25 out of the total number of tests

Thank you. It is a bit strange, because once I was told that the Q in the Benjamini-Hochberg formula is Q = 1 - p-value, which I guess is 99% if you choose a p-value of 0.01, or 95% if you choose a p-value of 0.05. I think I will go with your explanation :)

Hi Andale, I just tried to reproduce Column 4 of your figure based on the description provided. However, I’m getting these numbers instead:

0.031

0.063

0.094

0.125

0.156

0.188

0.219

0.250

Am I missing something?

Are you putting in 25 for the number of tests? I suspect, from your numbers, that you might be putting in 8 instead. Also, make sure you are multiplying by .25 (as in, 25%) and not just 25.
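A quick check in Python confirms this diagnosis: the reader's numbers fall out of the formula with m = 8, while the article's column 4 uses m = 25:

```python
# Compare the critical values (i/m) * Q for m = 8 versus m = 25.
Q = 0.25
with_m8 = [(i / 8) * Q for i in range(1, 9)]
with_m25 = [(i / 25) * Q for i in range(1, 9)]
# with_m8 starts 0.03125, 0.0625, 0.09375, ... matching the 0.031, 0.063, ... above
# with_m25 starts 0.01, 0.02, 0.03, ... matching the article's column 4
```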