Post-hoc (Latin, meaning “after this”) tests are analyses you run on your experimental data after the experiment is complete, usually to find out which groups differ. They are often based on a familywise error rate: the probability of at least one Type I error in a set (family) of comparisons. The most common post-hoc tests are:
- Bonferroni Procedure
- Duncan’s new multiple range test (MRT)
- Dunn’s Multiple Comparison Test
- Fisher’s Least Significant Difference (LSD)
- Holm-Bonferroni Procedure
- Rodger’s Method
- Scheffé’s Method
- Tukey’s Test (see also: Studentized Range Distribution)
- Dunnett’s correction
- Benjamini-Hochberg (BH) procedure
Bonferroni Procedure (Bonferroni Correction)
This multiple-comparison post-hoc correction is used when you are performing many independent or dependent statistical tests at the same time. The problem with running many simultaneous tests is that the probability of getting at least one significant result by chance alone increases with each test run. This post-hoc test sets the significance cutoff at α/n, where n is the number of tests. For example, if you are running 20 simultaneous tests at α = 0.05, the corrected cutoff would be 0.05/20 = 0.0025. The Bonferroni correction does suffer from a loss of power: because each individual test must meet a much stricter cutoff, Type II error rates are high for each test. In other words, it overcorrects for Type I errors.
The ordinary Bonferroni method is sometimes viewed as too conservative. Holm’s sequential Bonferroni post-hoc test is a less strict correction for multiple comparisons. See: Holm-Bonferroni method for a step-by-step example.
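As a rough illustration of why Holm’s method is less strict, here is a minimal Python sketch of the step-down procedure. The function name `holm_bonferroni` and the p-values are made up for this example, and the sketch uses "less than or equal to" at each cutoff (some texts use strict "less than").

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p-value is compared with alpha/m, the next with
        # alpha/(m - 1), and so on, down to alpha/1 for the largest.
        threshold = alpha / (m - rank)
        if p_values[idx] <= threshold:
            reject[idx] = True
        else:
            break  # first failure: this and all larger p-values are retained
    return reject


# Hypothetical p-values from four simultaneous tests
p_values = [0.04, 0.001, 0.02, 0.013]
print(holm_bonferroni(p_values))
```

With these made-up values, the ordinary Bonferroni cutoff (0.05/4 = 0.0125) would reject only the test with p = 0.001, while the Holm procedure rejects all four.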
Duncan’s new multiple range test (MRT)
When you run Analysis of Variance (ANOVA), the results will tell you if there is a difference in means. However, it won’t pinpoint the pairs of means that are different. Duncan’s Multiple Range Test will identify the pairs of means (from at least three) that differ. The MRT is similar to the LSD, but instead of a t-value, a Q value is used.
Fisher’s Least Significant Difference (LSD)
A tool to identify which pairs of means are statistically different. Essentially the same as Duncan’s MRT, but with t-values instead of Q values. See: Fisher’s Least Significant Difference.
Newman-Keuls
Like Tukey’s test, this post-hoc test identifies sample means that are different from each other. Unlike Tukey’s test, which applies a single critical value to every pair, Newman-Keuls uses different critical values for different pairs of means. Therefore, it is more likely to find significant differences.
Rodger’s Method
Considered by some to be the most powerful post-hoc test for detecting differences among groups. This test protects against loss of statistical power as the degrees of freedom increase.
Scheffé’s Method
Used when you want to look at post-hoc comparisons in general (as opposed to just pairwise comparisons). Scheffé’s method controls for the overall confidence level. It is customarily used with unequal sample sizes.
See: The Scheffé Test.
Tukey’s Test
The purpose of Tukey’s test is to figure out which groups in your sample differ. It uses the “Honestly Significant Difference” (HSD), a number that represents the minimum distance between two group means needed to call them significantly different, to compare every mean with every other mean.
Benjamini-Hochberg (BH) procedure
If you perform a very large number of tests, one or more of them is likely to give a significant result by chance alone. This post-hoc test controls the false discovery rate: the expected proportion of significant results that are actually false positives. For more details, including how to run the procedure, see: Benjamini-Hochberg Procedure.
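As a rough sketch of how the procedure works: rank the p-values from smallest to largest, find the largest rank k whose p-value is at or below (k/m) × Q, and reject every hypothesis up to that rank. The function name `benjamini_hochberg`, the false discovery rate `q`, and the p-values below are assumptions made for this illustration only.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) whose p-value is <= (k / m) * q
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_k = rank
    # Reject every hypothesis ranked at or below largest_k
    reject = [False] * m
    for idx in order[:largest_k]:
        reject[idx] = True
    return reject


# Hypothetical p-values from eight simultaneous tests
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p_values, q=0.05))
```

Note that the search runs over all ranks rather than stopping at the first failure, so a p-value that misses its own cutoff can still be rejected if a larger p-value further down the list meets its cutoff.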
The Bonferroni correction is used to limit the possibility of getting a statistically significant result by chance when testing multiple hypotheses. It’s needed because the more tests you run, the more likely you are to get a significant result. The correction shrinks the region where you can reject the null hypothesis. In other words, it makes the cutoff that your p-value must fall below smaller.
Imagine looking for the Ace of Clubs in a deck of cards: if you pull one card from the deck, the odds are pretty low (1/52) that you’ll get the Ace of Clubs. Try again (perhaps 50 times) and you’ll probably end up getting the Ace. The same principle works with hypothesis testing: the more simultaneous tests you run, the more likely you’ll get a “significant” result. Let’s say you were running 50 tests simultaneously with an alpha level of 0.05. The probability of observing at least one significant event due to chance alone is:
P(significant event) = 1 – P(no significant event)
= 1 – (1 – 0.05)^50 ≈ 0.92.
In other words, it is almost certain (92%) that you’ll get at least one significant result.
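The calculation above can be reproduced with a few lines of Python. This is only a sketch: the function name `familywise_error_rate` is made up for the example, and the formula assumes the tests are independent of one another.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n_tests tests.

    Assumes the tests are independent.
    """
    # P(at least one significant) = 1 - P(none significant)
    return 1 - (1 - alpha) ** n_tests


# How the chance of at least one false positive grows with the number of tests
for n in (1, 5, 20, 50):
    print(n, round(familywise_error_rate(0.05, n), 2))
```

For 50 tests at an alpha level of 0.05 this gives roughly 0.92, matching the figure above.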
How to Calculate the Bonferroni Correction
The calculation is very simple: divide the alpha level (α) by the number of tests you’re running.
Sample question: A researcher is testing 25 different hypotheses at the same time, using an alpha level of 0.05. What is the Bonferroni correction?
Bonferroni correction is α/n = .05/25 = .002
For this set of 25 tests, you would reject the null only if your p-value was smaller than .002.
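The same arithmetic can be written as a short Python sketch. The function names `bonferroni_threshold` and `bonferroni_reject` are hypothetical, and the p-values are invented purely to show how the cutoff is applied.

```python
def bonferroni_threshold(alpha, n_tests):
    """Bonferroni-corrected significance cutoff: alpha divided by the number of tests."""
    return alpha / n_tests


def bonferroni_reject(p_values, alpha, n_tests):
    """True for each p-value that falls below the corrected cutoff."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return [p < threshold for p in p_values]


# The sample question: 25 tests at an alpha level of 0.05
print(bonferroni_threshold(0.05, 25))

# Three made-up p-values out of the 25 tests; only those under the cutoff are rejected
print(bonferroni_reject([0.0004, 0.01, 0.03], alpha=0.05, n_tests=25))
```

Note that a p-value such as 0.01, which would count as significant for a single test at α = 0.05, no longer leads to rejection once the correction is applied.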
The Bonferroni Correction and Medical Testing
Matthew A. Napierala, MD points out how multiple tests affect physicians (and patients) in an article for the American Academy of Orthopaedic Surgeons. “In contemporary orthopaedic research studies, numerous simultaneous tests are routinely performed.” This means that given enough tests, one of them is bound to come back as a false positive. Definitely not a good thing when we’re talking about health issues.