# Familywise Error Rate

## What is the Familywise Error Rate?

The familywise error rate (FWE or FWER) is the probability of coming to at least one false conclusion in a series of hypothesis tests. In other words, it’s the probability of making at least one Type I error. The term “familywise” comes from *family of tests*, the technical term for a series of tests performed on a set of data.

The FWER is also called alpha inflation or cumulative Type I error.

## Formula

The formula to estimate the familywise error rate is:

**FWE ≤ 1 – (1 – α_{IT})^{c}**

Where:

- α_{IT} = alpha level for an individual test (e.g. .05),
- c = number of comparisons.

For example, with an alpha level of 5% and a series of ten tests, the FWER is:

FWE ≤ 1 – (1 – .05)^{10} = .401.

This means that the probability of at least one Type I error is just over 40%, which is very high considering only ten tests were performed.
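The bound above is easy to compute directly. Here is a minimal sketch (the function name `fwer` is my own, not from the article):

```python
def fwer(alpha, c):
    """Upper bound on the familywise error rate: the probability of at
    least one Type I error across c independent comparisons, each run
    at per-test alpha."""
    return 1 - (1 - alpha) ** c

# Ten tests at alpha = .05, matching the worked example above
print(round(fwer(0.05, 10), 3))  # 0.401
```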

## Controlling the FWER

You need to control the FWER for one main reason: if you run enough hypothesis tests (dozens, hundreds, or sometimes tens of thousands), you’re *highly likely* to get at least one significant result purely by chance: a “false alarm” where you incorrectly reject the null hypothesis.
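You can see this inflation with a quick simulation. The sketch below assumes all tests are independent and the null hypothesis is true in every case, so each p-value is uniform on [0, 1]; the function name is my own:

```python
import random

random.seed(1)

def at_least_one_false_positive(alpha=0.05, n_tests=10, n_sims=10_000):
    """Fraction of simulated 'families' of n_tests null tests in which
    at least one p-value falls below alpha (a false alarm)."""
    hits = 0
    for _ in range(n_sims):
        # Under H0, a p-value is uniformly distributed on [0, 1].
        if any(random.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / n_sims

print(at_least_one_false_positive())  # roughly 0.40, matching the bound
```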

Two main procedures are used to control the FWER: single step and sequential.

**Single step**

A single step procedure makes an equal adjustment to every test, keeping the overall alpha level at the desired level (e.g. .05). The best-known example is the Bonferroni correction:

- Divide the alpha level by the number of tests you’re running and apply that alpha level to each individual test. For example, if your overall alpha level is .05 and you are running 5 tests, then each test will have an alpha level of .05/5 = .01.
- Apply the new alpha level to each test for finding p-values. In this example, the p-value would have to be .01 or less for statistical significance.
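The two steps above can be sketched in a few lines (the helper name `bonferroni` is my own):

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni correction: compare each p-value to alpha divided by
    the number of tests, and return a reject decision for each test."""
    per_test_alpha = alpha / len(p_values)
    return [p <= per_test_alpha for p in p_values]

# Five tests at an overall alpha of .05 -> per-test alpha of .05/5 = .01
print(bonferroni([0.004, 0.02, 0.009, 0.3, 0.011]))
# [True, False, True, False, False]
```

Note that .011 fails here even though it would be significant at the unadjusted .05 level; that conservatism is the source of the power criticism discussed below.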

The Bonferroni correction has been criticized for (among other things) loss of power and a high probability of Type II errors.

**Sequential**

Sequential procedures are similar to Bonferroni, but make adaptive adjustments to each test. Several sequential methods exist. The easiest is probably the Holm-Bonferroni method, but others have been developed, including Sidak-Bonferroni and Holland-Copenhaver.

**Holm-Bonferroni**: tests are run and then ordered from lowest to highest p-value. The individual tests are then checked, starting with the one with the lowest p-value, each against a successively less strict Bonferroni-adjusted alpha level. See *Holm-Bonferroni Method* for a step-by-step example.

**Sidak-Bonferroni** (sometimes called the Boole or Dunn approximation): a variant of Bonferroni which uses a Taylor expansion (from calculus).
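The Holm step-down idea can be sketched as follows: the smallest p-value is compared against α/m, the next against α/(m − 1), and so on, stopping at the first failure. The function name is my own:

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm's step-down procedure: sort p-values ascending and compare
    the k-th smallest to alpha / (m - k); stop at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # all larger p-values automatically fail too
    return reject

print(holm_bonferroni([0.01, 0.04, 0.03, 0.005]))
# [True, False, False, True]
```

Because the threshold relaxes at each step, Holm-Bonferroni rejects at least as many hypotheses as the plain Bonferroni correction while still controlling the FWER.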

**References**:

Olejnik, S., Li, J., Supattathum, S., and Huberty, C. J. (1997). Multiple testing and statistical power with modified Bonferroni procedures. *Journal of Educational and Behavioral Statistics*, 22, 389–406.

Holland, B. S., and Copenhaver, M. D. (1987). An improved sequentially rejective Bonferroni test procedure. *Biometrics*, 43, 417–423.




I’m a little lost at the description of the criticism of the Bonferroni method. It is written that the Bonferroni correction has been criticized for loss of power and a high probability of Type II errors. However, since you are reducing the alpha level, how is this a loss of power? Moreover, how does a loss of power lead to a higher probability of Type II errors? Maybe my understanding of power is incorrect, but shouldn’t higher power lead to higher Type II errors?

Thank you in advance.

“Since you are reducing the alpha level, how is this a loss of power?”

Lowering the alpha level for each individual test raises the critical value the test statistic must exceed, so a real effect is less likely to reach significance. That reduced chance of detecting a true effect is exactly what “loss of power” means.

“…how does a loss of power lead to higher probability of Type II error.”

By definition, power is 1 – β, where β is the probability of a Type II error. So high values for beta mean lower power, and vice versa.