Definition of the Empirical Rule
The Empirical Rule states that for a normal distribution:
- About 68% of the data falls within one standard deviation of the mean.
- About 95% falls within two standard deviations.
- About 99.7% falls within three standard deviations.
The rule is also called the 68-95-99.7 Rule or the Three Sigma Rule.
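The three percentages are rounded values of the normal distribution's coverage probabilities. As a minimal sketch, the Python snippet below recomputes them from the standard identity P(|Z| < k) = erf(k / √2) for a standard normal variable Z. The function name `coverage_within` is made up for this illustration.

```python
import math


def coverage_within(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    # For a standard normal Z, P(|Z| < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    # Prints 0.6827, 0.9545 and 0.9973 for k = 1, 2, 3.
    print(f"within {k} standard deviation(s): {coverage_within(k):.4f}")
```

Rounding these values to the nearest tenth of a percent gives 68.3%, 95.4% and 99.7%, which the rule shortens to 68, 95 and 99.7.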
When do we use the Empirical Rule?
The Empirical Rule is often used in statistics for forecasting, especially when the right data is difficult or impossible to obtain. The rule can give you a rough estimate of what your data would look like if you were able to survey the entire population.
This rule applies to a random variable, X, that follows a normal distribution (a bell curve) with mean “mu” (the Greek letter μ) and standard deviation “sigma” (the Greek letter σ). The rule doesn’t apply to distributions that are not normal. For those, Chebyshev’s Theorem gives a looser bound that holds for any distribution with a finite mean and standard deviation.
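To see how much looser Chebyshev’s Theorem is, recall that it guarantees at least 1 − 1/k² of any distribution lies within k standard deviations of the mean. The sketch below evaluates that bound. The function name `chebyshev_lower_bound` is an assumption made for this example, not something from a library.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum fraction of any distribution within k standard deviations of the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Chebyshev's inequality: P(|X - mu| < k * sigma) >= 1 - 1 / k**2.
    # For k <= 1 the formula gives nothing useful, so clamp the result at 0.
    return max(0.0, 1.0 - 1.0 / k**2)


for k in (2, 3):
    # Prints 0.7500 for k = 2 and 0.8889 for k = 3.
    print(f"at least {chebyshev_lower_bound(k):.4f} within {k} standard deviations")
```

So where the Empirical Rule promises about 95% within two standard deviations for normal data, Chebyshev’s Theorem only guarantees 75% for an arbitrary distribution.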
Empirical Rule: Notation
When the Empirical Rule applies to a data set, the following conditions hold:
- Approximately 68% of the data falls within one standard deviation of the mean, that is, between the mean minus one standard deviation and the mean plus one standard deviation. In mathematical notation: μ ± 1σ
- Approximately 95% of the data falls within two standard deviations of the mean, that is, between the mean minus two standard deviations and the mean plus two standard deviations. In mathematical notation: μ ± 2σ
- Approximately 99.7% of the data falls within three standard deviations of the mean, that is, between the mean minus three standard deviations and the mean plus three standard deviations. In mathematical notation: μ ± 3σ
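The notation above can be turned into concrete intervals. The following Python sketch computes μ ± kσ for k = 1, 2, 3 and checks what fraction of a simulated normal sample lands in each interval. The mean of 100, the standard deviation of 15, the sample size, and the helper names `empirical_rule_intervals` and `fraction_within` are all hypothetical choices for illustration.

```python
import random
import statistics


def empirical_rule_intervals(mean: float, sd: float) -> dict:
    """Return {k: (mean - k*sd, mean + k*sd)} for k = 1, 2, 3."""
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}


def fraction_within(data, low: float, high: float) -> float:
    """Fraction of observations that fall in the closed interval [low, high]."""
    return sum(low <= x <= high for x in data) / len(data)


# Simulated data: 10,000 draws from a normal distribution (assumed parameters).
random.seed(0)
sample = [random.gauss(100, 15) for _ in range(10_000)]

mu = statistics.fmean(sample)      # sample mean
sigma = statistics.pstdev(sample)  # standard deviation of the sample

for k, (low, high) in empirical_rule_intervals(mu, sigma).items():
    share = fraction_within(sample, low, high)
    print(f"mu ± {k}σ = ({low:.1f}, {high:.1f}) contains {share:.1%} of the sample")
```

With a sample this large, the printed shares should land close to 68%, 95% and 99.7%. Data that are far from normal can give noticeably different shares.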