Average (Mean) Deviation
Contents:
- What is average deviation?
- How to find average deviation
- Workaround to avoid the zero problem
- Mean deviation for a frequency table
- Absolute Deviation vs Average Deviation
- Standard Deviation vs. Average Deviation
What is average deviation?
Deviation measures the difference between an observed value and the mean value of a data set. It represents the distance of a data point from the distribution’s center point. Similarly, mean deviation (also called Average Mean Deviation (AMD) or just Average Deviation) measures how far, on average, the values in a dataset deviate from its center.
The average / mean deviation isn’t used very often as it is not a reliable measure of variability [1]. The average deviation obtained from a sample is a biased estimator for the population mean deviation; in other words, its average value usually doesn’t match the population’s mean deviation [2]. In addition, if the deviations are averaged with their signs intact, the result always equals zero, but there are a couple of workarounds.
How to find average deviation
The formula for the mean deviation (MD) of a population is
MDpopulation = [Σ |X – µ|] / N
where
- Σ = summation of values.
- X = each value in the dataset.
- µ = the population mean.
- N = the number of data points in the population.
For a sample, the formula is the same, but the notation is slightly different, with x̄ — the sample mean — replacing µ, the population mean:
MDsample = [Σ |X – x̄|] / n
where
- Σ = summation (addition) of values.
- X = each value in the dataset.
- x̄ = the sample mean.
- n = the number of items in the sample.
The steps are exactly the same whether you are working with a sample or an entire population:
- Calculate the mean of the dataset.
- Subtract the mean from each value and take the absolute value of the result.
- Find the mean of the values from step 2.
Example: Find the average deviation of the following set of numbers: 3, 8, 8, 8, 8, 9, 9, 9, 9.
- Calculate the mean: (3 + 8 + 8 + 8 + 8 + 9 + 9 + 9 + 9) / 9 = 71 / 9 = 7.89.
- Subtract the mean from each data point and take the absolute value:
- |3 – 7.89| = 4.89
- |8 – 7.89| = 0.11
- |8 – 7.89| = 0.11
- |8 – 7.89| = 0.11
- |8 – 7.89| = 0.11
- |9 – 7.89| = 1.11
- |9 – 7.89| = 1.11
- |9 – 7.89| = 1.11
- |9 – 7.89| = 1.11
- Add up all of the deviations from Step 2: 4.89 + 0.11 + 0.11 + 0.11 + 0.11 + 1.11 + 1.11 + 1.11 + 1.11 = 9.77.
- Divide by the number of items in your data set. There are 9 items, so: 9.77/9 = 1.09.
The average deviation is 1.09.
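The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a library routine; the function name `mean_deviation` is chosen here for the example. Because it keeps the unrounded mean, the intermediate values differ slightly from the hand calculation, but the result agrees to two decimal places.

```python
def mean_deviation(values):
    """Average of the absolute deviations from the mean."""
    n = len(values)
    mean = sum(values) / n                       # step 1: the mean
    abs_devs = [abs(x - mean) for x in values]   # step 2: absolute deviations
    return sum(abs_devs) / n                     # step 3: mean of those deviations


data = [3, 8, 8, 8, 8, 9, 9, 9, 9]
md = mean_deviation(data)
print(round(md, 2))  # 1.09
```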
Workaround to avoid the zero problem
The mean deviation has a major problem: if the deviations are summed with their signs intact, the positive and negative values cancel and the result is zero. A sum that is always zero cannot be used to compare variability between different distributions [3]. For example, the following table shows the signed deviations for (1, 2, 3, 4, 5), which has a mean of 3:
X | X – mean |
---|---|
1 | 1 – 3 = -2 |
2 | 2 – 3 = -1 |
3 | 3 – 3 = 0 |
4 | 4 – 3 = 1 |
5 | 5 – 3 = 2 |
Total | Σ = 0 |
Averaging these signed deviations gives zero, which obviously isn’t a true picture of the spread, as the numbers are not all equal to the mean.
The sum of deviations will always be zero because of a property of the mean: the total of the deviations below the mean always equals, in magnitude, the total of the deviations above it. As the goal is to capture the magnitude of these deviations in a summary measure, there are a couple of workarounds [2]:
- Square each deviation from the mean. This is the most popular workaround.
- Take absolute values. This is what the mean deviation formula above does. While it works, absolute values can cause problems in mathematical proofs, so squaring is often preferred.
Let’s repeat the process, this time squaring the deviations before summing them:
X | X – mean | (X – mean)² |
---|---|---|
1 | 1 – 3 = -2 | -2 * -2 = 4 |
2 | 2 – 3 = -1 | -1 * -1 = 1 |
3 | 3 – 3 = 0 | 0 * 0 = 0 |
4 | 4 – 3 = 1 | 1 * 1 = 1 |
5 | 5 – 3 = 2 | 2 * 2 = 4 |
Total | Σ = 0 | Σ = 10 |
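As a quick check of the zero problem and both workarounds, the following sketch computes the signed, squared, and absolute sums for the same five numbers. The variable names are illustrative only.

```python
values = [1, 2, 3, 4, 5]
mean = sum(values) / len(values)          # 3.0

deviations = [x - mean for x in values]   # signed deviations

signed_sum = sum(deviations)                     # positives and negatives cancel
squared_sum = sum(d ** 2 for d in deviations)    # workaround 1: square
absolute_sum = sum(abs(d) for d in deviations)   # workaround 2: absolute value

print(signed_sum, squared_sum, absolute_sum)  # 0.0 10.0 6.0
```

Dividing the absolute sum by the number of values (6 / 5 = 1.2) gives the mean deviation for this data set.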
Mean deviation for a frequency table
Consider the following table of data:
Test | No. of students (xi) | Classes per week (fi) |
---|---|---|
Statistics | 6 | 5 |
Science | 5 | 7 |
Film studies | 9 | 4 |
Economics | 12 | 9 |
To calculate the mean deviation for a frequency table, we need a slightly different formula — one that takes into account the frequency (the weight of the number of classes per week):
Mean Deviation = Σ(f * |x – mean|) / Σf
where:
- Σf = the sum of the frequencies
- x = the data point
- mean = the mean of the data set
- |x – mean| = the absolute value of the difference between the data point and the mean
- f = the frequency of the data point.
The steps are exactly the same whether you are working with samples or populations:
- Find the mean of the frequency distribution with the formula Σfx / Σf.
- Subtract the mean from each value and record the absolute value of the result. This is |x – mean|.
- Multiply each frequency by |x – mean|.
- Add up the results from step 3 and divide by Σf.
Putting the results in a table:
x | f | x * f | |x – mean| | f * |x – mean| |
---|---|---|---|---|
6 | 5 | 30 | |6 – 8.36| = 2.36 | 5 * 2.36 = 11.8 |
5 | 7 | 35 | |5 – 8.36| = 3.36 | 7 * 3.36 = 23.52 |
9 | 4 | 36 | |9 – 8.36| = 0.64 | 4 * 0.64 = 2.56 |
12 | 9 | 108 | |12 – 8.36| = 3.64 | 9 * 3.64 = 32.76 |
Totals (sums) | Σ = 25 | Σ = 209; mean = 209 / 25 = 8.36 | | Σ = 70.64 |
Now we can substitute our results into the formula:
Mean Deviation = 70.64 / 25 = 2.8256.
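The frequency-table calculation can be sketched the same way. The function name `mean_deviation_freq` and the two-list layout (values in `xs`, frequencies in `fs`) are assumptions made for this example.

```python
def mean_deviation_freq(xs, fs):
    """Mean deviation for a frequency table: sum(f * |x - mean|) / sum(f)."""
    total_f = sum(fs)
    # Weighted mean: sum(f * x) / sum(f)
    mean = sum(f * x for x, f in zip(xs, fs)) / total_f
    # Weight each absolute deviation by its frequency
    weighted = sum(f * abs(x - mean) for x, f in zip(xs, fs))
    return weighted / total_f


xs = [6, 5, 9, 12]   # data points (x)
fs = [5, 7, 4, 9]    # frequencies (f)
result = mean_deviation_freq(xs, fs)
print(round(result, 4))  # 2.8256
```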
Absolute Deviation vs Average Deviation
Absolute deviation is the distance between each value in the data set and that data set’s mean or median. To find the distance:
- Subtract each value from the mean. For example, let’s say you have 5 values: 1, 5, 10, 15 and 19, which have a mean of 10. The deviations are:
- 10 – 1 = 9
- 10 – 5 = 5
- 10 – 10 = 0
- 10 – 15 = -5
- 10 – 19 = -9
- Take the absolute value of the numbers found. The absolute value of -5 is 5, and of -9 is 9. The final list of values would be 9, 5, 0, 5, and 9.
Take all of these absolute deviations, find the average, and you have the average deviation: (9 + 5 + 0 + 5 + 9) / 5 = 28 / 5 = 5.6.
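The relationship between the two terms can be illustrated with a short sketch: the absolute deviations are a list with one entry per value, while the average deviation is the single number obtained by averaging that list. Variable names are illustrative.

```python
values = [1, 5, 10, 15, 19]
mean = sum(values) / len(values)             # 10.0

abs_devs = [abs(x - mean) for x in values]   # one absolute deviation per value
avg_dev = sum(abs_devs) / len(abs_devs)      # their average

print(abs_devs)  # [9.0, 5.0, 0.0, 5.0, 9.0]
print(avg_dev)   # 5.6
```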
Standard Deviation vs. Average Deviation
The average (absolute) deviation is used less frequently than the standard deviation, but the two are closely related: both are measures of spread. Two data sets with different spreads can occasionally produce exactly the same absolute deviation; however, the same is true of the standard deviation. Some authors consider the mean absolute deviation (MAD) more accurate for real-life situations and have suggested that it should replace the standard deviation for real-life data. It is also a lot simpler to calculate.
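As a hedged illustration of two differently spread data sets sharing the same average deviation, the sketch below compares two made-up lists (they are not from the references). It uses the population standard deviation `statistics.pstdev` from the Python standard library; the helper `mean_abs_deviation` is defined here for the example.

```python
import statistics


def mean_abs_deviation(values):
    """Average of the absolute deviations from the mean."""
    mean = statistics.fmean(values)
    return sum(abs(x - mean) for x in values) / len(values)


# Two hypothetical data sets, both with mean 0
a = [-2, -2, 2, 2]
b = [-4, 0, 0, 4]

for name, data in (("a", a), ("b", b)):
    md = mean_abs_deviation(data)
    sd = statistics.pstdev(data)   # population standard deviation
    print(name, md, round(sd, 2))
```

Both lists have an average deviation of 2, but the standard deviation is larger for `b` because squaring gives more weight to the values farthest from the mean.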
References
- [1] Morrissette, R. N. Statistics for the Behavioral Sciences, Lesson 5: Measures of Variability.
- [2] Emory, Oxford College (2021). http://mathcenter.oxford.emory.edu/site/math117/shapeCenterAndSpread/
- [3] Sullivan, L. & LaMorte, W. (2016). Variance and Standard Deviation.