
## What is a Seven Number Summary?

A seven number summary is a set of seven percentiles that describe a data set, chosen with the normal distribution in mind. It can serve as a simple check for the assumption of normality, which many statistical tests require. The summary is usually made up of:

- The 2nd percentile.
- The 9th percentile.
- The 25th percentile (i.e. the lower quartile or Q1).
- The 50th percentile (the median).
- The 75th percentile (i.e. the upper quartile, or Q3).
- The 91st percentile.
- The 98th percentile.

The percentiles at the beginning (2nd/9th) and end (91st/98th) are used because, if the data comes from a normal distribution, the seven numbers in the summary will be approximately evenly spaced (roughly two-thirds of a standard deviation apart).
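The even spacing can be checked numerically. The sketch below is illustrative only: the function name `seven_number_summary`, the simulated sample, and the use of NumPy's default (linear interpolation) percentile method are assumptions, not part of any standard definition.

```python
import numpy as np
from statistics import NormalDist

# The seven percentiles listed above.
PERCENTILES = [2, 9, 25, 50, 75, 91, 98]

def seven_number_summary(data):
    """Return the 2nd, 9th, 25th, 50th, 75th, 91st and 98th percentiles."""
    return np.percentile(np.asarray(data, dtype=float), PERCENTILES)

# Theoretical positions of these percentiles on a standard normal curve,
# measured in standard deviations from the mean.
z_scores = [NormalDist().inv_cdf(p / 100) for p in PERCENTILES]
z_gaps = np.diff(z_scores)
print("gaps between theoretical z-scores:", np.round(z_gaps, 3))

# Simulated normal data (assumed mean 50, standard deviation 10).
rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=10, size=10_000)
summary = seven_number_summary(sample)
print("sample summary:", np.round(summary, 2))
print("gaps between sample values:", np.round(np.diff(summary), 2))
```

If the gaps between consecutive values in a sample's summary are very uneven, that is a hint (not a formal test) that the data may not be normal.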

## Alternate Versions

1. A less common version of the summary, which also results in even spacing, is expressed in terms of the mean (μ) and standard deviation (σ).

2. The seven number summary is also occasionally reported as more of a box plot-related summary, with references to Tukey’s method for finding outliers (1.5 * IQR):

- Minimum,
- Lower fence (Q1 – (1.5 * IQR)),
- Lower hinge (usually the first quartile),
- Median,
- Upper hinge (usually the third quartile),
- Upper fence (Q3 + (1.5 * IQR)),
- Maximum.

3. Rammensee et al. (2015) define a seven-number summary in a completely different way: as the mean, median, standard deviation, 95th and 5th percentiles, minimum and maximum. Other instances of this definition do appear in academic literature, for example in master’s theses.

4. Yet another version (Shoemaker, n.d.) contains the mean, the minimum, the maximum, the first and third quartiles, the median, plus “the number of non-missing observations.” Shoemaker states that this allows you to “…place the center of the distribution and know its rough shape and density.”
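Two of the alternate versions above (the box plot-related version and Shoemaker's version) can be sketched as follows. The function names are hypothetical, the hinges are taken to be the quartiles as computed by NumPy's default percentile method, the upper fence follows Tukey's rule (Q3 + 1.5 * IQR), and missing observations are assumed to be encoded as NaN.

```python
import numpy as np

def boxplot_seven_number_summary(data):
    """Box plot-style summary: extremes, Tukey fences, hinges and median."""
    x = np.asarray(data, dtype=float)
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    iqr = q3 - q1  # interquartile range
    return {
        "minimum": x.min(),
        "lower_fence": q1 - 1.5 * iqr,
        "lower_hinge": q1,
        "median": median,
        "upper_hinge": q3,
        "upper_fence": q3 + 1.5 * iqr,
        "maximum": x.max(),
    }

def shoemaker_seven_number_summary(data):
    """Mean, extremes, quartiles, median and the count of non-missing values."""
    x = np.asarray(data, dtype=float)
    x = x[~np.isnan(x)]  # drop missing observations (assumed to be NaN)
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    return {
        "mean": x.mean(),
        "minimum": x.min(),
        "maximum": x.max(),
        "q1": q1,
        "q3": q3,
        "median": median,
        "n_non_missing": int(x.size),
    }

box = boxplot_seven_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
shoe = shoemaker_seven_number_summary([1, 2, np.nan, 3, 4, 5])
print(box)
print(shoe)
```

The two functions return different statistics for the same idea of a "seven number summary", which is why it matters which definition is being asked for.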

The takeaway is: **if you’re asked to find a seven number summary, check with your professor and/or textbook author to make sure you’re getting the right statistics.** Unlike the five number summary, which has a standard meaning, the seven number summary has many meanings, depending on the author and the situation.

## Similarity to the Five Number Summary

The seven number summary is similar to the five number summary, which is made up of:

- Minimum,
- Lower Quartile,
- Median,
- Upper Quartile,
- Maximum.

While the five number summary can apply to *any* distribution, the seven number summary **usually only applies to data that comes from a normal distribution.** In the seven number summary, the minimum in the five number summary is replaced by the 2nd and 9th percentiles, and the maximum is replaced by the 91st and 98th percentiles.
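Computing both summaries side by side shows which values are shared and which are replaced. This is a minimal sketch: the function names are illustrative, and the data set is simply the integers 0 to 100, chosen so that each percentile is easy to read off.

```python
import numpy as np

def five_number_summary(data):
    """Minimum, Q1, median, Q3, maximum (0th and 100th percentiles are the extremes)."""
    return np.percentile(np.asarray(data, dtype=float), [0, 25, 50, 75, 100])

def seven_number_summary(data):
    """2nd, 9th, 25th, 50th, 75th, 91st and 98th percentiles."""
    return np.percentile(np.asarray(data, dtype=float), [2, 9, 25, 50, 75, 91, 98])

data = np.arange(101)  # the integers 0, 1, ..., 100
five = five_number_summary(data)
seven = seven_number_summary(data)

print("five number summary: ", five)
print("seven number summary:", seven)
```

The middle three values (Q1, median, Q3) are the same in both summaries; only the ends differ.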

## Non-Parametric Version

While the seven number summary described above is designed for the normal distribution, Bowley's version of the summary can, like the five number summary, be applied to any distribution. This summary, sometimes called the seven-figure summary, is not used very often, although it occasionally appears in the literature. It is made up of:

- The minimum.
- The 10th percentile (i.e. the first decile).
- The 25th percentile (i.e. the lower quartile or Q1).
- The 50th percentile (the median).
- The 75th percentile (i.e. the upper quartile, or Q3).
- The 90th percentile.
- The maximum.
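The list above translates directly into a percentile calculation. The sketch below is illustrative: the function name `bowley_seven_figure_summary` is hypothetical, the percentiles use NumPy's default interpolation, and the skewed (exponential) sample is simulated only to show that no normality assumption is involved.

```python
import numpy as np

def bowley_seven_figure_summary(data):
    """Minimum, 10th, 25th, 50th, 75th and 90th percentiles, and maximum."""
    x = np.asarray(data, dtype=float)
    # The 0th and 100th percentiles are the minimum and maximum.
    return np.percentile(x, [0, 10, 25, 50, 75, 90, 100])

# A clearly non-normal, right-skewed sample (assumed for illustration).
rng = np.random.default_rng(1)
skewed = rng.exponential(scale=2.0, size=1_000)
skewed_summary = bowley_seven_figure_summary(skewed)
print(np.round(skewed_summary, 3))
```

For skewed data like this, the gaps between the upper values are typically wider than the gaps between the lower values, which the summary makes easy to see.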

**References**:

Bowley, A. (1920). Elementary Manual of Statistics, 3rd ed., p. 62.

DeVeaux, R., Velleman, P. & Bock, D. (2006). Intro Stats, 3rd ed. Pearson/Addison-Wesley.

Mock, R. (2011). Data Quality Analysis for Food Composition Databases. Retrieved 1/10/2017 from: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.463.5857&rep=rep1&type=pdf

Rammensee et al. (2015). Dynamics of Mechanosensitive Neural Stem Cell Differentiation. *Stem Cells*. Aug 30. Retrieved 1/10/2017 from: http://www.cchem.berkeley.edu/schaffer/2016%20Publications/Pub.2.pdf

Shoemaker, J. (undated). Field Content Verification Using SQL Dictionary Tables. Retrieved 1/10/2007 from: http://www.lexjansen.com/nesug/nesug96/NESUG96010.pdf
