What is a Boxplot?
A boxplot, or box-and-whisker diagram, is a way to show the spread and center of a data set. Measures of spread include the range and the interquartile range. Measures of center include the mean (average) and the median (the middle of a data set). Because a boxplot marks the median and the interquartile range directly, it makes it much easier to see where your data is centered and how spread out it is.
How to read a box plot
A boxplot is a way to show a five number summary (minimum, first quartile, median, third quartile and maximum) in a chart. The main part of the chart (the “box”) shows where the middle portion of the data is: the interquartile range. The ends of the box show the first quartile (the 25% mark) and the third quartile (the 75% mark). On a horizontal boxplot, the far left of the chart (at the end of the left “whisker”) is the minimum and the far right (at the end of the right “whisker”) is the maximum. The median is represented by a vertical bar inside the box; it sits in the exact center of the box only when the data is symmetric. Box plots show less detail than a chart of the full distribution, such as a histogram. However, they can be a useful tool for getting a quick summary of data.
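To make the five number summary concrete, here is a minimal Python sketch that computes it with the standard library. The function name `five_number_summary` and the sample data are made up for illustration. Quartiles can be calculated by several slightly different conventions; this sketch assumes the "inclusive" method of `statistics.quantiles`, so other tools may give slightly different values for Q1 and Q3.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(data)
    # Assumption: the "inclusive" quartile convention. Other conventions
    # can give slightly different Q1 and Q3 values for the same data.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

# Made-up sample data, for illustration only
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(five_number_summary(sample))  # (1, 3.0, 5.0, 7.0, 9)
```

These five values are exactly what the box, the bar inside it and the two whisker tips mark on the chart.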
How to read a box plot: Steps
Step 1: Find the minimum.
The minimum is the far left hand side of the graph, at the tip of the left whisker. For this graph, the left whisker end is at approximately 0.75.
Step 2: Find Q1, the first quartile.
Q1 is represented by the far left hand side of the box. In this case, about 2.5.
Step 3: Find the median.
The median is represented by the vertical bar. In this boxplot, it can be found at about 6.5.
Step 4: Find Q3, the third quartile.
Q3 is the far right hand edge of the box, at about 12 in this graph.
Step 5: Find the maximum.
The maximum is the far right hand side of the graph, at the tip of the right whisker: in this graph, at approximately 16.
All done. That’s how to read a box plot!
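Once the five values have been read off the graph, the measures of spread follow by subtraction. The sketch below uses the approximate readings from the steps above; the function name `spread_from_boxplot` is hypothetical, and the results are only as precise as the readings themselves.

```python
def spread_from_boxplot(minimum, q1, median, q3, maximum):
    """Derive measures of spread from values read off a boxplot."""
    return {
        "range": maximum - minimum,    # whisker tip to whisker tip
        "iqr": q3 - q1,                # width of the box
        "lower_half_box": median - q1, # box width left of the median bar
        "upper_half_box": q3 - median, # box width right of the median bar
    }

# Approximate values read from the example graph in Steps 1 to 5
result = spread_from_boxplot(minimum=0.75, q1=2.5, median=6.5, q3=12, maximum=16)
print(result["range"])  # 15.25
print(result["iqr"])    # 9.5
```

In this example the part of the box to the right of the median (5.5) is wider than the part to the left (4.0), which suggests the middle half of the data is stretched toward the higher values.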
Note on Outliers:
Data sets can sometimes contain outliers that are suspected to be anomalies (perhaps because of data collection errors or just plain old flukes). If outliers are present, the whisker on the appropriate side stops short of the data minimum or the data maximum. A common convention is to draw the whisker only as far as the most extreme data point that lies within 1.5*IQR of the edge of the box; some charts draw it to exactly 1.5*IQR beyond the box instead. Small circles or unfilled dots are drawn on the chart to indicate where suspected outliers lie. Some charts use filled circles for known outliers.
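The 1.5*IQR rule can be sketched in a few lines of Python. This is one common convention rather than the only one: it assumes the whiskers end at the most extreme data points inside the 1.5*IQR limits and that quartiles use the "inclusive" method. The function name `whiskers_and_outliers` and the sample data are made up for illustration.

```python
import statistics

def whiskers_and_outliers(data, k=1.5):
    """Return (lower whisker, upper whisker, suspected outliers)."""
    ordered = sorted(data)
    # Assumption: "inclusive" quartiles; other conventions may differ slightly.
    q1, _, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    lower_limit = q1 - k * iqr
    upper_limit = q3 + k * iqr
    # Points inside the limits decide where the whiskers end
    inside = [x for x in ordered if lower_limit <= x <= upper_limit]
    # Points outside the limits are plotted individually as suspected outliers
    outliers = [x for x in ordered if x < lower_limit or x > upper_limit]
    return inside[0], inside[-1], outliers

# Made-up sample data with one unusually large value
sample = [1, 2, 3, 4, 5, 6, 7, 8, 30]
print(whiskers_and_outliers(sample))  # (1, 8, [30])
```

Here Q1 is 3 and Q3 is 7, so the IQR is 4 and the limits are -3 and 13. The value 30 falls outside them, so the right whisker stops at 8 and 30 is drawn as a separate point.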