What is a bin (or Class Interval) in statistics?
What is a Bin in statistics? Overview
In statistics, data is usually sorted in one way or another. You might sort the data into classes, into categories, by range, or by placement on the number line. A bin, sometimes called a class interval, is a range of values into which data points are grouped in a histogram. It’s very similar to the idea of putting data into categories.
What is a bin in statistics: Why not use “Categories” instead of Class Intervals?
When you put data into categories, you often do so without much thought about what the grouping might reveal. Basic sorting into categories like male/female or yes/no does exist in statistics, but when you make a histogram you are aiming for a chart that shows how your data is spread out. You therefore want to choose the categories (classes) carefully. You can think of a bin as a physical bin into which you sort objects.
Imagine you’re working in a clothing store and want to know which type of shoe is most common in your inventory. If you use only one bin, it will overflow pretty fast and tell you nothing. You could try using different bins for flats, heels, sneakers and sandals. That might give you a better idea about your inventory. Or you could add further bins for black heels, white heels and so on. It’s the same principle when choosing bins for a histogram: you want to choose the right number of bins to give you the information you need.
What is a bin in statistics: Choosing bins
For simple histograms, you can usually choose bins by hand. For example, if you are making a histogram of exam scores, bins that match grade bands (70-79, 80-89, 90-100) are a fairly obvious choice. Each bin has two numbers associated with it: the low value (sometimes called bin low), which in this example is 70, 80, or 90, and the high value (sometimes called bin high), which in this example is 79, 89, or 100.
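As a rough illustration of the bin low / bin high idea, the short Python sketch below counts a handful of exam scores into the three grade bins from the example. The scores are made up, and the helper name count_into_bins is hypothetical; it assumes both ends of each bin are inclusive, as the 70-79, 80-89, 90-100 labels suggest.

```python
def count_into_bins(values, bins):
    """Count how many values fall in each (low, high) bin, both ends inclusive."""
    counts = {b: 0 for b in bins}
    for v in values:
        for low, high in bins:
            if low <= v <= high:
                counts[(low, high)] += 1
                break  # each value belongs to at most one bin
    return counts


# Bins from the exam example, written as (bin low, bin high)
grade_bins = [(70, 79), (80, 89), (90, 100)]

# Made-up scores, for illustration only
scores = [72, 75, 81, 84, 88, 89, 93, 100]

counts = count_into_bins(scores, grade_bins)
for (low, high), n in counts.items():
    print(f"{low}-{high}: {n}")
```

A score below 70 would not be counted at all in this sketch, because no bin covers it. In practice you would add bins until the whole range of your data is covered.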
In most cases, though, choosing bins isn’t that simple, especially for large data sets. When dealing with large sets of numbers, you’re usually better off using software such as Microsoft Excel to create a histogram (how to create a histogram in Excel 2007), because if your bin choice doesn’t produce a readable chart, you can change the bin values and let the chart update without redrawing it by hand.
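The same "change the bins and look again" workflow can be sketched in a few lines of Python rather than Excel. This is only an illustration: the data values are made up, histogram_counts is a hypothetical helper, and it assumes equal-width bins that include their low value but not their high value.

```python
from collections import Counter


def histogram_counts(values, bin_width, start=0):
    """Group values into equal-width bins [low, low + bin_width) and count them."""
    counts = Counter()
    for v in values:
        # Round the value down to the low edge of the bin that contains it
        low = start + ((v - start) // bin_width) * bin_width
        counts[low] += 1
    return dict(sorted(counts.items()))


# Made-up data, for illustration only
data = [3, 7, 8, 12, 15, 18, 21, 22, 22, 29, 34, 41]

wide = histogram_counts(data, bin_width=20)    # few, wide bins
narrow = histogram_counts(data, bin_width=10)  # more, narrower bins

for label, result in (("width 20", wide), ("width 10", narrow)):
    print(label)
    for low, n in result.items():
        print(f"  {low:>2}+ | {'#' * n}")
```

Rerunning with a different bin_width plays the role of editing the bin values in a spreadsheet: the counts are recomputed and the text chart is redrawn each time.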