## What is an Open Ended Distribution?

An open ended distribution is one in which one or more of the classes (or bins) is open ended. In other words, the class is missing either its lower or its upper boundary. For example, a frequency distribution table of heights whose first row is “57″ or less” is an open ended distribution.

A frequency histogram can also show an open ended distribution, for example one whose upper bin shows book prices of “31 and up.”

The opposite would be a closed ended distribution. For example, in a table whose classes clearly start at 118 and end at 157, every class has both a lower and an upper boundary. For whatever reason, the researcher has no interest in numbers below or above those points.

## Why are open ended distributions necessary?

Open ended distributions are usually a matter of choice. It depends on the type of research you are doing and what you want to find out from your data. For example, let’s say you are making a frequency distribution table of family size. You poll 100 families and get the following data:

- One child: 28 families.
- Two children: 33 families.
- Three children: 28 families.
- Four children: 6 families.
- Six children: 2 families.
- Nine children: 1 family.
- Ten children: 1 family.
- Twenty children: 1 family.

You could summarize this data in a table like this:

Number of children | Number of families |
---|---|
1 | 28 |
2 | 33 |
3 | 28 |
4 | 6 |
6 | 2 |
9 | 1 |
10 | 1 |
20 | 1 |

But as you can probably guess, if you did a larger poll, you could end up with dozens of categories. In fact, one lucky(?) couple had 69 children. In most cases you don’t really care about exact family size. You might be comparing the socio-economic status of smaller families (two children and under) with larger families (three or more). Then it makes sense to report the data as an open ended distribution. The modified frequency distribution table is open ended at “4 or more,” a class that combines the 11 families with four or more children (6 + 2 + 1 + 1 + 1):

Number of children | Number of families |
---|---|
1 | 28 |
2 | 33 |
3 | 28 |
4 or more | 11 |
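The collapsing step can be sketched in Python. This is only a minimal illustration: the function name `collapse_top` and its `cutoff` parameter are hypothetical names chosen for this example, and the counts are the poll results listed earlier.

```python
# Exact counts from the poll: number of children -> number of families.
exact_counts = {1: 28, 2: 33, 3: 28, 4: 6, 6: 2, 9: 1, 10: 1, 20: 1}


def collapse_top(counts, cutoff):
    """Merge every class at or above `cutoff` into one open-ended class.

    `collapse_top` is a hypothetical helper written for this example.
    """
    table = {}
    for size in sorted(counts):
        # Sizes below the cutoff keep their own class; the rest share one label.
        label = str(size) if size < cutoff else f"{cutoff} or more"
        table[label] = table.get(label, 0) + counts[size]
    return table


open_ended = collapse_top(exact_counts, cutoff=4)
for label, families in open_ended.items():
    print(f"{label:>10} | {families}")
```

With a cutoff of 4, the open-ended class holds the 11 families that have four or more children, and the table still accounts for all 100 families.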

## Avoiding Open-Ended Classes

Sometimes using open-ended classes is unavoidable, but they can cause problems with calculations and interpretation. For example, if I have two classes:

- < 100
- ≥ 100

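Both classes are unbounded on one side, so either could contain values in the thousands (negative in the first class, positive in the second) or values that approach infinity. Without a second boundary, a class has no midpoint, which makes grouped statistics such as the mean hard to calculate. You also run the risk of very imbalanced classes if the bulk of your data falls into an open-ended class.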
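To see how an open-ended class can affect a calculation, here is a rough Python sketch that reuses the family-size poll from earlier in this article. The assumed values for the open class (4, 6 and 10) are arbitrary guesses chosen for illustration, and the function names are hypothetical.

```python
# Family-size poll from the example above: children -> families.
exact_counts = {1: 28, 2: 33, 3: 28, 4: 6, 6: 2, 9: 1, 10: 1, 20: 1}

# Grouped version: the last class is open ended, so it has no upper boundary.
closed_classes = {1: 28, 2: 33, 3: 28}
open_class_families = 11  # the "4 or more" class


def mean_from_counts(counts):
    """Mean of a frequency table whose classes are single exact values."""
    total = sum(value * freq for value, freq in counts.items())
    return total / sum(counts.values())


def grouped_mean(assumed_value):
    """Mean of the grouped table if every family in the open class is
    assumed (hypothetically) to have `assumed_value` children."""
    counts = dict(closed_classes)
    counts[assumed_value] = counts.get(assumed_value, 0) + open_class_families
    return mean_from_counts(counts)


exact_mean = mean_from_counts(exact_counts)
print(f"exact mean: {exact_mean:.2f}")
for guess in (4, 6, 10):
    print(f"open class treated as {guess}: {grouped_mean(guess):.2f}")
```

For this data the exact mean is 2.53 children, while the grouped estimates come out as 2.22, 2.44 and 2.88 depending on the value assumed for the open class. The median (2 children) is unaffected here, because it falls in a closed class.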
