## What is a Categorical Distribution?

The term “categorical distribution” has come to mean two separate things. Informally, it’s *any distribution with categories*. More precisely, it’s **a generalization of the Bernoulli distribution for a categorical random variable**. While a random variable in a Bernoulli distribution has two possible outcomes, a categorical random variable can have more than two.

The sample space for a Bernoulli distribution is {0, 1}; for a categorical distribution with k categories, it is usually written {1, 2, …, k}. For example, a single roll of a die, which has six outcomes {1, 2, 3, 4, 5, 6}, follows a categorical distribution. The categorical distribution is the special case of the multinomial distribution with a single trial.
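The relationship between the two sample spaces can be sketched in a few lines of Python. This is a minimal illustration, not a library API: the helper `categorical_pmf`, the Bernoulli probabilities 0.3 and 0.7, and the seed are assumptions chosen for the example.

```python
import random

def categorical_pmf(probs, outcome):
    """Return P(X = outcome) for a categorical distribution.

    `probs` maps each category label to its probability
    (a hypothetical helper written for this example).
    """
    total = sum(probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Labels outside the sample space have probability 0.
    return probs.get(outcome, 0.0)

# Bernoulli: the two-category special case, with sample space {0, 1}.
# The values 0.3 and 0.7 are illustrative assumptions.
bernoulli = {0: 0.3, 1: 0.7}

# A fair six-sided die: six categories, each with probability 1/6.
die = {face: 1 / 6 for face in range(1, 7)}

# One draw (a single trial) from the die's categorical distribution.
rng = random.Random(0)  # fixed seed so the draw is repeatable
faces = list(die)
single_roll = rng.choices(faces, weights=[die[f] for f in faces], k=1)[0]
```

Here the Bernoulli case is just a categorical distribution with two entries, and the die adds more categories without changing the structure.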

As this distribution only deals with discrete outcomes, it is sometimes called a **discrete categorical distribution**.

## Examples

Throw a six-sided die fifty times and observe the outcomes. The possible outcomes (the sample space) are 1, 2, 3, 4, 5, 6. Each outcome has a probability of 1/6. The number of trials, “n”, is 50.
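The die experiment can be simulated as a rough sketch. The seed and the variable names are assumptions made for illustration; with only fifty rolls, the observed shares will usually differ somewhat from the theoretical 1/6.

```python
import random
from collections import Counter

N_TRIALS = 50               # fifty throws, as in the example
FACES = [1, 2, 3, 4, 5, 6]  # the sample space

rng = random.Random(42)     # fixed seed so the run is repeatable
rolls = [rng.choice(FACES) for _ in range(N_TRIALS)]

# Tally how often each face came up.
counts = Counter(rolls)
for face in FACES:
    share = counts[face] / N_TRIALS
    print(f"face {face}: {counts[face]} rolls, observed share {share:.2f}")
```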

Count the number of times each word appears in a book. The possible outcomes are the distinct words in the text (e.g. the, and, to…). The probability of each word depends on how often it occurs relative to the total number of words in the text.
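As a small sketch of the word example, the snippet below estimates word probabilities from relative frequencies. The sample sentence stands in for a book and is invented for illustration, and the simple regular-expression tokenizer is an assumption rather than a recommended way to split text.

```python
import re
from collections import Counter

# A short stand-in for the text of a book (invented for this example).
text = "the cat and the dog went to the park and the cat slept"

# Naive tokenizer: lowercase runs of letters and apostrophes.
words = re.findall(r"[a-z']+", text.lower())

counts = Counter(words)  # occurrences of each distinct word
total = len(words)       # total number of words in the text

# Empirical probability of each category (word).
probs = {word: count / total for word, count in counts.items()}

for word, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{word}: {counts[word]} of {total} words, p = {p:.3f}")
```

The categories here are the distinct words, and the estimated probabilities sum to 1 across them.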

Count the number of times a player on a hockey team scores a goal. The possible number of outcomes (i.e. goals) depends on the length of the game but will likely be somewhere between 0 and 10. Different probabilities will be assigned to the players, depending on their positions and their ability. For example, a forward has a higher chance of scoring a goal than a goalkeeper, and a forward with an excellent scoring record has a higher probability than a player with a poor record.
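One way to read the hockey example as a categorical distribution is to treat each goal as a trial whose outcome is the player who scored. The sketch below does that with unequal probabilities. The player labels, the probabilities, the number of simulated goals, and the seed are all hypothetical values picked for illustration, not real statistics.

```python
import random
from collections import Counter

# Hypothetical probability that each player scores a given goal.
scoring_probs = {
    "forward_strong_record": 0.50,
    "forward_poor_record": 0.28,
    "defender": 0.20,
    "goalkeeper": 0.02,
}

players = list(scoring_probs)
weights = [scoring_probs[p] for p in players]

rng = random.Random(7)  # fixed seed so the run is repeatable
N_GOALS = 1000          # simulated goals (an assumption for the sketch)

# Each goal is one draw from the categorical distribution over players.
scorers = rng.choices(players, weights=weights, k=N_GOALS)
tally = Counter(scorers)

for player in players:
    print(f"{player}: {tally[player]} of {N_GOALS} goals")
```

Unlike the die, the categories are not equally likely, so the tallies should cluster around the assigned probabilities rather than around equal shares.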

