## What is a Pólya Distribution?

The **Pólya distribution** (also called the *Pólya-Eggenberger distribution*), named after George Pólya, is a discrete probability distribution related to Pólya’s urn (also called the *Pólya-Eggenberger urn scheme*) [1]. It describes the number of red balls drawn in the first *n* trials of Pólya’s urn. When draws instead continue until a fixed number of red balls has appeared, the number of black balls drawn follows a negative Pólya-Eggenberger distribution [2].

The Pólya distribution has applications in fields as diverse as genetics, insurance, and modeling epidemics.

The **multivariate Pólya distribution**, sometimes called the *Dirichlet-multinomial distribution* or *Dirichlet compound multinomial distribution*, is a multivariate extension of the univariate beta-binomial distribution.

## Process and PMF for the Pólya Distribution

The distribution models a simple process: draw a ball at random from an urn containing *r* red balls and *N* − *r* black balls. Record the color of the ball, then return it to the urn together with *c* additional balls of the same color. Repeat the process for *n* draws. If *X* is the number of red balls drawn in the first *n* trials, then the random variable *X* follows a Pólya distribution.
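The urn process described above can be simulated directly. The following is a minimal sketch, not a reference implementation; the function name `simulate_polya_urn` and the example parameter values are illustrative assumptions.

```python
import random

def simulate_polya_urn(N, r, c, n, rng=None):
    """Return the number of red balls seen in n draws from a Pólya urn.

    N: initial number of balls, r: initial red balls,
    c: balls of the drawn color added after each draw.
    """
    rng = rng or random.Random()
    red, total = r, N
    red_drawn = 0
    for _ in range(n):
        # Draw uniformly at random: red with probability red / total.
        if rng.random() < red / total:
            red_drawn += 1
            red += c  # the red ball goes back with c extra red balls
        # Whatever the color, the urn grows by c balls after each draw.
        total += c
    return red_drawn

# Hypothetical example: N = 10 balls, r = 4 red, c = 2 added per draw, n = 5 draws.
rng = random.Random(0)
samples = [simulate_polya_urn(N=10, r=4, c=2, n=5, rng=rng) for _ in range(20_000)]
print(sum(samples) / len(samples))  # should be close to n * r / N = 2.0
```

The sample mean should land near *nr*/*N* because every individual draw is red with marginal probability *r*/*N*, even though the draws are not independent.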

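The probability mass function is [3]:

$$P(X = x) = \binom{n}{x} \frac{\prod_{i=0}^{x-1}(r + ic)\,\prod_{j=0}^{n-x-1}(N - r + jc)}{\prod_{k=0}^{n-1}(N + kc)}, \qquad x = 0, 1, \ldots, n,$$

where *N*, *n*, *r*, and *c* are natural numbers and an empty product equals 1.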
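The PMF can also be evaluated straight from the urn process: every one of the C(*n*, *x*) orderings containing *x* red draws has the same probability, namely a product of *x* red counts (*r*, *r* + *c*, ...) and *n* − *x* black counts (*N* − *r*, *N* − *r* + *c*, ...) divided by the product of the *n* urn sizes (*N*, *N* + *c*, ...). Here is a sketch using exact rational arithmetic; the helper names `rising_product` and `polya_pmf` and the example parameters are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def rising_product(start, step, count):
    """start * (start + step) * ... with `count` factors (1 if count == 0)."""
    result = 1
    for i in range(count):
        result *= start + i * step
    return result

def polya_pmf(x, N, n, r, c):
    """Exact P(X = x): probability of x red balls in n draws."""
    if not 0 <= x <= n:
        return Fraction(0)
    numerator = rising_product(r, c, x) * rising_product(N - r, c, n - x)
    return Fraction(comb(n, x) * numerator, rising_product(N, c, n))

# Hypothetical example: N = 10, r = 4, c = 2, n = 5.
pmf = [polya_pmf(x, N=10, n=5, r=4, c=2) for x in range(6)]
print([float(p) for p in pmf])
print(sum(pmf))  # the probabilities sum to exactly 1
```

Using `Fraction` avoids rounding error, so the normalization can be checked exactly; for large *n* a floating-point or log-space version would be faster.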

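The Pólya distribution can be approximated by a binomial distribution with parameters *n* and *p* = *r*/*N* [3]. The two coincide when *c* = 0, and the approximation improves as the urn size *N* grows relative to *nc*.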
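The quality of the binomial approximation can be checked numerically. The sketch below assumes the approximating binomial has success probability *p* = *r*/*N* and measures the largest pointwise difference between the two PMFs; the function names and parameter values are illustrative assumptions, not taken from [3].

```python
from math import comb

def polya_pmf(x, N, n, r, c):
    """P(X = x), built up one draw at a time (x reds first, then blacks)."""
    prob = float(comb(n, x))  # number of equally likely orderings
    red, black, total = r, N - r, N
    for i in range(n):
        if i < x:
            prob *= red / total
            red += c
        else:
            prob *= black / total
            black += c
        total += c
    return prob

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def max_pointwise_gap(N, n, r, c):
    """Largest |Pólya - binomial| over x = 0..n, with p = r / N (assumed)."""
    p = r / N
    return max(
        abs(polya_pmf(x, N, n, r, c) - binomial_pmf(x, n, p))
        for x in range(n + 1)
    )

# Keep the red fraction at 40% and grow the urn; n and c stay fixed.
for N in (10, 100, 1000, 10_000):
    print(N, max_pointwise_gap(N=N, n=5, r=2 * N // 5, c=2))
```

With *n* and *c* held fixed, the printed gap should shrink as *N* grows, because the *c* balls added after each draw change the urn's composition less and less.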

## References

[1] Kaiser, H. & Stefansky, W. A Polya Distribution for Teaching. The Teacher’s Corner. Retrieved November 13, 2021 from: https://www.jstor.org/stable/2682866

[2] Marshall, A. (1990). Bivariate Distributions Generated from Pólya-Eggenberger Urn Models. Journal of Multivariate Analysis 35, 48-65.

[3] Teerapabolarn, K. (2014). An Improved Binomial Distribution to Approximate the Pólya Distribution. International Journal of Pure and Applied Mathematics, Volume 93 No. 5, 629-632.