The Yule-Simon distribution (or Yule distribution) is a highly skewed discrete probability distribution named after George Udny Yule and Herbert A. Simon—winner of the 1978 Nobel Prize in economics.

Yule (1925) was the first to describe the distribution, applying it to the distribution of biological genera by number of species. Simon (1955) later rediscovered it and used it to examine city populations, income distributions, and word frequencies in publications (Mills, 2017). Although Simon suggested the name Yule distribution, it’s now more commonly called the Yule-Simon distribution (Hazewinkel, 2001). Simon described the distribution as “J-shaped, or at least highly skewed, with very long upper tails” (p. 425); that is, the probabilities fall off steeply from x = 1, while the upper tail decays slowly, like a power law rather than exponentially.

## PMF for the Yule-Simon distribution

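Several equivalent forms for the PMF exist:

$$f(x; \alpha) = \alpha \, \mathrm{B}(x, \alpha + 1) = \frac{\alpha \, \Gamma(x) \, \Gamma(\alpha + 1)}{\Gamma(x + \alpha + 1)}, \qquad x = 1, 2, 3, \ldots$$

**Where:**

x = a positive integer (x ≥ 1),

α = the shape parameter (α > 0),

Γ = the gamma function,

Β = the beta function.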
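As a rough illustration, the PMF can be evaluated numerically. The sketch below assumes the standard form α·Β(x, α + 1) with α > 0 and x = 1, 2, 3, …; the function name `yule_simon_pmf` is made up for this example, and the calculation is done on the log scale so that the gamma functions don't overflow for large x.

```python
from math import exp, lgamma, log


def yule_simon_pmf(x: int, alpha: float) -> float:
    """P(X = x) = alpha * B(x, alpha + 1) for x = 1, 2, 3, ... (assumed form)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if x < 1:
        return 0.0  # the support starts at 1
    # log B(x, alpha + 1) = lgamma(x) + lgamma(alpha + 1) - lgamma(x + alpha + 1)
    log_beta = lgamma(x) + lgamma(alpha + 1.0) - lgamma(x + alpha + 1.0)
    return exp(log(alpha) + log_beta)


# First few probabilities for an illustrative shape parameter
alpha = 2.0
for x in range(1, 6):
    print(x, yule_simon_pmf(x, alpha))
```

For α = 2 the formula simplifies to 4 / (x(x + 1)(x + 2)), so P(X = 1) = 2/3 and P(X = 2) = 1/6, which is a handy check on the code.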

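The parameter α can be estimated from data with a fixed-point algorithm (Garcia Garcia, 2011).

Unlike many common discrete distributions (such as the Poisson or binomial), the support of the Yule-Simon distribution starts at 1 rather than 0: x cannot be less than 1.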
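To give a feel for how a fixed-point estimate of α might work, here is a minimal sketch. It assumes the PMF α·Β(x, α + 1) and iterates the maximum-likelihood condition n/α = Σᵢ Σⱼ 1/(α + j), where j runs from 1 to xᵢ. It is derived from that condition and is not guaranteed to match Garcia Garcia's (2011) algorithm in every detail; the function name `fit_alpha` and the sample data are invented for illustration.

```python
def fit_alpha(data, alpha0=1.0, tol=1e-10, max_iter=1000):
    """Fixed-point iteration for the ML estimate of alpha (illustrative sketch)."""
    if any(x < 1 for x in data):
        raise ValueError("all observations must be integers >= 1")
    n = len(data)
    alpha = alpha0
    for _ in range(max_iter):
        # sum_{j=1}^{x} 1 / (alpha + j) for each observation, summed over the sample
        total = sum(
            sum(1.0 / (alpha + j) for j in range(1, x + 1)) for x in data
        )
        new_alpha = n / total
        if abs(new_alpha - alpha) < tol:
            return new_alpha
        alpha = new_alpha
    return alpha  # may not have converged (e.g. if every observation equals 1)


# Hypothetical sample, for illustration only
sample = [1, 1, 1, 2, 1, 3, 1, 1, 2, 5]
alpha_hat = fit_alpha(sample)
print(alpha_hat)
```

Note that if every observation equals 1, each update simply adds 1 to α, so the iteration never settles; the sketch stops after `max_iter` steps in that case.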

## CDF

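The CDF for the Yule-Simon distribution is:

$$F(x; \alpha) = 1 - x \, \mathrm{B}(x, \alpha + 1), \qquad x = 1, 2, 3, \ldots$$

where Β is the beta function and α > 0 is the shape parameter.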
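A quick numerical check can make the CDF more concrete. The sketch below assumes the closed form F(x) = 1 − x·Β(x, α + 1) and the PMF α·Β(x, α + 1), and compares the closed form with a running sum of PMF values; the function names are made up for this example.

```python
from math import exp, lgamma, log


def log_beta(a: float, b: float) -> float:
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def yule_simon_pmf(x: int, alpha: float) -> float:
    """P(X = x) = alpha * B(x, alpha + 1) for x >= 1 (assumed form)."""
    return exp(log(alpha) + log_beta(x, alpha + 1.0)) if x >= 1 else 0.0


def yule_simon_cdf(x: int, alpha: float) -> float:
    """P(X <= x) = 1 - x * B(x, alpha + 1) for x >= 1 (assumed form)."""
    return 1.0 - x * exp(log_beta(x, alpha + 1.0)) if x >= 1 else 0.0


# Compare the closed form with the cumulative sum of the PMF
alpha = 1.5
running = 0.0
for x in range(1, 6):
    running += yule_simon_pmf(x, alpha)
    print(x, yule_simon_cdf(x, alpha), running)
```

The two columns should agree up to floating-point rounding. For α = 2 the closed form reduces to 1 − 2 / ((x + 1)(x + 2)), giving F(1) = 2/3 and F(2) = 5/6.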

## Example: The Superstar Phenomenon

The Yule-Simon distribution is used to model a wide variety of phenomena, including the “superstar phenomenon”, where a small number of people dominate their field and earn the lion’s share of the money. The term *cumulative advantage* has also been used to describe this phenomenon. Let’s say Tom Cruise and John Doe vie for a lead role in a movie; the obvious choice would be Tom Cruise because he’s well known. For the same reason, Tom Cruise would earn far more than John Doe. He would also receive more phone calls and more offers to make paid appearances, and should he write an autobiography, he would probably get millions for it. Essentially, he has a cumulative advantage over John Doe, who would struggle to get noticed (or paid), even if his abilities were on par with Tom Cruise’s.

## Similar Distributions

- The Yule distribution is a special case of the **beta-geometric distribution**, when its parameter β = 1 (King, 2017).
- The **Waring distribution** is a generalization of the Yule distribution.
- For large x-values, the **Zipf distribution** and the Yule-Simon distribution are nearly indistinguishable: the Yule-Simon PMF falls off approximately in proportion to x^−(α + 1), which is the power-law form of the Zipf distribution. In other words, the Zipf distribution models the tail end of the Yule.
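The tail relationship with the Zipf distribution can be checked numerically. The sketch below assumes the PMF α·Β(x, α + 1) and compares it with the power law α·Γ(α + 1)·x^−(α + 1), which is the large-x approximation obtained from the gamma-function ratio Γ(x)/Γ(x + α + 1) ≈ x^−(α + 1). The function names are made up for this example.

```python
from math import exp, gamma, lgamma, log


def yule_simon_pmf(x: int, alpha: float) -> float:
    """P(X = x) = alpha * B(x, alpha + 1) for x >= 1 (assumed form)."""
    log_beta = lgamma(x) + lgamma(alpha + 1.0) - lgamma(x + alpha + 1.0)
    return exp(log(alpha) + log_beta)


def power_law_tail(x: int, alpha: float) -> float:
    """Zipf-like approximation: alpha * Gamma(alpha + 1) * x ** -(alpha + 1)."""
    return alpha * gamma(alpha + 1.0) * x ** -(alpha + 1.0)


def tail_ratio(x: int, alpha: float) -> float:
    """Ratio of the exact PMF to the power-law approximation."""
    return yule_simon_pmf(x, alpha) / power_law_tail(x, alpha)


# The ratio should move toward 1 as x grows
alpha = 2.0
for x in (10, 100, 1000, 10000):
    print(x, tail_ratio(x, alpha))
```

For small x the two differ noticeably (the ratio is well below 1), which is why the Zipf form is best read as a description of the tail only.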

## References

Garcia Garcia, J. (2011). “A fixed-point algorithm to estimate the Yule-Simon distribution parameter”. Applied Mathematics and Computation. 217 (21): 8560–8566.

Hazewinkel, M. (2001). Encyclopaedia of Mathematics, Supplement III. Springer Science & Business Media.

King, M. (2017). Statistics: A Practical Approach for Process Control Engineers. John Wiley and Sons.

Mills, T. (2017). A Statistical Biography of George Udny Yule: A Loafer of the World. Cambridge Scholars Press.

Simon, H. A. (1955). “On a class of skew distribution functions”. Biometrika. 42 (3–4): 425–440.

Yule, G. U. (1925). “A Mathematical Theory of Evolution, based on the Conclusions of Dr. J. C. Willis, F.R.S”. Philosophical Transactions of the Royal Society B. 213 (402–410): 21–87.
