
## What is the Tukey Lambda Distribution?

The Tukey lambda distribution is a family of symmetric distributions; for λ > 0 the tails are truncated (cut off), so the distribution has bounded support. The distribution is defined numerically with three parameters:

- λ, the shape parameter,
- μ, the location parameter,
- σ, the scale parameter.

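Unlike most other probability distributions, there isn’t a “one size fits all” formula for the probability density function (PDF) and cumulative distribution function (CDF). You’ll usually see it defined in terms of its quantile (percentile) function:

$$Q(p) = \begin{cases} \mu + \sigma \, \dfrac{p^{\lambda} - (1 - p)^{\lambda}}{\lambda}, & \lambda \neq 0 \\ \mu + \sigma \, \ln\!\left(\dfrac{p}{1 - p}\right), & \lambda = 0 \end{cases} \qquad 0 < p < 1$$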

The shape changes dramatically with λ: the densities for λ = 0.5 and λ = 5.0, for example, look vastly different.

So although there isn’t one general PDF for all possible values of λ, a PDF can easily be generated for specific values once you know the percentile function.
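As a rough illustration of how a PDF can be traced out from the percentile function, here is a minimal Python sketch. It assumes the standard Tukey lambda quantile form Q(p) = μ + σ(p^λ − (1 − p)^λ)/λ (with the logistic limit at λ = 0) and uses the identity f(Q(p)) = 1/Q′(p). The function names `tukey_lambda_quantile` and `tukey_lambda_pdf_curve` are hypothetical, chosen only for this example.

```python
import math

def tukey_lambda_quantile(p, lam, mu=0.0, sigma=1.0):
    """Percentile (quantile) function Q(p) for 0 < p < 1."""
    if lam == 0:
        core = math.log(p / (1.0 - p))  # logistic limit as lambda -> 0
    else:
        core = (p**lam - (1.0 - p)**lam) / lam
    return mu + sigma * core

def tukey_lambda_pdf_curve(lam, mu=0.0, sigma=1.0, n=199):
    """Return (x, density) pairs for p = 1/(n+1), ..., n/(n+1).

    Uses f(Q(p)) = 1 / Q'(p), where
    Q'(p) = sigma * (p**(lam - 1) + (1 - p)**(lam - 1)).
    """
    curve = []
    for i in range(1, n + 1):
        p = i / (n + 1)
        x = tukey_lambda_quantile(p, lam, mu, sigma)
        dq_dp = sigma * (p**(lam - 1.0) + (1.0 - p)**(lam - 1.0))
        curve.append((x, 1.0 / dq_dp))
    return curve

# Trace the density for one specific shape value and inspect the midpoint.
curve = tukey_lambda_pdf_curve(lam=0.5)
x_mid, f_mid = curve[len(curve) // 2]  # the point generated by p = 0.5
print(f"x = {x_mid:.3f}, density = {f_mid:.3f}")
```

Plotting the returned pairs for several values of λ gives one density curve per value, without ever needing a closed-form PDF.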

## GLD and the Percentile Function

The **generalized lambda distribution** is usually defined in terms of its four-parameter percentile function:

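$$Q(y) = \lambda_1 + \frac{y^{\lambda_3} - (1 - y)^{\lambda_4}}{\lambda_2}$$

**Where**:

- 0 ≤ y ≤ 1,
- λ₁ is the location parameter,
- λ₂ is the scale parameter,
- λ₃ and λ₄ are shape parameters that jointly control skewness and kurtosis.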
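A four-parameter percentile function of this kind is straightforward to evaluate and to sample from. The sketch below assumes the Ramberg-Schmeiser form Q(y) = λ₁ + (y^λ₃ − (1 − y)^λ₄)/λ₂ and draws random values by inverse-transform sampling (feeding uniform random numbers through Q). The function names and the parameter values are hypothetical and purely illustrative.

```python
import random

def gld_percentile(y, lam1, lam2, lam3, lam4):
    """Assumed Ramberg-Schmeiser percentile function for 0 <= y <= 1.

    lam1: location, lam2: scale, lam3 and lam4: shape.
    Negative shape values require 0 < y < 1 to avoid division by zero.
    """
    return lam1 + (y**lam3 - (1.0 - y)**lam4) / lam2

def gld_sample(n, lam1, lam2, lam3, lam4, seed=0):
    """Inverse-transform sampling: push uniform draws through Q(y)."""
    rng = random.Random(seed)
    return [gld_percentile(rng.random(), lam1, lam2, lam3, lam4)
            for _ in range(n)]

# Illustrative (made-up) parameters: unequal shapes give a skewed distribution.
draws = gld_sample(1000, lam1=0.0, lam2=1.0, lam3=0.5, lam4=1.0)
print(min(draws), max(draws))
```

Setting λ₁ = 0 and λ₂ = λ₃ = λ₄ = λ recovers the standard symmetric Tukey lambda percentile function, which is why the four-parameter family is described as a generalization.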

## Uses

Not surprisingly, the Tukey lambda distribution is rarely used directly for statistical modeling due to the **lack of an easily definable PDF or CDF.** Its main use is to **approximate other symmetric probability distributions** such as:

- Cauchy distribution (λ = -1)
- Normal distribution (λ = 0.14)
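- Concave, dome-shaped distribution on a bounded interval (λ = 0.5; often listed as U-shaped, although the density peaks at the center and falls to zero at the endpoints)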

It is also an **exact match** for some distributions, including:

- Logistic distribution (λ = 0)
- Uniform distribution on the interval -1 to 1 (λ = 1)
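These relationships can be checked numerically. The following sketch assumes the standard quantile function Q(p) = (p^λ − (1 − p)^λ)/λ and compares quantiles rather than densities: a correlation close to 1 between two sets of quantiles means the distributions agree up to location and scale. The helper names `tukey_q` and `pearson` are hypothetical. For the uniform case the sketch uses λ = 1, where Q(p) = 2p − 1.

```python
import math
from statistics import NormalDist

def tukey_q(p, lam):
    """Standard Tukey lambda quantile function (location 0, scale 1)."""
    if lam == 0:
        return math.log(p / (1.0 - p))
    return (p**lam - (1.0 - p)**lam) / lam

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

ps = [i / 100 for i in range(1, 100)]

# Approximation: lambda = 0.14 versus the normal distribution.
normal_q = [NormalDist().inv_cdf(p) for p in ps]
r_normal = pearson(normal_q, [tukey_q(p, 0.14) for p in ps])

# Exact match: as lambda -> 0 the quantiles converge to the logistic ones.
logistic_gap = max(abs(tukey_q(p, 1e-6) - math.log(p / (1.0 - p))) for p in ps)

# Exact match: lambda = 1 gives Q(p) = 2p - 1, the uniform quantile on [-1, 1].
uniform_gap = max(abs(tukey_q(p, 1.0) - (2.0 * p - 1.0)) for p in ps)

print(r_normal, logistic_gap, uniform_gap)
```

The correlation for λ = 0.14 should be very close to 1 but not exactly 1, which is the difference between an approximation and an exact match.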

## Origins

Tukey’s lambda distribution was first proposed by John Tukey in 1960 (although it was based on 1947 work by Hastings et al.). In the early 1970s the distribution was generalized by John Ramberg and Bruce Schmeiser for use in Monte Carlo simulations; Ramberg and colleagues went on to develop the curve-fitting properties of the distribution in the late 1970s.

## Tukey-Lambda PPCC Plot

One of the more common uses for the Tukey lambda distribution is generating PPCC plots. A Tukey-Lambda PPCC (Probability Plot Correlation Coefficient) plot is generated by software from a set of input data: it graphs the probability plot correlation coefficient against λ, and the value of λ that gives the highest correlation suggests a model for the data.
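To make the idea concrete, here is a minimal pure-Python sketch of the calculation behind a PPCC plot. It assumes the standard Tukey lambda quantile function and Filliben's approximation for the medians of uniform order statistics; the function names, the grid of λ values, and the simulated data are all hypothetical choices for this example, not the procedure of any particular software package.

```python
import math
import random

def tukey_q(p, lam):
    """Standard Tukey lambda quantile function (location 0, scale 1)."""
    if lam == 0:
        return math.log(p / (1.0 - p))
    return (p**lam - (1.0 - p)**lam) / lam

def filliben_medians(n):
    """Approximate medians of uniform order statistics (needs n >= 2)."""
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ppcc_curve(data, lambdas):
    """Return (lambda, correlation) pairs: the points of a PPCC plot."""
    ordered = sorted(data)
    medians = filliben_medians(len(ordered))
    return [(lam, pearson([tukey_q(p, lam) for p in medians], ordered))
            for lam in lambdas]

def best_lambda(data, lambdas):
    """The lambda on the grid with the highest correlation."""
    return max(ppcc_curve(data, lambdas), key=lambda pair: pair[1])[0]

# Simulated (made-up) data and a coarse grid of candidate shape values.
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(200)]
grid = [k / 20 for k in range(-20, 21)]  # -1.00, -0.95, ..., 1.00
print(best_lambda(data, grid))
```

The grid deliberately stops at λ = 1: both λ = 1 and λ = 2 give a uniform distribution, so a wider grid can produce ties. If SciPy is available, `scipy.stats.ppcc_plot` and `scipy.stats.ppcc_max` offer library versions of this calculation.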

**References**:

Tukey, J. (1960). The Practical Relationship Between the Common Transformations of Percentages of Counts and Amounts, Technical Report 36. Statistical Techniques Research Group, Princeton University.

Hastings, C. et al. (1947). Low moments for small samples: a comparative study of statistics. Annals of Mathematical Statistics, 18, 413-426.

Ramberg, J. & Schmeiser, B. (1972). An approximate method for generating symmetric random variables. Commun. ACM, 15, 987-990.

Ramberg, J. et al. (1979). A probability distribution and its uses in fitting data. Technometrics, 21(2), 201-214.
