What is an Inverse Distribution?
“Inverse distribution” is an informal term with several meanings, depending on where you’re reading about it. It means one thing if you’re working with sampling, another if you’re looking at cumulative distribution functions, and yet another if the variables in a distribution are reciprocals. According to the Oxford Dictionary of Statistical Terms, the different definitions include:
- The distribution of the reciprocal of a random variable. For example, the inverse gamma distribution is the distribution of 1/X when X follows a gamma distribution.
- Sampling up to a certain number of successes. See: Inverse sampling.
- Distributions where frequencies are reciprocal quantities (e.g. the factorial distribution).
- As a way to find variables in terms of the distribution function F(x). For example, the inverse normal distribution refers to the technique of working backwards, given F(x) to find x-values.
While all definitions are valid uses of the term “Inverse Distribution”, the term “Inverse Distribution Function” in probability and statistics usually implies the last definition, i.e. working backwards from a known probability to find the corresponding x-value.
Inverse Distribution Function
The inverse distribution function for continuous variables, $F_X^{-1}(\alpha)$, is the inverse of the cumulative distribution function (CDF). In other words, it’s simply the distribution function $F_X(x)$ inverted. The CDF shows the probability a random variable X is found at a value equal to or less than a certain x. Intuitively, it’s how much area is under the curve to the left of a certain point. The inversion of the CDF, the IDF, takes a probability s and gives a value for x such that:
$F_X(x) = \Pr(X \le x) = s,$
where s is the fraction of random draws that fall at or below x, i.e. s * 100 percent of draws (Greiner et al., 2014).
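The process sounds simple (invert the CDF), but many distributions don’t actually have simple inversions. The exponential distribution is one exception. With rate parameter λ, its CDF is $F(x) = 1 - e^{-\lambda x}$ for x ≥ 0, and the inverse is defined as:
$F^{-1}(p) = -\ln(1 - p) / \lambda, \quad 0 \le p < 1.$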
Good approximations are available for common functions like the normal and gamma distributions.
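As a rough illustration of both cases, here is a minimal Python sketch. It assumes the exponential distribution is written with a rate parameter (so the CDF is 1 - exp(-rate * x)), and it uses bisection as one simple stand-in for the approximation methods mentioned above; real libraries typically use faster, more specialized approximations. The function names are made up for this example.

```python
import math
from statistics import NormalDist


def exponential_cdf(x, rate):
    """CDF of an exponential distribution with the given rate (assumed parametrization)."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


def exponential_inverse_cdf(p, rate):
    """Closed-form inverse: solve p = 1 - exp(-rate * x) for x."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return -math.log(1.0 - p) / rate


def invert_cdf_by_bisection(cdf, p, lo, hi, tol=1e-10):
    """Numerically invert an increasing CDF on [lo, hi] by repeated halving."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < p:
            lo = mid   # target x lies to the right of mid
        else:
            hi = mid   # target x lies at or to the left of mid
    return (lo + hi) / 2.0


# Closed form: the median of an exponential with rate 2 is ln(2) / 2.
x_median = exponential_inverse_cdf(0.5, rate=2.0)
round_trip = exponential_cdf(x_median, rate=2.0)  # should recover 0.5

# No closed form: invert the standard normal CDF numerically.
z_975 = invert_cdf_by_bisection(NormalDist().cdf, 0.975, -10.0, 10.0)

print(x_median, round_trip, z_975)
```

Bisection only needs the CDF to be increasing on the search interval, which is why it works as a generic fallback when no formula for the inverse exists.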
Relationship Between CDF and Inverse Probability Function
The CDF gives you the probability of a random variable X being less than or equal to some value x. The z-table is a basic example of how this works: a score found on the table shows the probability of a random variable falling to the left of the score (the “x”).
The inverse of the CDF (i.e. the Inverse Function) tells you what value x (in this example, the z-score) would make F(x), the CDF of the normal distribution in this case, return a particular probability p. In notation, that’s:
$F^{-1}(p) = x.$
To sum that all up:
- CDF = what area/probability corresponds to a known z-score?
- Inverse Function = what z-score corresponds to a known area/probability?
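The two questions above can be tried out with Python's standard library, which provides a `NormalDist` class with both a `cdf` and an `inv_cdf` method. This is just a small sketch using the familiar z-score of 1.96 as the example value:

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1), as in a z-table.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# CDF: known z-score -> area to the left of it.
area = standard_normal.cdf(1.96)

# Inverse function: known area -> z-score.
z = standard_normal.inv_cdf(area)

print(round(area, 4))  # approximately 0.975
print(round(z, 2))     # back to 1.96
```

Feeding the output of one function into the other returns the starting value, which is what it means for the two to be inverses.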
I used the normal distribution as an example as that’s the distribution most people seem to be familiar with. However, the concept can be applied to most distributions.
Percent Point Function
The term “Percent Point Function” is usually used to denote a specific inverse function. For example:
“The χ² distribution percent point function (quantile) is used with significance level α to reject the null hypothesis” (Beierle & Dekhtyar, 2015). Percent point functions exist for a wide range of distributions including the gamma distribution, Weibull distribution, triangular distribution, and many more.
The term quantile function is a synonym for the Inverse Distribution Function or Percent Point Function. Its use is mainly restricted to software applications. For example, the SAS Quantile Function, given a specified distribution and probability, “Returns the quantile from a distribution that you specify.”
The word quantile comes from the word quantity. It refers to dividing a probability distribution into areas of equal probability.
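To see the "areas of equal probability" idea in practice, here is a short sketch using the `quantiles` method of Python's `statistics.NormalDist`. The choice of four groups (quartiles) and of the standard normal distribution is only for illustration:

```python
from statistics import NormalDist

standard_normal = NormalDist()

# Three cut points that split the distribution into four equal-probability areas.
quartile_cuts = standard_normal.quantiles(n=4)

# The area to the left of each cut point should be 0.25, 0.50 and 0.75.
areas = [standard_normal.cdf(q) for q in quartile_cuts]

for cut, area in zip(quartile_cuts, areas):
    print(f"x = {cut:+.4f}  ->  area to the left = {area:.2f}")
```

Each cut point is the quantile function evaluated at an evenly spaced probability, so the same approach extends to deciles or percentiles by changing `n`.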
Abernathy, R. and Smith, R. (1993). Inverse Distribution Function in World Heritage Encyclopedia. Retrieved December 5, 2017 from: http://self.gutenberg.org/articles/eng/Inverse_distribution_function
Engineering Statistics Handbook (n.d.). Retrieved December 5, 2017 from: http://www.itl.nist.gov/div898/handbook/eda/section3/eda362.htm
Greiner, D. et al. (2014). Advances in Evolutionary and Deterministic Methods for Design, Optimization and Control in Engineering and Sciences. Springer.
Lewis, P. & McKenzie, E. (1988). Simulation Methodology for Statisticians, Operations Analysts, and Engineers, Volume 1. CRC Press.