What is a Tweedie distribution?
The Tweedie distribution is a special case of an exponential dispersion model. It can have a cluster of data items at zero (called a “point mass”), which is particularly useful for modeling claims in the insurance industry, in medical/genomic testing, or anywhere else there is a mixture of zeros and positive data points. Basically, if you see a histogram of non-negative data with a spike at zero, it’s a possible candidate to be fitted to a Tweedie model.
It’s very similar in shape to a zero-inflated Poisson (ZIP) distribution, which also has a spike at zero [1]. The difference is that a ZIP model treats the excess zeros as a separate component. Therefore, if you have a lot of zeros (and want to tell a story about those zeros), the ZIP distribution would be a better choice.
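To see where the spike at zero comes from, here is a minimal simulation sketch using only the Python standard library. It assumes the compound Poisson/gamma case (power parameter 1 < p < 2), where a Tweedie value is the sum of a Poisson number of gamma-distributed amounts, so a count of zero gives an exact zero. The function names `sample_poisson` and `sample_tweedie` are hypothetical, and the conversion from the mean μ, scale φ, and power p to the Poisson rate and gamma shape/scale is one common parameterization, not the only one.

```python
import math
import random


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw a Poisson(lam) count with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_tweedie(mu: float, phi: float, p: float, rng: random.Random) -> float:
    """Draw one Tweedie value for 1 < p < 2 as a Poisson sum of gamma amounts."""
    if not 1 < p < 2:
        raise ValueError("this sketch only covers 1 < p < 2")
    lam = mu ** (2 - p) / (phi * (2 - p))    # expected number of events
    shape = (2 - p) / (p - 1)                # gamma shape of each amount
    scale = phi * (p - 1) * mu ** (p - 1)    # gamma scale of each amount
    n_events = sample_poisson(lam, rng)
    # With zero events the sum is empty, which produces the point mass at zero.
    return sum(rng.gammavariate(shape, scale) for _ in range(n_events))


rng = random.Random(0)
draws = [sample_tweedie(mu=2.0, phi=1.5, p=1.5, rng=rng) for _ in range(20000)]
zero_share = sum(d == 0 for d in draws) / len(draws)
# Under this parameterization, P(Y = 0) = exp(-lam).
expected_zero_share = math.exp(-(2.0 ** 0.5) / (1.5 * 0.5))
```

Comparing `zero_share` with `expected_zero_share` shows how much of the sample sits exactly at zero; the remaining draws are strictly positive and continuous.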
The Tweedie distribution is actually a family of distributions that are a subset of Exponential Dispersion Models (EDMs). EDMs are two-parameter distributions from the linear exponential family that have a scale (dispersion) parameter φ. Bent Jørgensen [2] named the distribution after Maurice Charles Kenneth (MCK) Tweedie, who put forward a framework that brings a wide range of random variables into one class in his paper titled “An index which distinguishes between some important exponential families”, published in 1984 in the proceedings of the Indian Statistical Institute’s Golden Jubilee International Conference [3].
Properties of the Tweedie distribution
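This family of distributions has the following characteristics:
- Mean: E(Y) = μ,
- Variance: Var(Y) = φμ^p, where μ^p is the variance function and φ is the scale parameter.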
The p in the variance function is an additional shape parameter for the distribution, often called the power (or index) parameter. It is sometimes written in terms of another parameter, α:
p = (α − 2) / (α − 1).
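As a rough illustration, the conversion between p and α and the Tweedie variance φμ^p (mean μ, scale parameter φ) can be sketched in Python. The function names are hypothetical and chosen only for this example.

```python
def p_from_alpha(alpha: float) -> float:
    """Power parameter p from alpha: p = (alpha - 2) / (alpha - 1)."""
    if alpha == 1:
        raise ValueError("alpha = 1 has no corresponding p")
    return (alpha - 2) / (alpha - 1)


def alpha_from_p(p: float) -> float:
    """Inverse mapping, obtained by solving for alpha: alpha = (p - 2) / (p - 1)."""
    if p == 1:
        raise ValueError("p = 1 has no corresponding finite alpha")
    return (p - 2) / (p - 1)


def tweedie_variance(mu: float, phi: float, p: float) -> float:
    """Variance of a Tweedie variable with mean mu: phi * mu**p."""
    return phi * mu ** p
```

For instance, `alpha_from_p(3.0)` gives 0.5 and `p_from_alpha(2.0)` gives 0, so the two mappings are inverses of each other away from the excluded value 1.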
Special Cases
Some familiar distributions are special cases of the Tweedie distribution:
- p = 0: Normal distribution,
- p = 1: Poisson distribution,
- 1 < p < 2: Compound Poisson/gamma distribution,
- p = 2: Gamma distribution,
- 2 < p < 3: Positive stable distributions,
- p = 3: Inverse Gaussian distribution / Wald distribution,
- p > 3: Positive stable distributions,
- p = ∞: Extreme stable distributions.
The distribution is not defined for values of “p” strictly between 0 and 1 (the endpoints p = 0 and p = 1 are the normal and Poisson cases).
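The list of special cases can be written as a small lookup, sketched here in Python. The function name `tweedie_special_case` is hypothetical, and the sketch only covers the values of p listed above, so negative p is rejected rather than classified.

```python
import math


def tweedie_special_case(p: float) -> str:
    """Name the Tweedie family member for a given power parameter p."""
    if math.isinf(p) and p > 0:
        return "extreme stable"
    if p == 0:
        return "normal"
    if 0 < p < 1:
        raise ValueError("no Tweedie distribution exists for 0 < p < 1")
    if p == 1:
        return "Poisson"
    if 1 < p < 2:
        return "compound Poisson/gamma"
    if p == 2:
        return "gamma"
    if p == 3:
        return "inverse Gaussian"
    if p > 2:
        return "positive stable"
    # Negative (or NaN) p is outside the list this sketch is based on.
    raise ValueError("p is not covered by the list of special cases")
```

Exact equality checks on p are acceptable here because the named cases sit at integer values; a fitted p such as 1.62 simply falls in the compound Poisson/gamma range.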
The probability density function for the Tweedie family is complex and, in general, cannot be expressed in a simple closed form (although it can be written as an infinite series). As the distribution coincides with other distributions for some values of “p”, you can use the pdf for those distributions. For example, if p is 3, you can use the pdf for the inverse Gaussian.
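For the p = 3 case, a minimal sketch of the density in Python might look like the following. It assumes the inverse Gaussian is parameterized by its mean μ and a shape parameter equal to 1/φ, which gives a variance of φμ³; the function name `tweedie_p3_pdf` is hypothetical.

```python
import math


def tweedie_p3_pdf(y: float, mu: float, phi: float) -> float:
    """Density of the Tweedie distribution with p = 3 (inverse Gaussian).

    Assumes mean mu and inverse Gaussian shape 1/phi, so the variance
    is phi * mu**3.
    """
    if y <= 0:
        return 0.0  # the inverse Gaussian has support on y > 0 only
    normalizer = math.sqrt(1.0 / (2.0 * math.pi * phi * y ** 3))
    exponent = -((y - mu) ** 2) / (2.0 * phi * mu ** 2 * y)
    return normalizer * math.exp(exponent)
```

A quick sanity check is to integrate the function numerically over y > 0 and confirm that the total is close to 1 and that the mean is close to μ.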
References
[1] Statistical Methods Series: Zero-Inflated GLM and GLMM. [YouTube]
[2] Jørgensen, B. (1987). “Exponential dispersion models”. Journal of the Royal Statistical Society, Series B. 49 (2): 127–162. JSTOR 2345415.
[3] Tweedie, M. C. K. (1984). An index which distinguishes between some important exponential families. In Statistics: Applications and New Directions. Proceedings of the Indian Statistical Institute Golden Jubilee International Conference. (Eds. J. K. Ghosh and J. Roy), pp. 579-604. Calcutta: Indian Statistical Institute.