Tweedie Distribution: Definition and Examples


What is a Tweedie distribution?

The Tweedie distribution is a special case of an exponential dispersion model. For some parameter values it has a cluster of data items at zero (called a “point mass”), which makes it particularly useful for modeling claims in the insurance industry, in medical/genomic testing, or anywhere else there is a mixture of zeros and positive, continuous data points. Basically, if you see a histogram with a spike at zero, it’s a possible candidate to be fitted to a Tweedie model.

Figure: The Tweedie distribution has a point mass at zero before following a “regular” exponential curve.

It’s very similar in shape to a zero-inflated Poisson (ZIP) distribution, which also has a spike at zero [1]. The ZIP model, however, treats the excess zeros as a separate component of the model. Therefore, if you have a lot of zeros (and want to tell a story about those zeros), the ZIP distribution would be a better choice.

The Tweedie distribution is actually a family of distributions that are a subset of Exponential Dispersion Models (EDMs). EDMs are two-parameter distributions from the linear exponential family that have a dispersion (scale) parameter φ. Bent Jørgensen [2] named the distribution after Maurice Charles Kenneth (MCK) Tweedie, who put forward a framework that encompasses a wide range of random variables in one class. Tweedie’s paper, titled “An index which distinguishes between some important exponential families”, appeared in the proceedings of the Indian Statistical Institute’s Golden Jubilee Conference in 1984 [3].

Properties of the Tweedie distribution

This family of distributions has the following characteristics:

A mean of E(Y) = μ.
A variance of Var(Y) = φμ^p, where φ is the dispersion parameter.

The p in the variance function is an additional shape parameter for the distribution, often called the power or index parameter. It is sometimes written in terms of another parameter, α:
p = (α – 2) / (α – 1).
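
As a small illustration of the formulas above, the following Python sketch evaluates the usual Tweedie variance function, Var(Y) = φμ^p, and converts between p and α. The function names are hypothetical and chosen only for this example. Note that the relation p = (α – 2) / (α – 1) inverts to the same form, α = (p – 2) / (p – 1).

```python
def tweedie_variance(mu, phi, p):
    """Variance implied by the Tweedie variance function, phi * mu**p."""
    return phi * mu ** p


def alpha_to_p(alpha):
    """Convert alpha to the power parameter p = (alpha - 2) / (alpha - 1)."""
    if alpha == 1:
        raise ValueError("alpha = 1 has no corresponding p")
    return (alpha - 2) / (alpha - 1)


def p_to_alpha(p):
    """Invert the relation above: alpha = (p - 2) / (p - 1)."""
    if p == 1:
        raise ValueError("p = 1 has no finite alpha")
    return (p - 2) / (p - 1)


# Illustrative values only: mean 2.0, dispersion 1.5.
for p in (1.5, 2, 3):
    print(p, tweedie_variance(2.0, 1.5, p), p_to_alpha(p))
```

For a fixed mean above 1, a larger p gives a larger variance, which is one way to see why p acts as a shape parameter.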

Special Cases

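Some familiar distributions are special cases of the Tweedie distribution:

p = 0: Normal distribution.
p = 1: Poisson distribution.
1 < p < 2: Compound Poisson-gamma distribution (the case with a point mass at zero).
p = 2: Gamma distribution.
p = 3: Inverse Gaussian distribution.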

The distribution is not defined for values of “p” strictly between 0 and 1 (0 < p < 1).
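
The point mass at zero can be checked by simulation. For 1 < p < 2, a Tweedie variable can be written as a sum of a Poisson number of gamma-distributed amounts (a compound Poisson-gamma), so a Poisson count of zero gives an exact zero. The sketch below assumes NumPy is available and uses the standard mapping from (μ, φ, p) to the Poisson rate and the gamma shape and scale. The function names and parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np


def tweedie_cpg_params(mu, phi, p):
    """Map (mu, phi, p), with 1 < p < 2, to compound Poisson-gamma parameters."""
    if not 1 < p < 2:
        raise ValueError("this representation requires 1 < p < 2")
    lam = mu ** (2 - p) / (phi * (2 - p))   # Poisson rate
    shape = (2 - p) / (p - 1)               # gamma shape of each amount
    scale = phi * (p - 1) * mu ** (p - 1)   # gamma scale of each amount
    return lam, shape, scale


def sample_tweedie(mu, phi, p, size, rng):
    """Draw Tweedie samples (1 < p < 2) as Poisson sums of gamma amounts."""
    lam, shape, scale = tweedie_cpg_params(mu, phi, p)
    counts = rng.poisson(lam, size=size)
    y = np.zeros(size)
    positive = counts > 0
    # A sum of n iid Gamma(shape, scale) draws is Gamma(n * shape, scale).
    y[positive] = rng.gamma(shape * counts[positive], scale)
    return y


rng = np.random.default_rng(0)
mu, phi, p = 2.0, 1.5, 1.5  # illustrative values only
y = sample_tweedie(mu, phi, p, 200_000, rng)
lam, _, _ = tweedie_cpg_params(mu, phi, p)

print("share of exact zeros:", np.mean(y == 0), "theory:", np.exp(-lam))
print("sample mean:", y.mean(), "theory:", mu)
print("sample variance:", y.var(), "theory:", phi * mu ** p)
```

The share of exact zeros should be close to exp(–λ), and the sample mean and variance should be close to μ and φμ^p.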

The probability density function for the Tweedie family is complex and, in general, cannot be expressed in a simple closed form (although it is sometimes expressed as an infinite series). Because the distribution reduces to other, familiar distributions for some values of “p”, you can use the pdf for those distributions instead. For example, if p is 3, you can use the pdf for the inverse Gaussian.
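
As a sketch of this idea, the Python function below returns the closed-form density for three special cases, written in terms of the mean μ and dispersion φ: normal (p = 0), gamma (p = 2) and inverse Gaussian (p = 3). The function name is hypothetical, and the parameterizations are chosen here so that each density has mean μ and variance φμ^p. Other values of p are deliberately not handled.

```python
import math


def tweedie_special_pdf(y, mu, phi, p):
    """Closed-form density for the special cases p = 0, 2 and 3 only."""
    if p == 0:
        # Normal with mean mu and variance phi.
        return math.exp(-(y - mu) ** 2 / (2 * phi)) / math.sqrt(2 * math.pi * phi)
    if y <= 0:
        # The gamma and inverse Gaussian cases live on positive values.
        return 0.0
    if p == 2:
        # Gamma with shape 1/phi and scale phi*mu (mean mu, variance phi*mu**2).
        k, theta = 1 / phi, phi * mu
        log_pdf = (k - 1) * math.log(y) - y / theta - math.lgamma(k) - k * math.log(theta)
        return math.exp(log_pdf)
    if p == 3:
        # Inverse Gaussian with mean mu and variance phi*mu**3.
        exponent = -(y - mu) ** 2 / (2 * phi * mu ** 2 * y)
        return math.exp(exponent) / math.sqrt(2 * math.pi * phi * y ** 3)
    raise ValueError("no simple closed form implemented for this p")


# Illustrative evaluation at y = 1 with mu = 2 and phi = 0.5.
for p in (0, 2, 3):
    print(p, tweedie_special_pdf(1.0, 2.0, 0.5, p))
```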

References

[1] Statistical Methods Series: Zero-Inflated GLM and GLMM. [YouTube]

[2] Jørgensen, B. (1987). “Exponential dispersion models”. Journal of the Royal Statistical Society, Series B, 49 (2): 127–162. JSTOR 2345415.

[3] Tweedie, M. C. K. (1984). An index which distinguishes between some important exponential families. In Statistics: Applications and New Directions. Proceedings of the Indian Statistical Institute Golden Jubilee International Conference. (Eds. J. K. Ghosh and J. Roy), pp. 579-604. Calcutta: Indian Statistical Institute.

