Hurdle Distribution


A hurdle distribution (also called a zero-altered distribution) is a two-part mixture distribution that accounts for excess zeros in data, such as those that arise when recording rare phenomena. The name refers to the “hurdle” that must be cleared before a non-zero count is observed: one part of the model decides whether the count is zero or positive, and a second part governs the positive counts.


“[The hurdle distribution] provides a natural means for modeling overdispersion and underdispersion of the data”

Mullahy, 1986, p. 54 [1]

The hurdle distribution was first proposed by Cragg in 1971 [2]. Since then, it has gained popularity and is commonly used in epidemiology, genetics, insurance claims, marketing and medicine.

Hurdle distribution duality

The number of events in a hurdle distribution results from two distributions [3]:

  • A binomial distribution (a single binary outcome) that determines whether a zero or a non-zero count is observed. A value of zero can only come from this portion of the model.
  • A zero-truncated Poisson or zero-truncated negative binomial distribution that determines the non-zero counts (1, 2, 3, …).
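
The two parts above can be combined into a single probability mass function. The following is a minimal sketch, assuming a zero-truncated Poisson for the positive counts; the function name `hurdle_poisson_pmf` and the parameter names `p_zero` (probability of a zero from the binary part) and `lam` (rate of the untruncated Poisson) are illustrative and do not come from the cited sources.

```python
import math

def hurdle_poisson_pmf(k, p_zero, lam):
    """P(Y = k) for a hurdle model with a zero-truncated Poisson count part.

    p_zero and lam are hypothetical parameter names chosen for this sketch.
    """
    if k < 0:
        return 0.0
    if k == 0:
        # Zeros come only from the binary part of the model.
        return p_zero
    # Ordinary Poisson probability of observing k events.
    poisson_k = math.exp(-lam) * lam**k / math.factorial(k)
    # Divide by P(Poisson > 0) so the positive counts sum to one.
    truncated_k = poisson_k / (1.0 - math.exp(-lam))
    # Weight by the probability of clearing the hurdle.
    return (1.0 - p_zero) * truncated_k

# Sanity check: the probabilities should sum to (approximately) one.
total = sum(hurdle_poisson_pmf(k, p_zero=0.6, lam=2.0) for k in range(60))
```

Because the count part is rescaled by 1 − exp(−lam), the probability of a zero is exactly `p_zero` no matter what value `lam` takes.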

Another way to model data with excess zeros is with zero-inflated models such as the zero-inflated Poisson (ZIP) distribution; negative binomial variants exist for both zero-inflated and hurdle models [4]. The two approaches differ in how zeros can arise: in zero-inflated models, zeros can come either from the extra point mass at zero or as an ordinary outcome of the counting variable; in hurdle models, the counting variable is truncated at zero, so zeros can only come from the binary part of the model [5].
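
The difference in where zeros come from can be seen by comparing the probability of a zero under each model. This is a rough sketch assuming Poisson counts; the function names and the parameters `pi` (extra point mass at zero in the ZIP model), `p_zero` (probability of a zero in the hurdle model) and `lam` (Poisson rate) are hypothetical names used only for illustration.

```python
import math

def zip_zero_prob(pi, lam):
    """P(Y = 0) under a zero-inflated Poisson (ZIP) model.

    A zero is either an extra zero (probability pi) or an ordinary
    Poisson zero (probability exp(-lam)) from the counting variable.
    """
    return pi + (1.0 - pi) * math.exp(-lam)

def hurdle_zero_prob(p_zero):
    """P(Y = 0) under a hurdle model.

    The count part is truncated at zero, so it contributes no zeros;
    every zero comes from the binary part.
    """
    return p_zero

zip_zeros = zip_zero_prob(pi=0.3, lam=2.0)
hurdle_zeros = hurdle_zero_prob(p_zero=0.3)
```

With the same mixing value of 0.3, the ZIP model produces more zeros than the hurdle model, because the Poisson part adds its own zeros on top of the point mass.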

References

  1. Mullahy, J. (1986). Specification and testing of some modified count data models. Journal of Econometrics, 33 (3), 341–365.
  2. Cragg, J. G. (1971). Some statistical models for limited dependent variables with application to the demand for durable goods. Econometrica, 39, 829–844.
  3. Martin, P. (2022). Regression Models for Categorical and Count Data. SAGE Publications.
  4. Min, Y., and Agresti, A. (2005). Random effect models for repeated measures of zero-inflated count data. Statistical Modelling, 5 (1), 1–19.
  5. Zuniga, F. (2021). A New Trivariate Model and Generalized Linear Model for Stochastic Episodes’ Duration, Magnitude and Maximum. Dissertation.

 

