Hurdle distribution


A hurdle distribution (also called a zero-altered distribution) is a two-part mixture distribution that accounts for excess zeros in data. It is called a hurdle distribution because an observation must overcome the “hurdle” of zero before a positive count is recorded; excess zeros are common, for example, when recording rare phenomena.

“[The hurdle distribution] provides a natural means for modeling overdispersion and underdispersion of the data”

Mullahy, 1986, p. 54 [1]

The hurdle distribution was first proposed by Cragg in 1971 [2]. Since then, it has gained popularity and is commonly used in epidemiology, genetics, insurance claims modeling, marketing, and medicine.

Hurdle distribution duality

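The number of events in a hurdle distribution is the result of two distributions [3]: a binary distribution that determines whether the count is zero or positive, and a count distribution truncated at zero that governs the positive counts.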

[Figure: Two-part structure of a hurdle model.]
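
As a sketch of this two-part structure, assume the zero part assigns probability π to a zero count and the positive part follows a base count distribution with probability mass function f (for example, Poisson) truncated at zero. The symbols π and f are notation introduced here for illustration and are not taken from the cited sources. Under those assumptions, the hurdle probability mass function can be written as:

```latex
P(Y = y) =
\begin{cases}
\pi, & y = 0,\\[4pt]
(1 - \pi)\,\dfrac{f(y)}{1 - f(0)}, & y = 1, 2, \dots
\end{cases}
```

Dividing by 1 - f(0) renormalizes f over the positive integers, so the probabilities sum to one.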

Another way to model data with excess zeros is to use zero-inflated models, such as the zero-inflated Poisson (ZIP) distribution; negative binomial variants of both zero-inflated and hurdle models also exist [4]. The two approaches differ in how zeros can arise: in zero-inflated models, zeros can come either from the excess-zero process or as an ordinary outcome of the counting variable; in hurdle models, the counting variable is truncated at zero, so every zero comes from the binary hurdle part [5].
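
The difference in how zeros arise can be illustrated with a minimal Python sketch. It assumes a Poisson counting variable in both models; the function names (`poisson_pmf`, `zip_pmf`, `hurdle_poisson_pmf`) and parameter names (`lam`, `pi`, `p_zero`) are hypothetical and chosen only for this example.

```python
import math


def poisson_pmf(k, lam):
    """Ordinary Poisson probability P(X = k) with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson (assumed form): with probability pi the
    outcome is a structural zero; otherwise it is an ordinary Poisson
    draw, which can itself be zero."""
    base = (1 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def hurdle_poisson_pmf(k, lam, p_zero):
    """Hurdle Poisson (assumed form): zero occurs with probability
    p_zero; positive counts follow a Poisson truncated at zero."""
    if k == 0:
        return p_zero
    truncated = poisson_pmf(k, lam) / (1 - poisson_pmf(0, lam))
    return (1 - p_zero) * truncated


if __name__ == "__main__":
    lam, pi = 2.0, 0.3
    # ZIP: zeros come from two sources, so P(0) exceeds pi.
    print("ZIP    P(Y=0):", zip_pmf(0, lam, pi))
    # Hurdle: zeros come only from the binary part, so P(0) equals p_zero.
    print("Hurdle P(Y=0):", hurdle_poisson_pmf(0, lam, pi))
```

With the same mixing parameter, the zero-inflated model puts more mass on zero than the hurdle model, because the Poisson part contributes additional zeros; in the hurdle model the zero probability is set entirely by `p_zero`.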

References

[1] Mullahy, J. (1986). Specification and testing of some modified count data models. Journal of Econometrics, 33(3), 341–365.

[2] Cragg, J. G. (1971). Some statistical models for limited dependent variables with application to the demand for durable goods. Econometrica, 39, 829–844.

[3] Martin, P. (2022). Regression Models for Categorical and Count Data. SAGE Publications.

[4] Min, Y., and Agresti, A. (2005). Random effect models for repeated measures of zero-inflated count data. Statistical Modelling, 5(1), 1–19.

[5] Zuniga, F. (2021). A New Trivariate Model and Generalized Linear Model for Stochastic Episodes’ Duration, Magnitude and Maximum. Dissertation.
