In statistics, probability distributions are used as models for random variables. These distributions describe the behavior of the variable, and they are critical in predicting future values. In this article, we will explore what heavy tailed distributions are, how they compare with light tailed distributions, and delve into their characteristics and why they matter.
What are heavy tailed and light tailed distributions?
A heavy tailed distribution has a tail that extends farther out than what is expected with a normal distribution. It can also be described as a distribution with heavier tails than the exponential distribution. Similarly, a light tailed distribution has lighter tails than the exponential.
Various authors define heavy tailed distributions differently. For example, Rolski et al. define a heavy tailed distribution as the distribution of a random variable X with distribution function F whose moment generating function M_X(t) is infinite for all t > 0. This means that the right tail of the distribution decays more slowly than any exponential function.
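In symbols, the moment generating function condition above can be sketched as follows. The tail function notation \bar{F}(x) = 1 - F(x) is an assumption introduced here for illustration, and the equivalent limit form is the standard characterization of a heavy right tail rather than a quotation from Rolski et al.

```latex
M_X(t) = \mathbb{E}\left[e^{tX}\right] = \int_{-\infty}^{\infty} e^{tx}\,dF(x) = \infty
\quad \text{for all } t > 0,
\qquad \text{equivalently} \qquad
\limsup_{x \to \infty} e^{tx}\,\bar{F}(x) = \infty
\quad \text{for all } t > 0 .
```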
A tail is “heavy” because the probability of extreme values is larger than in other types of distributions. The distribution has a more extended tail, and high-value outliers account for a noticeable share of the area under the probability density function (PDF). Heavy tailed distributions usually occur when dealing with complex systems, real-world networks, and economic data. On the other hand, light tailed distributions, which have less mass in the tail, often don’t reflect “real world” data very well. For example, outside of the classroom, the light tailed normal distribution has fewer practical applications than its popularity suggests.
One notable feature of many heavy tailed distributions, namely those with power-law tails, is scale invariance: the shape of the tail stays the same regardless of the scale used to measure the data. Under this “power-law” behavior, tail probabilities shrink polynomially rather than exponentially, so extremely large values can occur, albeit with low probabilities. Heavy tailed distributions also appear frequently in data from chaotic or complex systems.
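Scale invariance can be checked numerically. The sketch below assumes a Pareto distribution with shape 2 as the power-law example and an exponential with rate 1 as the contrast; both choices, and the helper name `tail_ratio`, are illustrative. For a pure power law, the ratio P(X > 2x) / P(X > x) does not depend on x, whereas for the exponential it collapses as x grows.

```python
import math

def pareto_sf(x, alpha=2.0, x_min=1.0):
    """P(X > x) for a Pareto distribution (a pure power-law tail)."""
    return 1.0 if x < x_min else (x_min / x) ** alpha

def exponential_sf(x, rate=1.0):
    """P(X > x) for an exponential distribution."""
    return math.exp(-rate * x)

def tail_ratio(sf, x, factor=2.0):
    """P(X > factor * x) / P(X > x): how much the tail shrinks when x is rescaled."""
    return sf(factor * x) / sf(x)

for x in (2.0, 20.0, 200.0):
    print(f"x={x:>5.0f}  Pareto ratio={tail_ratio(pareto_sf, x):.4f}  "
          f"exponential ratio={tail_ratio(exponential_sf, x):.3e}")
```

The Pareto ratio stays at 2 to the power of minus alpha at every scale, which is what "the shape remains the same regardless of the scale" means in practice.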
Why Heavy tailed Distributions Matter
Heavy tailed distributions provide a different way to approach data than traditional Gaussian/normal methods. If we use only the mean and standard deviation to describe a distribution, extreme values may be ignored or dismissed, which can oversimplify the structure of the data. Understanding heavy tailed distributions helps in modeling and predicting extreme values that would otherwise be treated as outliers and that traditional statistical methods cannot handle.
A couple of quirks with heavy tailed distributions to be aware of:
- The classical Central Limit Theorem doesn’t apply when the variance is infinite.
- Some moments don’t exist, so use order statistics (such as the median or other quantiles) instead.
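Both quirks can be seen in a small simulation. The sketch below assumes the standard Cauchy distribution as the heavy tailed example (its mean and variance do not exist) and uses the interquartile range, an order statistic, to measure the spread of sample means. The helper names, sample sizes, and seed are arbitrary choices made for illustration.

```python
import math
import random
import statistics

def cauchy_draw(rng):
    """Standard Cauchy draw via the inverse CDF: tan(pi * (U - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))

def normal_draw(rng):
    """Standard normal draw, used as the light tailed contrast."""
    return rng.gauss(0.0, 1.0)

def iqr(values):
    """Interquartile range, a spread measure that needs no moments."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

def iqr_of_sample_means(draw, sample_size, replications, rng):
    """Spread of the sample mean across many independent samples."""
    means = [
        statistics.fmean([draw(rng) for _ in range(sample_size)])
        for _ in range(replications)
    ]
    return iqr(means)

rng = random.Random(0)  # fixed seed so the run is repeatable
results = {}
for n in (10, 200):
    results[n] = {
        "normal": iqr_of_sample_means(normal_draw, n, 1000, rng),
        "cauchy": iqr_of_sample_means(cauchy_draw, n, 1000, rng),
    }
    print(f"n={n:>3}  IQR of normal means={results[n]['normal']:.3f}  "
          f"IQR of Cauchy means={results[n]['cauchy']:.3f}")
```

For the normal distribution the spread of the sample mean shrinks as the sample grows, as the Central Limit Theorem predicts. For the Cauchy distribution it does not shrink, because the mean of n standard Cauchy draws is itself standard Cauchy.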
Examples of Heavy tailed distributions
One common example of a heavy tailed distribution is the Pareto distribution. In economics, the Pareto principle states that a small percentage of the population holds the majority of the wealth. The Pareto distribution describes this phenomenon, showing that very high values are possible in such a system; therefore, outliers must be flagged and dealt with correctly. Another example is the power-law relationship that occurs in network science. There are many more examples of heavy tailed distributions, each with its own applications and implications.
Many distributions are heavy tailed, including:
- Burr distribution
- Cauchy distribution
- Fréchet distribution
- Lévy distribution
- Log-Cauchy distribution
- Log-gamma distribution
- Log-logistic distribution
- Lognormal distribution
- Pareto distribution
- q-Gaussian distribution
- Student’s t distribution
- Weibull distribution (with shape parameter greater than 0 but less than 1)
- Zipf distribution
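As an illustration of the Pareto principle described before this list, the sketch below computes how much of the total a top fraction of the population holds under a Pareto distribution. It relies on the closed-form result that, for shape parameter alpha > 1, the top fraction p holds a share p^(1 - 1/alpha) of the total. The function name and the example shape values are assumptions chosen for illustration.

```python
import math

def pareto_top_share(top_fraction, alpha):
    """Share of the total held by the top `top_fraction` of a Pareto(alpha) population."""
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    if alpha <= 1.0:
        raise ValueError("the mean is infinite for alpha <= 1, so shares are undefined")
    return top_fraction ** (1.0 - 1.0 / alpha)

# Shape parameter that reproduces the classic "80/20" split (about 1.16).
alpha_80_20 = math.log(5) / math.log(4)

for alpha in (alpha_80_20, 2.0, 3.0):
    share = pareto_top_share(0.2, alpha)
    print(f"alpha={alpha:.3f}: top 20% hold {share:.1%} of the total")
```

The smaller the shape parameter, the heavier the tail and the more concentrated the total becomes in the largest values.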
Real life data is often heavy tailed. For example:
- The top 0.1% of the population in the USA owns as much as the bottom 90%.
- File sizes in computers tend to be small, with a few large files thrown into the mix.
- Web page sizes and computer systems’ workloads tend to be heavy tailed.
- Insurance payouts and financial returns follow a heavy tail pattern.
Subclasses of heavy tailed distributions include:
- “Fat tail distribution”: A heavy tailed distribution with infinite variance. The two terms (fat and heavy) are sometimes used interchangeably, especially in finance and trading.
- “Regularly varying”: The tail behaves like a power law multiplied by a slowly varying function, so it may deviate from a pure power law.
- “Subexponential”: A distribution where the sample’s largest value makes a huge contribution to the overall sum.
- “Long-tailed distribution”: A heavy tailed distribution in which, for any fixed t > 0, the conditional probability P(X > x + t | X > x) approaches 1 as x grows.
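The subexponential idea, that one very large value can dominate the sum, can be illustrated with a rough simulation. This is an informal finite-sample sketch, not the formal definition: it assumes a Pareto distribution with shape 0.5 (so heavy that even the mean is infinite) against an exponential with rate 1, and the helper names, sample sizes, and seed are arbitrary.

```python
import random
import statistics

def max_share(draws):
    """Fraction of the sample total contributed by the single largest value."""
    return max(draws) / sum(draws)

def median_max_share(draw, sample_size, replications):
    """Median of max_share over many independent samples."""
    shares = [
        max_share([draw() for _ in range(sample_size)])
        for _ in range(replications)
    ]
    return statistics.median(shares)

rng = random.Random(1)  # fixed seed so the run is repeatable

pareto_share = median_max_share(lambda: rng.paretovariate(0.5), 1000, 200)
expo_share = median_max_share(lambda: rng.expovariate(1.0), 1000, 200)

print(f"Pareto(0.5): largest value is typically {pareto_share:.1%} of the sum")
print(f"Exponential: largest value is typically {expo_share:.1%} of the sum")
```

In the exponential samples the largest value is a negligible part of the total, while in the Pareto samples a single draw routinely accounts for a large fraction of it.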
Gamma distribution PDF image: Cburnett, CC BY-SA 3.0 http://creativecommons.org/licenses/by-sa/3.0/, via Wikimedia Commons
Bryson, M. (1974). Heavy Tailed Distributions: Properties and Tests. Technometrics 16(1):61-68 (February 1974).
Rolski, Schmidli, Schmidt, Teugels. Stochastic Processes for Insurance and Finance, 1999.
S. Foss, D. Korshunov, S. Zachary. An Introduction to Heavy Tailed and Subexponential Distributions. Springer Science & Business Media, 21 May 2013.
Nair, J. et al. (2013). The fundamentals of heavy tails (PPT). Retrieved April 23, 2023 from: http://users.cms.caltech.edu/~adamw/papers/2013-SIGMETRICS-heavytails.pdf
Monaghan, A. US wealth inequality – top 0.1% worth as much as the bottom 90%.
Gong et al. On the Tails of Web File Size Distributions.
Psounis et al. Systems with multiple servers under heavy tailed workloads. Performance Evaluation 62 (2005) 456–474. Elsevier.
Wolfram. Heavy Tail Distributions. Retrieved April 19, 2023 from: https://reference.wolfram.com/language/guide/HeavyTailDistributions.html
Mikosch, T. (1999). Regular Variation, Subexponentiality, and Their Applications in Probability Theory. Retrieved April 19, 2023 from: https://www.eurandom.tue.nl/reports/1999/013-report.pdf