Tukey lambda distribution


The Tukey lambda distribution, also called the symmetric lambda distribution, is a family of symmetric distributions indexed by a shape parameter λ; when λ is positive, the tails are truncated. It is typically used to help identify an appropriate distribution rather than directly in statistical modeling, because it has no general closed form for the probability density function (PDF) or cumulative distribution function (CDF). However, there are some useful special cases, including [1]: λ = −1 (approximately Cauchy), λ = 0 (exactly logistic), λ = 0.14 (approximately normal), and λ = 1 (exactly uniform on [−1, 1]).

A common way to use the Tukey lambda distribution is the Tukey lambda PPCC plot (probability plot correlation coefficient plot). For a range of λ values, software computes how strongly the ordered data correlate with the corresponding Tukey lambda quantiles; the λ with the highest correlation suggests which model to consider.
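The PPCC idea can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, the plotting positions (i − 0.5)/n are an assumed convention (other conventions exist), and only the standard library is used.

```python
import math


def tukey_lambda_quantile(p, lam):
    """Quantile of the standard Tukey lambda distribution for 0 < p < 1."""
    if lam == 0:
        return math.log(p / (1.0 - p))  # logistic limit as lam -> 0
    return (p ** lam - (1.0 - p) ** lam) / lam


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ppcc(data, lam):
    """Correlation between the sorted data and Tukey lambda quantiles."""
    ordered = sorted(data)
    n = len(ordered)
    # Assumed plotting positions (i - 0.5) / n for i = 1..n.
    theoretical = [tukey_lambda_quantile((i - 0.5) / n, lam)
                   for i in range(1, n + 1)]
    return pearson_r(ordered, theoretical)


def best_lambda(data, lambdas):
    """Return the candidate lambda with the highest PPCC value."""
    return max(lambdas, key=lambda lam: ppcc(data, lam))
```

Calling `best_lambda(data, [-1.0, -0.5, 0.0, 0.14, 0.5, 1.0])` returns the grid value whose quantiles line up best with the data. Plotting `ppcc(data, lam)` against `lam` over a finer grid gives the PPCC plot itself.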

Properties of the Tukey lambda distribution

The Tukey lambda distribution is defined numerically with three parameters: a shape parameter λ, a location parameter, and a scale parameter.

Several Tukey lambda PDFs, with different lambdas [1].

Most probability distributions have a formula for the probability density function (PDF) and cumulative distribution function (CDF) that fits all shapes of the distribution, but this isn’t the case for the Tukey lambda distribution. In general, neither its PDF nor its CDF is known in closed form, but the CDF’s inverse, the quantile function, is known [2, 3]. Thus, you’ll usually find the distribution described in terms of quantiles:

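$$
Q(p; \lambda) =
\begin{cases}
\dfrac{p^{\lambda} - (1 - p)^{\lambda}}{\lambda}, & \lambda \neq 0, \\[6pt]
\ln\!\left(\dfrac{p}{1 - p}\right), & \lambda = 0,
\end{cases}
\qquad 0 < p < 1.
$$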
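Because the distribution is specified through its quantile function, random values can be generated by applying that function to uniform random numbers (inverse transform sampling). The sketch below assumes the standard form with location 0 and scale 1; the function names are illustrative and not taken from any library.

```python
import math
import random


def tukey_lambda_quantile(p, lam):
    """Quantile function Q(p; lam) of the standard Tukey lambda distribution."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if lam == 0:
        # Limiting case as lam -> 0: the logistic quantile (logit) function.
        return math.log(p / (1.0 - p))
    return (p ** lam - (1.0 - p) ** lam) / lam


def sample_tukey_lambda(n, lam, seed=None):
    """Draw n values by applying the quantile function to uniform draws."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        u = rng.random()
        if u > 0.0:  # random() may return exactly 0.0, where Q is undefined
            samples.append(tukey_lambda_quantile(u, lam))
    return samples
```

For example, `sample_tukey_lambda(1000, 0.14, seed=1)` gives values whose histogram looks roughly bell-shaped, while `lam=1` gives values spread evenly over the interval from −1 to 1.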

The quantile function is not always analytically invertible; a closed-form inverse exists only for the following values of λ [4]: −1, 0, 1/4, 1/3, 1/2, 1, 3/2, 2, 3, 4. For other values of λ, you must use numerical inversion to get a CDF [5].
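Numerical inversion can be as simple as a root search on the quantile function, which is strictly increasing in p. The following sketch uses bisection; it is one possible approach under the assumption of the standard (location 0, scale 1) form, not the method used in [5], and the function names are illustrative.

```python
import math


def tukey_lambda_quantile(p, lam):
    """Quantile of the standard Tukey lambda distribution for 0 < p < 1."""
    if lam == 0:
        return math.log(p / (1.0 - p))
    return (p ** lam - (1.0 - p) ** lam) / lam


def tukey_lambda_cdf(x, lam, tol=1e-12):
    """Approximate F(x) by solving Q(p; lam) = x for p with bisection."""
    if lam > 0:
        # For lam > 0 the support is the bounded interval [-1/lam, 1/lam].
        if x <= -1.0 / lam:
            return 0.0
        if x >= 1.0 / lam:
            return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Q is increasing in p, so Q(mid) < x means the root lies above mid.
        if tukey_lambda_quantile(mid, lam) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

As a sanity check, `tukey_lambda_cdf(x, 0)` should agree with the logistic CDF 1 / (1 + e^(−x)), and `tukey_lambda_cdf(0.5, 1)` should be close to 0.75, the uniform CDF on [−1, 1]. Strongly negative λ values can overflow in floating point near the ends of the search interval, so a production version would need extra care there.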

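The generalized lambda distribution, which extends the Tukey lambda family to asymmetric shapes, is often defined in terms of its percentile function, Q(p) = λ1 + [p^λ3 − (1 − p)^λ4] / λ2, with four parameters: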

  • λ1, the location parameter,
  • λ2, the scale parameter,
  • λ3, skewness,
  • λ4, kurtosis.

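If you know the percentile function Q(p), you can generate a PDF for specific values of λ: the density at the point x = Q(p) equals 1/Q′(p), so stepping p through (0, 1) traces out the curve.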
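The sketch below traces a PDF from the percentile function in this way. It assumes the Ramberg and Schmeiser form Q(p) = λ1 + [p^λ3 − (1 − p)^λ4] / λ2, which is an assumption on my part about the intended parameterization, and it relies on the identity f(Q(p)) = 1/Q′(p). Function names are illustrative.

```python
def gld_quantile(p, lam1, lam2, lam3, lam4):
    """Percentile function in the assumed Ramberg-Schmeiser form."""
    return lam1 + (p ** lam3 - (1.0 - p) ** lam4) / lam2


def gld_density_at_quantile(p, lam2, lam3, lam4):
    """Density f(x) at x = Q(p), computed as 1 / Q'(p)."""
    dq_dp = (lam3 * p ** (lam3 - 1.0)
             + lam4 * (1.0 - p) ** (lam4 - 1.0)) / lam2
    return 1.0 / dq_dp


def gld_pdf_curve(lam1, lam2, lam3, lam4, n_points=99):
    """Return (x, f(x)) pairs traced parametrically over a grid of p."""
    curve = []
    for i in range(1, n_points + 1):
        p = i / (n_points + 1)  # interior points only, so 0 < p < 1
        x = gld_quantile(p, lam1, lam2, lam3, lam4)
        fx = gld_density_at_quantile(p, lam2, lam3, lam4)
        curve.append((x, fx))
    return curve
```

With λ1 = 0, λ2 = 1, and λ3 = λ4 = 1 the percentile function reduces to 2p − 1, so every point on the curve has density 0.5, which is the uniform density on [−1, 1]. Plotting the pairs from `gld_pdf_curve` for other parameter values gives curves like those in the figure of Tukey lambda PDFs.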

History of the Tukey lambda distribution

John Tukey first proposed the lambda distribution in 1960 [6], building on work by Hastings et al. in 1947 [7]. In the early 1970s, John Ramberg and Bruce Schmeiser [8] generalized the distribution for use in Monte Carlo simulations. In the late 1970s, Ramberg and colleagues developed the curve-fitting properties of the distribution [9]. Once the curve is fit, you can then model the residuals. Curve-fitting algorithms you can use include gradient descent, Gauss-Newton, and the Levenberg-Marquardt algorithm.


References

[1] IkamusumeFan, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

[2] Ramberg, J. and Schmeiser, B. (1972) An Approximate Method for Generating Symmetric Random Variables. Communications of the ACM, 15, 987-990. https://doi.org/10.1145/355606.361888

[3] Ramberg, J., et al. (1979) A Probability Distribution and Its Uses in Fitting Data. Technometrics, 21, 201-214.

[4] Sarabia, J.M. (1997) A Hierarchy of Lorenz Curves Based on the Generalized Tukey’s Lambda Distribution. Econometric Reviews, 16, 305-320. https://doi.org/10.1080/07474939708800389

[5] Girone, G., et al. (2020) Mean Difference and Mean Deviation of Tukey Lambda Distribution. Applied Mathematics, 11(8). https://doi.org/10.4236/am.2020.118051

[6] Tukey, J. (1960) The Practical Relationship Between the Common Transformations of Percentages of Counts and Amounts. Technical Report 36, Statistical Techniques Research Group, Princeton University.

[7] Hastings, C., et al. (1947) Low Moments for Small Samples: A Comparative Study of Order Statistics. Annals of Mathematical Statistics, 18, 413-426.

[8] Ramberg, J. and Schmeiser, B. (1972) An Approximate Method for Generating Symmetric Random Variables. Communications of the ACM, 15, 987-990. https://doi.org/10.1145/355606.361888

[9] Ramberg, J., et al. (1979) A Probability Distribution and Its Uses in Fitting Data. Technometrics, 21(2), 201-214.
