Logarithmic distribution


The logarithmic distribution, also known as the logarithmic series distribution, is a discrete probability distribution with a long right tail. It is derived from the Maclaurin series expansion of -ln(1 – β) and has a single parameter, β, with 0 < β < 1. This article covers its definition and its applications.

What is the logarithmic distribution?

The logarithmic distribution PMF is only defined at integer values. The connecting lines are guides for the eye [1].

A random variable with a logarithmic distribution, X ~ Log(β), has a probability mass function (PMF) of [2]:

P(X = x) = γ β^x / x, for x = 1, 2, 3, …

where γ = -1 / ln(1 – β) and 0 < β < 1.
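The link to the Maclaurin series can be checked numerically. Since -ln(1 – β) = β + β²/2 + β³/3 + …, dividing each term by -ln(1 – β) gives probabilities γ β^x / x that sum to 1. The following is a minimal sketch, assuming that form of the PMF; the function name log_pmf and the value β = 0.5 are illustrative choices, not part of the original article.

```python
import math

def log_pmf(x, beta):
    """P(X = x) = gamma * beta**x / x for integer x >= 1, with gamma = -1 / ln(1 - beta)."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    if x < 1:
        return 0.0  # the PMF is only defined on the positive integers
    gamma = -1.0 / math.log1p(-beta)  # log1p(-beta) computes ln(1 - beta)
    return gamma * beta ** x / x

beta = 0.5  # illustrative parameter value
probs = [log_pmf(x, beta) for x in range(1, 200)]

total = sum(probs)            # close to 1; the truncated tail is negligible here
ratio = probs[0] / probs[1]   # P(X = 1) / P(X = 2), which equals 2 / beta
```

The ratio of successive probabilities, P(X = x) / P(X = x + 1) = (x + 1) / (x β), shows why the tail is long: for β close to 1 the probabilities decay only slightly faster than 1 / x.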

Williams [3] proposed a modified version that allows for zeroes, called the logarithmic-with-zeroes distribution, which is defined through its probability generating function (PGF).

Logarithmic distribution applications

The logarithmic distribution has been applied in several fields. In ecology, it is used to model species abundance data, that is, how the individuals in a sample from a habitat are divided among species. Under the model, the expected number of species represented by exactly n individuals is proportional to β^n / n, so species represented by a single individual are the most numerous and species with many individuals are progressively scarcer. Fisher, Corbet, and Williams introduced the distribution in 1943 to describe the relation between the number of species and the number of individuals in collections of butterflies and in catches of moths from a light trap [4].
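To use the distribution with abundance data, β has to be estimated from the sample. The following is a minimal sketch, assuming the standard result that the maximum-likelihood estimate of β is the value at which the theoretical mean, -β / ((1 – β) ln(1 – β)), equals the sample mean. The function names and the abundance counts are hypothetical and serve only as an illustration.

```python
import math

def log_mean(beta):
    """Mean of the logarithmic distribution: -beta / ((1 - beta) * ln(1 - beta))."""
    return -beta / ((1.0 - beta) * math.log1p(-beta))

def fit_beta(sample, tol=1e-12):
    """Estimate beta by matching the theoretical mean to the sample mean (bisection)."""
    xbar = sum(sample) / len(sample)
    if xbar <= 1.0:
        raise ValueError("sample mean must exceed 1 for a solution in (0, 1)")
    lo, hi = 1e-12, 1.0 - 1e-12
    # log_mean increases with beta, so bisection narrows in on the root
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if log_mean(mid) < xbar:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical numbers of individuals per species (illustrative, not real data)
abundances = [1, 1, 1, 1, 2, 1, 3, 1, 2, 5, 1, 1, 8, 2, 1]
beta_hat = fit_beta(abundances)
```

Bisection is used here because the mean increases monotonically with β, which guarantees a single root; a library root-finder would work equally well.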

In insurance, the logarithmic distribution is used to model claim frequency. The probability of x claims in a given period is proportional to β^x / x, so it falls as the number of claims rises. One claim is 2/β times as likely as two claims and 3/β² times as likely as three claims; only as β approaches 1 do these ratios approach 2 and 3. The distribution can be used to estimate the probability that the number of claims exceeds a given threshold, which can help companies price their insurance policies.
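The threshold probability mentioned above can be computed by subtracting the first k terms of the PMF from 1. The following is a minimal sketch, assuming the PMF P(X = x) = γ β^x / x with γ = -1 / ln(1 – β); the function name log_tail and the value β = 0.8 are hypothetical.

```python
import math

def log_tail(k, beta):
    """P(X > k): one minus the sum of the PMF over x = 1, ..., k."""
    gamma = -1.0 / math.log1p(-beta)  # log1p(-beta) computes ln(1 - beta)
    cdf = sum(gamma * beta ** x / x for x in range(1, k + 1))
    return 1.0 - cdf

beta = 0.8  # hypothetical parameter, e.g. estimated from past claim counts
p_more_than_3 = log_tail(3, beta)  # probability of more than three claims
```

For very small tail probabilities, summing the tail terms directly is numerically safer than subtracting from 1.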

The logarithmic distribution has also found use in marketing research, where it is used to model customer frequencies. The probability of a count x is proportional to β^x / x, so a count of one is 2/β times as likely as a count of two and 3/β² times as likely as a count of three. This information can be used to estimate the market size and the market share of a particular product or service.

In linguistics, the logarithmic distribution is used to model the frequency distribution of words in a text or corpus, that is, how many distinct words occur exactly once, twice, and so on. Under the model, the proportion of distinct words that occur exactly x times is proportional to β^x / x. Words that occur only once or twice, such as “zygomatic,” therefore make up most of the vocabulary, while a few common words such as “the,” “and,” and “of” account for the very high counts in the long right tail.

Other “Logarithmic” distributions

Sometimes authors use the term “logarithmic distribution” to refer to any probability distribution that involves a logarithm. For example, the lognormal distribution is called a logarithmic distribution in Life Distributions (Springer Series in Statistics, 2007) [5].

Perhaps confusingly, a similarly named distribution, the logarithm distribution, has probabilities that decrease drastically as x increases. The main difference is in the PMF, whose values fall off more rapidly than those of the “usual” logarithmic distribution.

PMF of the logarithm distribution is different from the “usual” logarithmic distribution.

In conclusion, the logarithmic distribution is a useful probability distribution that has a wide range of applications. Its long right tail makes it suitable for modeling phenomena that exhibit rarity or extreme events. The distribution’s versatility and usefulness have made it an essential tool in various fields, including ecology, insurance, marketing research, and linguistics.

References

[1] Qwfp, CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0, via Wikimedia Commons

[2] D. J. Best, J. C. W. Rayner, O. Thas, “Tests of Fit for the Logarithmic Distribution”, Advances in Decision Sciences, vol. 2008, Article ID 463781, 8 pages, 2008. https://doi.org/10.1155/2008/463781

[3] Williams, C.B. (1947). The logarithmic series and its application to biological problems. Journal of Ecology. Vol. 34, No. 2 (Aug., 1947), pp. 253-272 (20 pages)

[4] R. A. Fisher, A. Steven Corbet, C. B. Williams. The Relation Between the Number of Species and the Number of Individuals in a Random Sample of an Animal Population. The Journal of Animal Ecology, Vol. 12, No. 1 (May, 1943), pp. 42-58.

[5] Marshall A.W., Olkin I. (2007) Logarithmic Distributions. In: Life Distributions. Springer Series in Statistics. Springer, New York, NY. https://doi.org/10.1007/978-0-387-68477-2_12
