Geometric distribution

A geometric distribution models how many Bernoulli trials it takes to get the first success (or, equivalently, how many failures come before it), with the probabilities forming a geometric sequence. It is useful when calculating how long you have to wait for a particular event to happen. In this post, we’ll discuss the geometric distribution and how it can be used to understand Bernoulli trials.

What is the Geometric Distribution?

Figure: The geometric distribution PMF for p = 0.17 [1]. The shape of any specific geometric distribution depends on the value of the probability p.

A geometric distribution can be defined as the probability model for the number of trials required to obtain the first success in a sequence of Bernoulli trials. It captures how long it takes to observe the first success in a series of experimental trials with a two-state (binary) outcome, where each trial is independent and has the same probability of success p.

The probability of success in each trial is p, while the probability of failure is 1 − p. The geometric distribution can be stated in two ways: as the number of trials needed to get the first success, or as the number of failures that happen before the first success. The formulas in this article use the first version, where X is the trial on which the first success occurs.

The probability mass function (PMF) for the geometric distribution is

f(x) = (1 − p)^{x − 1} p

For example, let’s say you counted how many scratch-off lottery tickets you needed to buy until you won (a “success”). If your first win came on the 3rd ticket, then X = 3.

If the probability of winning is p = 0.56, the probability that your first win comes on the 4th try (three losses followed by a win) is

f(4) = (1 − p)^{x − 1} p = (1 − 0.56)^{4 − 1} (0.56) = 0.0477.

Example 2: If your probability of getting into the graduate program of your choice is 0.2 on each try, what is the probability that you first succeed on your third try?
Inserting 0.2 for p and 3 for x, the PMF becomes:

  • f(x) = (1 − p)^{x − 1} p
  • P(X = 3) = (1 − 0.2)^{3 − 1} (0.2)
  • P(X = 3) = (0.8)^2 (0.2) = 0.128.
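
The two worked examples above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name geometric_pmf is made up for illustration, and it assumes the convention used in this article, where x is the trial on which the first success occurs.

```python
def geometric_pmf(x: int, p: float) -> float:
    """Probability that the first success occurs on trial x (x = 1, 2, 3, ...)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if x < 1:
        raise ValueError("x must be a positive integer")
    # x - 1 failures in a row, followed by one success
    return (1 - p) ** (x - 1) * p


# Lottery example: p = 0.56, first win on the 4th ticket
lottery = geometric_pmf(4, 0.56)

# Graduate program example: p = 0.2, first acceptance on the 3rd try
grad_program = geometric_pmf(3, 0.2)

print(round(lottery, 4), round(grad_program, 4))
```

Rounded to four decimal places, the two values should match the hand calculations (0.0477 and 0.128).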

Three important assumptions for the geometric distribution are:

  • Two possible outcomes for each trial (denoted “success” or “failure”).
  • Independent trials (one trial has no effect on the probability of the other trials).
  • The probability of success is equal for each trial.
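
One way to see these assumptions at work is to simulate them. The sketch below is an illustration, not a definitive implementation: the helper name trials_until_first_success, the seed, p = 0.2 and the number of runs are all arbitrary choices. Each trial has two outcomes, is drawn independently, and uses the same p, and the simulated frequencies should land close to the PMF values.

```python
import random


def trials_until_first_success(p: float, rng: random.Random) -> int:
    """Run independent Bernoulli(p) trials; return the trial number of the first success."""
    trials = 1
    while rng.random() >= p:  # a failure, which happens with probability 1 - p
        trials += 1
    return trials


rng = random.Random(42)  # fixed (arbitrary) seed so the run is repeatable
p = 0.2
n_runs = 100_000

# Count how often the first success landed on each trial number
counts = {}
for _ in range(n_runs):
    x = trials_until_first_success(p, rng)
    counts[x] = counts.get(x, 0) + 1

for x in range(1, 6):
    empirical = counts.get(x, 0) / n_runs
    theoretical = (1 - p) ** (x - 1) * p
    print(f"x={x}: simulated {empirical:.4f}, formula {theoretical:.4f}")
```

If any of the three assumptions is broken (for example, if p changes from trial to trial), the simulated frequencies will generally stop matching the formula.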

Geometric Distribution and Sequence

The geometric distribution takes its name from the geometric sequence: a sequence that starts with a non-zero number and in which each term is the previous term multiplied by a fixed ratio, so the values increase or decrease geometrically. Here, the sequence gives the probability that the first success occurs on trial k, for k = 1, 2, 3, and so on. Each additional failure before the first success multiplies the probability by the same factor, 1 − p. The sequence follows the formula

q^{k − 1} p

where q = 1 − p is the probability of failure on a single trial, so q^{k − 1} is the probability of failing on all of the first k − 1 trials before succeeding on trial k.
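
To make the "geometric" part concrete, the short sketch below lists the first few probabilities and the ratio between consecutive terms. The value p = 0.17 is borrowed from the PMF figure; the variable names are illustrative only.

```python
p = 0.17   # same p as in the PMF figure
q = 1 - p  # probability of failure on a single trial

# First few terms of the sequence q^(k-1) * p, for k = 1, 2, ..., 6
terms = [q ** (k - 1) * p for k in range(1, 7)]

# Ratio of each term to the one before it; each should equal q (up to rounding)
ratios = [later / earlier for earlier, later in zip(terms, terms[1:])]

for k, term in enumerate(terms, start=1):
    print(f"k={k}: {term:.4f}")
print("ratios:", [round(r, 4) for r in ratios])
```

Every ratio comes out as q, which is exactly what defines a geometric sequence with first term p and common ratio q.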

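Adding up the terms of the sequence gives a geometric series. The geometric series formula is given by [2]

\sum_{k=1}^{\infty} a r^{k-1} = \frac{a}{1 - r}, \quad |r| < 1.

With a = p and r = q = 1 − p, the probabilities add up to p / (1 − q) = 1, as they must for a valid probability distribution.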
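
As a quick numerical check, the sketch below adds up the first n probabilities q^(k−1) p and compares the result with the standard closed form for a finite geometric series, p (1 − q^n) / (1 − q) = 1 − q^n. The value p = 0.2 and the function name partial_sum are arbitrary choices for illustration.

```python
p = 0.2
q = 1 - p


def partial_sum(n: int) -> float:
    """Sum of the first n terms q^(k-1) * p, i.e. P(first success within n trials)."""
    return sum(q ** (k - 1) * p for k in range(1, n + 1))


for n in (1, 5, 10, 50):
    # Closed form for the finite geometric series: 1 - q^n
    print(f"n={n}: summed {partial_sum(n):.6f}, closed form {1 - q ** n:.6f}")
```

As n grows, q^n shrinks toward zero and the partial sums approach 1, so the probabilities over all possible trial counts account for every outcome.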

Real world applications

The geometric distribution can be used in many real-world situations, including web analytics and marketing. For instance, it can model how many times a viewer sees an advertisement before clicking on it or making a purchase, which helps measure the success of a marketing campaign. In sports analytics, it can model the number of attempts a team needs before scoring a goal or basket. It can also be used to compute the number of random draws needed before winning a jackpot. In each case, the model is appropriate only if the attempts are independent and have the same probability of success.

References

[1] Lfahlberg, CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0, via Wikimedia Commons

[2] Kjos-Hanssen, B. (2019). Statistics for Calculus Students. Retrieved April 30, 2021 from: https://people.math.osu.edu/husen.1/teaching/530/series.pdf
