Binomial distribution


What is a binomial distribution?

The binomial distribution gives the probability of getting a given number of successes in a fixed number of trials, where each trial either succeeds or fails. Success and failure are mutually exclusive outcomes, which means you get one or the other, but never both at the same time. For example, you either win the lottery or you don’t, a drug to cure a disease works or it doesn’t, and a test results in a pass or a fail. Binomial distributions help us to analyze the probability of events such as these.

How to figure out if an experiment is binomial

The binomial distribution formula

Your experiment must meet the following three criteria in order to use the binomial distribution formula:

  1. Fixed number of observations or trials. You can have 3 trials or a million, but not an unlimited number of attempts.
  2. Independent observations or trials. In other words, one coin toss or test attempt (or whatever it is you are measuring) should not affect the next attempt.
  3. Same probability of success from one trial to another. For example, the probability of flipping heads remains at 50% from trial to trial.

The binomial distribution formula is:

b(x; n, P) = nCx * P^x * (1 – P)^(n – x)

Where:

  • b = binomial probability
  • nCx = number of combinations of x successes among n trials, given by the combinations formula nCx = n! / (x!(n – x)!)
  • x = total number of “successes”
  • P = probability of a success on a single attempt
  • n = number of attempts or trials.
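The definitions above translate almost directly into code. The following is a minimal sketch in Python, assuming the standard form of the formula (nCx times P^x times (1 – P)^(n – x)); the function name binomial_pmf is made up for this illustration, and p plays the role of P.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials.

    p is the probability of success on a single trial (P in the text).
    binomial_pmf is an illustrative name, not a library function.
    """
    if not 0 <= x <= n:
        return 0.0  # impossible counts have zero probability
    # comb(n, x) is nCx = n! / (x!(n - x)!)
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Sanity check: the probabilities of all possible counts add up to 1.
total = sum(binomial_pmf(x, 5, 0.3) for x in range(6))
print(round(total, 12))
```

Returning 0 for counts outside 0 to n is a design choice for this sketch; raising an error would be just as reasonable.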

The binomial distribution formula can also be written with factorials, with p as the probability of success on a single trial:

P(x) = n! / (x!(n – x)!) * p^x * (1 – p)^(n – x)

Example (using alternate formula): A fair coin is tossed 10 times. What is the probability of getting six heads?

  • Number of trials (n) = 10
  • Probability of success (“heads”) is 50%, so p = 0.5 and 1 – p = 0.5
  • x = 6 (we want to know the probability of getting 6 heads)
  • P(x = 6) = 10C6 * 0.5^6 * 0.5^4 = 210 * 0.015625 * 0.0625 = 0.205078125, or about 20.51%.
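If you want to check the coin toss arithmetic yourself, the factorial version of the formula can be sketched in a few lines of Python. The variable names here are chosen just for this example.

```python
from math import factorial

n, x, p = 10, 6, 0.5  # 10 tosses, 6 heads wanted, fair coin

# Number of ways to choose which 6 of the 10 tosses come up heads:
# n! / (x!(n - x)!)
ways = factorial(n) // (factorial(x) * factorial(n - x))

# Multiply by the probability of any one such sequence of tosses.
prob = ways * p**x * (1 - p) ** (n - x)

print(ways, prob)  # 210 0.205078125
```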

Relationship to Bernoulli Distribution

The binomial distribution is closely related to the Bernoulli distribution: the binomial distribution with n = 1 is a Bernoulli distribution. In addition, if you run n independent Bernoulli trials with the same probability of success, the number of successes follows a binomial distribution.

A Bernoulli distribution describes a single Bernoulli trial. Each trial has two possible outcomes: success or failure. In each trial, the probability of success, P(S) = p, is the same. The probability of failure is 1 minus the probability of success: P(F) = 1 – p (the two probabilities must add up to 1, because one of the two outcomes always occurs). Finally, all Bernoulli trials are independent of each other: the probability of success doesn’t change from trial to trial, even if you have information about the other trials’ outcomes.
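One way to see the link between the two distributions is to simulate it. The sketch below is an illustration only: it adds up 10 simulated Bernoulli trials many times over and compares how often exactly 6 successes occur with the value given by the binomial formula. The helper names, the seed, and the number of runs are all arbitrary choices for this example.

```python
import random
from math import comb


def bernoulli_trial(p, rng):
    """Return 1 (success) with probability p, otherwise 0 (failure)."""
    return 1 if rng.random() < p else 0


def count_successes(n, p, rng):
    """Add up n independent Bernoulli trials that share the same p."""
    return sum(bernoulli_trial(p, rng) for _ in range(n))


rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
n, p, runs = 10, 0.5, 20_000
counts = [count_successes(n, p, rng) for _ in range(runs)]

observed = counts.count(6) / runs                 # simulated frequency
expected = comb(n, 6) * p**6 * (1 - p) ** (n - 6)  # binomial formula
print(f"observed {observed:.3f}, binomial formula {expected:.3f}")
```

The two numbers should be close but not identical, since the simulated frequency carries sampling error.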

Lexian distribution

A Lexian distribution is the name given to the binomial distribution (k, p) when p is not constant [1]. One way to interpret the distribution is as a special case of a mixture of binomial distributions [2]: the population is made up of subsets, each of which follows a binomial distribution with its own probability of success.

The mean of the Lexian distribution is [3]

mean = n * p̄

Where

  • n is the number of trials
  • p̄ is the average value of the probabilities of success across the distinct binomial distributions.

The Lexian variance is

variance = n * p̄ * (1 – p̄) + n(n – 1) * var(p)

Where

  • var(p) is the variance of the probabilities of success across the distinct binomial distributions.

As a consequence, if mixed binomial variables are treated as pure binomials, the mean would be correct but the variance would be underestimated when using the “binomial estimator” np(1 – p) [4].
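To get a feel for how large the underestimate can be, here is a small Python sketch. It assumes the commonly quoted Lexian formulas, mean = n * p̄ and variance = n * p̄ * (1 – p̄) + n(n – 1) * var(p), and an equally weighted mixture of three binomials. The three probabilities are invented for the illustration, and var(p) is taken as the population variance of those values.

```python
from math import comb
from statistics import fmean, pvariance

n = 10
ps = [0.2, 0.5, 0.8]  # hypothetical success probabilities, equally weighted

p_bar = fmean(ps)      # average probability of success
var_p = pvariance(ps)  # population variance of the probabilities (assumption)

lexian_mean = n * p_bar
lexian_var = n * p_bar * (1 - p_bar) + n * (n - 1) * var_p
naive_var = n * p_bar * (1 - p_bar)  # the "binomial estimator"


def mixture_pmf(x):
    """Exact probability of x successes under the equally weighted mixture."""
    return fmean([comb(n, x) * p**x * (1 - p) ** (n - x) for p in ps])


# Cross-check the formulas against the mixture's exact mean and variance.
exact_mean = sum(x * mixture_pmf(x) for x in range(n + 1))
exact_var = sum((x - exact_mean) ** 2 * mixture_pmf(x) for x in range(n + 1))

print(round(lexian_mean, 6), round(lexian_var, 6), round(naive_var, 6))
print(round(exact_mean, 6), round(exact_var, 6))
```

With these made-up inputs, the mean is 5 either way, but the Lexian variance works out to 7.9 while the binomial estimator gives only 2.5.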

History of the Lexian Distribution

The Lexian distribution is named after German economist Wilhelm Lexis, who published several papers on mixture distributions between 1875 and 1879. The basis of his work was to test for the structure of a set of data by comparing its actual variance to the theoretical binomial variance through a “Lexis Ratio”: the standard deviation from the data, divided by the theoretical binomial standard deviation [5].
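The ratio described above can be sketched in Python as follows. This is a rough illustration under stated assumptions rather than Lexis's own procedure: every group is assumed to have the same number of trials n, p is estimated by pooling all groups, the population standard deviation is used for the data, and both the function name and the data are hypothetical.

```python
from math import sqrt
from statistics import fmean, pstdev


def lexis_ratio(successes, n):
    """Standard deviation of the data over the binomial standard deviation.

    successes: number of successes observed in each group of n trials.
    lexis_ratio is an illustrative name, not a library function.
    """
    p_hat = fmean(successes) / n                 # pooled estimate of p
    binomial_sd = sqrt(n * p_hat * (1 - p_hat))  # theoretical binomial value
    return pstdev(successes) / binomial_sd       # population SD (assumption)


# Hypothetical data: successes out of n = 10 trials in eight groups.
data = [2, 9, 5, 1, 8, 4, 10, 3]
print(round(lexis_ratio(data, 10), 3))
```

A ratio near 1 is what you would expect from a pure binomial. A ratio well above 1, as with this made-up data, points to more spread than a single constant p would produce.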

References

[1] Haight, F. (1958). Index to the Distributions of Mathematical Statistics. National Bureau of Standards Report.

[2] Suchindran, C. M. (1981). A reply to Avery and Hakkert. Population Studies, 35(5), 473–475. https://doi.org/10.1080/00324728.1981.11878519

[3] Johnson, N. L. (1969). Discrete Distributions. Houghton Mifflin Company, Boston.

[4] Coppens, F. et al. (2007). The performance of credit rating systems in the assessment of collateral used in Eurosystem monetary policy operations. National Bank of Belgium. Online: http://aei.pitt.edu/7612/1/wp118En.pdf

[5] Bensman, S. (2005). Urquhart’s Law: Probability and the Management of Scientific and Technical Journal Collections Part 1. The Law’s Initial Formulation and Statistical Bases. Haworth Press. doi:10.1300/J122v26n01_04
