
The **beta-binomial distribution** is a discrete probability distribution that uses the beta distribution as a prior distribution for the probability of success in a binomial experiment. While the binomial distribution has a fixed probability of success, in the beta-binomial the probability of success is itself random and can vary from one experiment (one set of *n* trials) to the next, which makes it a more flexible distribution than the binomial.

This simple Bayesian model has been used for decades to make informed predictions in fields such as cognitive science, epidemiology, intelligence testing and marketing.

## Beta-binomial distribution

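The probability mass function (PMF) for the beta-binomial distribution is:

$$P(X = x) = \binom{n}{x} \frac{B(x + \alpha,\; n - x + \beta)}{B(\alpha, \beta)}$$

where *x* ∈ { 0, …, *n* } and B is the beta function.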
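As a minimal sketch, the PMF can be evaluated in Python using only the standard library. The function names `log_beta` and `beta_binomial_pmf` and the example parameters are illustrative assumptions, not part of any particular package. Working with log-gamma values is one way to avoid overflow for large *n*.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Natural log of the beta function B(a, b), via log-gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(x, n, alpha, beta):
    """P(X = x) = C(n, x) * B(x + alpha, n - x + beta) / B(alpha, beta)."""
    if not 0 <= x <= n:
        return 0.0
    # Log of the binomial coefficient C(n, x).
    log_choose = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_choose
               + log_beta(x + alpha, n - x + beta)
               - log_beta(alpha, beta))


# Hypothetical example: n = 10 trials, alpha = 2, beta = 3.
probs = [beta_binomial_pmf(x, 10, 2.0, 3.0) for x in range(11)]
total = sum(probs)  # should equal 1 up to floating-point rounding
```

Summing the probabilities over all *x* from 0 to *n* is a quick sanity check that the parameters were passed in the intended order.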

Two shape parameters, α > 0 and β > 0, define the beta distribution from which the probability of success is drawn.

- As α and β grow large with the ratio α/(α + β) held fixed, the distribution approaches a binomial distribution with *p* = α/(α + β). In other words, the binomial distribution is the limiting distribution.
- When α and β are both equal to 1, the distribution is a discrete uniform distribution from 0 to *n*. This is because a beta(1, 1) distribution is uniform on [0, 1] and the beta function B(1, 1) equals 1, so every number of successes from 0 to *n* has the same probability, 1/(*n* + 1).
- When *n* = 1, the distribution is a Bernoulli distribution (which models a single trial) with *p* chosen from a beta distribution; the probability of success is the beta mean, α/(α + β).
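The uniform and Bernoulli special cases can be checked by hand. The short derivation below assumes the standard beta-binomial PMF, C(*n*, *x*) B(*x* + α, *n* − *x* + β) / B(α, β), and the identity B(*a*, *b*) = Γ(*a*)Γ(*b*) / Γ(*a* + *b*).

$$
\begin{aligned}
\alpha = \beta = 1:\quad P(X = x) &= \binom{n}{x} \frac{B(x + 1,\; n - x + 1)}{B(1, 1)}
= \frac{n!}{x!\,(n - x)!} \cdot \frac{x!\,(n - x)!}{(n + 1)!} = \frac{1}{n + 1} \\
n = 1:\quad P(X = 1) &= \frac{B(\alpha + 1,\; \beta)}{B(\alpha, \beta)}
= \frac{\Gamma(\alpha + 1)\,\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\alpha + \beta + 1)} = \frac{\alpha}{\alpha + \beta}
\end{aligned}
$$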

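The cumulative distribution function (CDF) is

$$F(x) = \begin{cases} 0, & x < 0 \\ \dbinom{n}{x} \dfrac{B(x + \alpha,\; n - x + \beta)}{B(\alpha, \beta)} \; {}_3F_2(a; b; x), & 0 \le x < n \\ 1, & x \ge n \end{cases}$$

where ₃*F*₂(*a*; *b*; *x*) is the generalized hypergeometric function ₃*F*₂(1, −*x*, *n* − *x* + β; *n* − *x* + 1, 1 − *x* − α; 1).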
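In practice the CDF is often easier to compute by adding up PMF terms than by evaluating a hypergeometric function. The sketch below takes that swapped-in approach: it starts from P(X = 0) and uses the ratio of consecutive PMF terms. The function name `beta_binomial_cdf` is a hypothetical choice for illustration.

```python
from math import exp, lgamma


def beta_binomial_cdf(x, n, alpha, beta):
    """P(X <= x), accumulated term by term from the PMF."""
    if x < 0:
        return 0.0
    if x >= n:
        return 1.0
    # P(X = 0) = B(alpha, n + beta) / B(alpha, beta), written with log-gamma.
    term = exp(lgamma(n + beta) + lgamma(alpha + beta)
               - lgamma(n + alpha + beta) - lgamma(beta))
    total = term
    for k in range(int(x)):
        # Ratio of consecutive PMF terms: P(X = k + 1) / P(X = k).
        term *= (n - k) * (k + alpha) / ((k + 1) * (n - k - 1 + beta))
        total += term
    return total


# Hypothetical example: with alpha = beta = 1 and n = 4 the distribution is
# uniform on {0, ..., 4}, so P(X <= 1) is 2/5.
uniform_check = beta_binomial_cdf(1, 4, 1.0, 1.0)
```

The loop costs one multiplication per term, which is usually adequate for moderate *n*.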

The mean of the beta-binomial distribution is *n*α / (α + β).

The variance is the product of two terms [2]:

- The first term, *n*αβ / (α + β)², is the variance of a binomial distribution with the same expected value as the beta-binomial distribution.
- The second term, (α + β + *n*) / (α + β + 1), is a multiplier greater than 1 for *n* > 1.

This means that a beta-binomial distribution with *n* > 1 always has a larger variance than a binomial distribution with the same expected value and number of trials.
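The mean and the two variance terms are simple enough to compute directly. The following is a small sketch; the function name `beta_binomial_moments` and the example parameters are assumptions chosen for illustration.

```python
def beta_binomial_moments(n, alpha, beta):
    """Return (mean, variance, multiplier) for beta-binomial(n, alpha, beta)."""
    s = alpha + beta
    mean = n * alpha / s
    # Variance of a binomial(n, p) distribution with p = alpha / (alpha + beta).
    binomial_var = n * alpha * beta / s ** 2
    # Overdispersion multiplier; equals 1 when n = 1, exceeds 1 when n > 1.
    multiplier = (s + n) / (s + 1)
    return mean, binomial_var * multiplier, multiplier


# Hypothetical example: n = 10, alpha = 2, beta = 3.
# The mean is 4, the binomial variance is 2.4, and the multiplier is 2.5,
# so the beta-binomial variance is about 6.
mean, variance, multiplier = beta_binomial_moments(10, 2.0, 3.0)
```

In this example the beta-binomial variance is two and a half times the variance of a binomial distribution with the same mean and number of trials.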

## Deriving the beta-binomial distribution formula

The beta-binomial(*n*, α, β) distribution is generated by choosing probability *p* for a binomial(*n, p*) distribution from a beta(α, β) distribution. You can think of it as a combination of both the binomial and beta distributions.
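This two-stage description translates directly into a simulation. The sketch below uses Python's standard `random` module; the helper name `sample_beta_binomial`, the seed, and the parameter values are illustrative assumptions.

```python
import random


def sample_beta_binomial(n, alpha, beta, rng):
    """Draw one beta-binomial(n, alpha, beta) value in two stages."""
    # Stage 1: draw the success probability p from beta(alpha, beta).
    p = rng.betavariate(alpha, beta)
    # Stage 2: count successes in n Bernoulli(p) trials, i.e. binomial(n, p).
    return sum(rng.random() < p for _ in range(n))


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_beta_binomial(10, 2.0, 3.0, rng) for _ in range(20000)]

sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)
```

With these parameters the sample mean should land near *n*α / (α + β) = 4, and the sample variance should be noticeably larger than the binomial variance of 2.4.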

Let's say you have *m* items on a test, and each item is tested *n* times. The binomial distribution PMF for the *i*th item is:

$$P(x_i \mid n, p_i) = \binom{n}{x_i}\, p_i^{x_i} (1 - p_i)^{n - x_i}$$

where

- *P* = binomial probability,
- *x*ᵢ = total number of "successes" (a pass, heads, etc.) for the *i*th item,
- *p*ᵢ = probability of a success on an individual trial for the *i*th item,
- *n* = number of trials.

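You can also think of *p* as being randomly drawn from a beta distribution. To create the beta-binomial formula, combine the binomial distribution PMF with the probability density function (PDF) of the beta distribution (dropping the item subscript *i*)

$$f(p \mid \alpha, \beta) = \frac{p^{\alpha - 1} (1 - p)^{\beta - 1}}{B(\alpha, \beta)}, \quad 0 < p < 1,$$

and integrate the product over *p* to get the PMF of *x*:

$$P(x \mid n, \alpha, \beta) = \int_0^1 \binom{n}{x}\, p^{x} (1 - p)^{n - x} \, \frac{p^{\alpha - 1} (1 - p)^{\beta - 1}}{B(\alpha, \beta)} \, dp$$

Because the integral of *p*^(*x* + α − 1) (1 − *p*)^(*n* − *x* + β − 1) over [0, 1] is the beta function B(*x* + α, *n* − *x* + β), this can also be written (using beta distribution properties) as:

$$P(x \mid n, \alpha, \beta) = \binom{n}{x} \frac{B(x + \alpha,\; n - x + \beta)}{B(\alpha, \beta)}$$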
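One way to gain confidence in the derivation is to integrate the product of the binomial PMF and the beta PDF numerically and compare it with the closed form C(*n*, *x*) B(*x* + α, *n* − *x* + β) / B(α, β). The sketch below uses a plain midpoint rule; the function names and the step count are assumptions, and the midpoint rule converges slowly when α < 1 or β < 1 because the beta density is unbounded at an endpoint.

```python
from math import comb, exp, lgamma


def beta_fn(a, b):
    """Beta function B(a, b) computed through log-gamma."""
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))


def pmf_closed_form(x, n, alpha, beta):
    return comb(n, x) * beta_fn(x + alpha, n - x + beta) / beta_fn(alpha, beta)


def pmf_by_integration(x, n, alpha, beta, steps=20000):
    """Midpoint-rule approximation of the integral over p in (0, 1)."""
    h = 1.0 / steps
    norm = beta_fn(alpha, beta)
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * h  # midpoint of the i-th subinterval
        binomial_pmf = comb(n, x) * p ** x * (1 - p) ** (n - x)
        beta_pdf = p ** (alpha - 1) * (1 - p) ** (beta - 1) / norm
        total += binomial_pmf * beta_pdf
    return total * h


# Hypothetical example: x = 3 successes, n = 10, alpha = 2, beta = 3.
exact = pmf_closed_form(3, 10, 2.0, 3.0)
approx = pmf_by_integration(3, 10, 2.0, 3.0)
```

For smooth integrands such as this one, the two values should agree to many decimal places.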

## Difference between the binomial and beta-binomial distribution

The major difference between a binomial distribution and a beta-binomial distribution is that in a binomial distribution, *p* is fixed; in a beta-binomial, *p* is not fixed but is drawn at random from a beta distribution, so it varies from one set of *n* trials to the next. One benefit is that the beta-binomial distribution can be used to model data that is overdispersed, which means that the variance is greater than a binomial model with the same mean would predict. This can happen in many situations, such as when a dataset contains more extreme counts than would be expected if the data followed a binomial distribution.

## References

[1] Nschuma, CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0, via Wikimedia Commons

[2] Beta-binomial (n, α, β) distribution. Retrieved July 14, 2023 from: https://www.acsu.buffalo.edu/~adamcunn/probability/betabinomial.html