Bernoulli Distribution


What is a Bernoulli distribution?

A Bernoulli distribution is a discrete probability distribution for a Bernoulli trial — a random experiment with two possible outcomes. A discrete probability distribution has a countable or finite number of possible outcomes. For example, you could toss a coin six times, or count how many red cars pass by you on the highway in one hour.

The distribution is named after Jacob Bernoulli (also known as James Bernoulli, 1654-1705), a Swiss mathematician who wrote the first major work on probability, called Ars Conjectandi (published posthumously in 1713) [1]. The book included details on the principles of counting and a proof of the binomial theorem [2].

The Bernoulli distribution formula

The outcomes of a Bernoulli trial are called successes and failures. For example, if a coin lands on heads, it’s a “success” and if it lands on tails, that’s a “failure.” However, a failure here doesn’t mean the experiment failed in the usual sense of the word; it just means you didn’t get the result you were looking for. For example, if you are recording whether frogs are present in ponds and you come across a pond without a frog, that’s a failure. A success in this case would be a pond with frogs.

The probability of success is p and the probability of failure is q (q is sometimes written as 1 – p instead). These probabilities add up to 1:

p + q = 1

For example, let’s say the probability of you finding frogs in a pond is p = 0.4. The probability of a failure — not finding frogs — is q = 1 – 0.4 = 0.6.

A Bernoulli distribution with p = 0.4.
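The frog example can be checked with a short simulation. This is only a minimal sketch: the helper name `bernoulli_trial`, the seed, and the number of simulated ponds are assumptions made for illustration, not part of the original example.

```python
import random

def bernoulli_trial(p, rng):
    """Return 1 (success) with probability p, otherwise 0 (failure)."""
    return 1 if rng.random() < p else 0

p = 0.4            # probability of finding frogs in a pond (from the example above)
q = 1 - p          # probability of not finding frogs

rng = random.Random(0)  # fixed seed so the run is repeatable (assumed value)
n_ponds = 100_000       # hypothetical number of simulated ponds
successes = sum(bernoulli_trial(p, rng) for _ in range(n_ponds))
observed = successes / n_ponds

print(f"p = {p}, q = {q:.1f}")
print(f"observed share of ponds with frogs: {observed:.3f}")
```

With this many simulated ponds, the observed share of successes should land close to p = 0.4, and the share of failures close to q = 0.6.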

Bernoulli distribution vs. binomial distribution

A Bernoulli distribution is a special case of the binomial distribution with a single trial (e.g., the toss of one coin). In other words, it is the binomial distribution with n = 1.

A binomial distribution tells us the probability of achieving a certain number of successful outcomes in a sequence of n Bernoulli trials. For example, the probability of flipping 5 heads in 10 coin flips can be represented by a binomial distribution. This means that while the Bernoulli distribution has 2 possible outcomes, the binomial distribution has n + 1 possible outcomes, because the number of successes can be any whole number from 0 to n.

Feature | Bernoulli distribution | Binomial distribution
Number of trials | 1 | n
Possible outcomes | 2 (success or failure) | n + 1 (0 to n successes)
Table of main differences between the Bernoulli and binomial distributions.
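The relationship between the two distributions can be sketched in a few lines of Python. The function name `binomial_pmf` is a hypothetical helper written for this illustration; it uses the standard binomial formula and `math.comb` from the standard library.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# With n = 1 the binomial PMF reduces to the Bernoulli PMF.
p = 0.4
print(binomial_pmf(1, 1, p))  # probability of success in one trial
print(binomial_pmf(0, 1, p))  # probability of failure in one trial

# Probability of exactly 5 heads in 10 fair coin flips.
print(binomial_pmf(5, 10, 0.5))

# The number of successes can be 0, 1, ..., n, and the probabilities sum to 1.
n = 10
total = sum(binomial_pmf(k, n, 0.5) for k in range(n + 1))
print(total)
```

The probability of exactly 5 heads in 10 fair flips works out to 252/1024, or roughly 0.246.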

Properties of the Bernoulli distribution

Bernoulli distributions have several important properties:

  1. The probability of success is the same for each trial. If you flip a coin 100 times, 10 times or just once, the probability of getting a head on any given flip is the same: 0.5.
  2. The trials are independent: the result of one trial doesn’t affect the outcome of another. It doesn’t matter if you got heads or tails on the previous flips — it doesn’t affect the outcome of the next flip.
  3. The possible outcomes are binary: exactly two possible outcomes exist for each trial. If you flip a coin, the possible outcomes are “heads” and “tails.” For choosing a black ball from an urn with blue, black, red and green balls, the outcomes are “black” or “not black.”

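The probability mass function (PMF) for the Bernoulli distribution is $P(X = x) = p^x (1 - p)^{1 - x}$ for $x \in \{0, 1\}$, which can be written as:

$$P(X = x) = \begin{cases} p & \text{if } x = 1 \\ 1 - p & \text{if } x = 0 \end{cases}$$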
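As a minimal sketch, the PMF can be written as a small Python function. The name `bernoulli_pmf` and the input checks are assumptions added for illustration.

```python
def bernoulli_pmf(x, p):
    """P(X = x) for a Bernoulli(p) variable; x must be 0 or 1."""
    if x not in (0, 1):
        raise ValueError("x must be 0 (failure) or 1 (success)")
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    return p**x * (1 - p) ** (1 - x)

p = 0.4
print(bernoulli_pmf(1, p))  # equals p
print(bernoulli_pmf(0, p))  # equals 1 - p
```

Setting x = 1 leaves only the factor p, and setting x = 0 leaves only 1 - p, which is why the single formula covers both cases.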

The expected value for a Bernoulli random variable X is:
E[X] = p.
For example, if p = 0.03, then E[X] = 0.03.
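The expected value follows directly from the definition of expectation for a discrete variable. The short derivation below is a sketch that also gives the variance, using the standard identity Var(X) = E[X²] - (E[X])² and writing q = 1 - p as elsewhere in this article.

```latex
\begin{aligned}
E[X] &= \sum_{x \in \{0,1\}} x \, P(X = x) = 0 \cdot (1 - p) + 1 \cdot p = p, \\
E[X^2] &= 0^2 \cdot (1 - p) + 1^2 \cdot p = p, \\
\operatorname{Var}(X) &= E[X^2] - (E[X])^2 = p - p^2 = p(1 - p) = pq.
\end{aligned}
```

For p = 0.03, this gives a variance of 0.03 × 0.97 = 0.0291.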

The mean (usually denoted as the Greek letter μ) of a single Bernoulli trial is the same as the probability of success, p. The same approach works for the number of successes over several trials. Let’s look at an example of finding the mean for the number of successes in two Bernoulli trials with p = ½ each (for instance, the number of heads in two coin tosses), with the following data [3]:

Number of successes (x) | 0 | 1 | 2
Probability P(x) | ¼ | ½ | ¼

To find the mean, multiply each number of successes by its probability. Then find the sum:

Mean (μ) = (0 * ¼) + (1 * ½) + (2 * ¼) = 1.

Calculate the variance (usually denoted as σ²) by multiplying the squared distance from the mean by the probability of each number of successes, then summing:

σ² = (0 - 1)² * ¼ + (1 - 1)² * ½ + (2 - 1)² * ¼ = ½
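The same arithmetic can be reproduced in Python. This is a sketch that uses exact fractions to avoid rounding; the variable name `distribution` is an assumption, and the values are the ones used in the calculation above.

```python
from fractions import Fraction

# Number of successes and their probabilities, as used in the calculation above.
distribution = {
    0: Fraction(1, 4),
    1: Fraction(1, 2),
    2: Fraction(1, 4),
}

assert sum(distribution.values()) == 1  # probabilities must add up to 1

mean = sum(x * prob for x, prob in distribution.items())
variance = sum((x - mean) ** 2 * prob for x, prob in distribution.items())

print(mean)      # 1
print(variance)  # 1/2
```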

Bernoulli Trials explained

A Bernoulli trial is one of the simplest experiments you can conduct. It’s an experiment where there are two possible outcomes, like “Yes” and “No.” A few examples:

  • Coin tossing: does the coin land heads up or tails up?
  • Births: is a newborn a boy or a girl?
  • Rolling dice: does a roll show a chosen result (such as double sixes) or not?

Bernoulli trials are typically described in terms of success and failure. However, ‘success’ here doesn’t mean achievement in the traditional sense, but rather points to an outcome of interest. For example, if you want to know the daily number of boys born, a boy’s birth would be labeled as a ‘success’ while a girl’s birth would be labeled a ‘failure.’ Similarly, getting double sixes on a series of dice rolls could be a ‘success,’ while any other result would be a ‘failure.’

One of the most important aspects of Bernoulli trials is that each trial must be independent: the outcome of one trial must not change the probabilities for the next. For example, if I win with a scratch-off lottery ticket, my odds change if I buy another ticket, because one fewer winning ticket is on the market. Dependent events such as drawing lotto balls come with different probabilities depending on how many balls remain in play: when there are 100 left, there is a 1/100 chance of drawing a certain numbered ball, but when only ten balls are left, the probability increases to 1/10.

Here are some examples of independent Bernoulli trials:

  1. Tossing a coin twice. The result of the first toss does not influence the outcome of the second toss.
  2. Rolling a die twice. The result of the first roll has no impact on the outcome of the subsequent roll.
  3. Drawing a card from a deck of cards twice, returning the first card and reshuffling before the second draw. The result of the first draw then does not influence the outcome of the next draw.

An example of a trial that isn’t independent (i.e., it’s dependent) is drawing two cards from a deck without replacing the first one. The outcome of the first draw influences the second because the deck now has one less card.
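The difference between independent and dependent draws can be made concrete with a short calculation. This sketch assumes a standard 52-card deck with 4 aces and treats "drawing an ace" as the success; both choices are for illustration only.

```python
from fractions import Fraction

deck_size = 52
aces = 4

# First draw: probability of an ace.
p_first = Fraction(aces, deck_size)

# Second draw WITH replacement: the deck is restored, so the probability is unchanged.
p_second_with_replacement = Fraction(aces, deck_size)

# Second draw WITHOUT replacement, given the first card was an ace:
# one ace and one card are gone, so the probability changes.
p_second_without_replacement = Fraction(aces - 1, deck_size - 1)

print(p_first)                       # 1/13
print(p_second_with_replacement)     # 1/13
print(p_second_without_replacement)  # 1/17
```

Only the draws with replacement keep the same success probability from trial to trial, which is what a sequence of Bernoulli trials requires.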

Is a coin flip binomial or Bernoulli?

Tossing a coin is an example of a Bernoulli trial, an experiment with only two potential outcomes, in this case, heads or tails. The Bernoulli distribution outlines the likelihood of achieving either heads or tails in a single coin flip. On the other hand, the binomial distribution gives us the probability of getting a specific number of successes (such as 5 heads) in a sequence of n Bernoulli trials. Therefore, flipping a coin once is a Bernoulli trial: it involves a single trial with two potential outcomes, which is the binomial distribution in its simplest form, n = 1.

However, if you were to flip a coin 10 times, the distribution of the number of heads would follow a binomial distribution.

Is rolling a die Bernoulli?

Rolling a die isn’t, by itself, a Bernoulli trial.

That’s because a Bernoulli trial only has two possible outcomes. When you roll a die, six outcomes are possible (1, 2, 3, 4, 5, and 6). This is more than the two-outcome limit of a Bernoulli trial. However, if we define ‘success’ as rolling a specific number, such as 6, then rolling a die can be viewed as a Bernoulli trial as it now has two potential outcomes: rolling a 6 or not rolling a 6.

Therefore, while rolling a die is not typically a Bernoulli trial, it can be considered one if ‘success’ is defined in a particular manner.
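One way to picture this is to map each die roll to a two-valued outcome. The sketch below defines success as rolling a 6; the helper name `is_six`, the seed, and the number of simulated rolls are assumptions for illustration.

```python
import random

def is_six(roll):
    """Map a die roll (1-6) to a Bernoulli outcome: 1 if it is a 6, else 0."""
    return 1 if roll == 6 else 0

rng = random.Random(1)   # fixed seed so the run is repeatable (assumed value)
n_rolls = 60_000         # hypothetical number of simulated rolls
rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
outcomes = [is_six(r) for r in rolls]

p = 1 / 6
print(f"theoretical p = {p:.4f}")
print(f"observed share of sixes = {sum(outcomes) / n_rolls:.4f}")
```

The six-sided roll has six outcomes, but the mapped variable has only two, with success probability 1/6 for a fair die.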

Advantages and disadvantages of Bernoulli trials

Bernoulli trials serve as a straightforward and adaptable tool for modeling a wide array of phenomena. They offer several advantages such as:

  1. Their simplicity makes them easy to understand and apply, even for those without a statistical background. With only two potential outcomes – success and failure – Bernoulli trials are fundamentally basic.
  2. Estimating the probability of success in a Bernoulli trial is relatively easy: the proportion of successes in a sample serves as the estimate, although estimates from small samples are less precise. This quality makes them a practical instrument for modeling phenomena where data collection may be challenging or costly.
  3. Their broad applicability allows Bernoulli trials to model a vast range of phenomena, from coin tosses to predicting consumer behavior, making them a flexible tool for various purposes.

Despite these advantages, Bernoulli trials do have certain limitations:

  1. They are restricted to two possible outcomes, which can limit their use in modeling phenomena with more than two potential outcomes.
  2. Bernoulli trials operate under the assumption that different trial outcomes are independent, which might not always be a realistic assumption for certain phenomena.
  3. Being discrete, a Bernoulli variable takes only the values 0 and 1, so it cannot directly model quantities measured on a continuous scale.

What is the Bernoulli distribution used for in real life?

Bernoulli distributions find use in multiple fields, such as:

  1. Machine Learning: In machine learning, Bernoulli distributions model the probability of a binary class label.
  2. Clinical trials: the Bernoulli distribution is sometimes used to model an event like death, a disease, or disease exposure. The model can indicate the probability a person has, or will experience, the event in question.
  3. Engineering: Bernoulli distributions can model the probability of failure of a component. For example, it could be used to model the probability of a light bulb burning out within a certain time period.
  4. Finance: Bernoulli distributions can model the probability of a stock price rising or falling. They could also model the probability of a company declaring bankruptcy.

How do you know when to use a Bernoulli distribution?

When considering the use of the Bernoulli distribution, several factors should be taken into account:

  1. Number of Possible Outcomes: The Bernoulli distribution is limited to two potential outcomes, making it suitable for modeling phenomena with only two possible results. For instance, it can model the probability of flipping a coin and obtaining heads, or the probability of a patient recovering from an illness (or not).
  2. Independence of Trials: The Bernoulli distribution operates under the assumption that trials are independent. This means one trial’s outcome doesn’t influence another’s outcome. While this assumption generally holds true for coin flips or dice rolls, it might not hold for phenomena where one outcome can affect another, such as recovery from a disease among patients in close contact.
  3. Probability of Success: The Bernoulli distribution requires a fixed success probability (e.g., 50% heads). This probability could be estimated from data or provided by a subject matter expert.

If all three conditions hold, then the Bernoulli distribution could be the right model for the phenomenon you’re studying.

Additional considerations when using the Bernoulli distribution include:

  1. Sample Size: Each observation carries only one of two values, so estimates of the success probability from a small sample can be imprecise. For n trials, the standard error of the sample proportion is √(p(1 - p)/n), which shrinks as n grows.
  2. Skewness of Distribution: The Bernoulli distribution is symmetric only when the success and failure probabilities are equal (p = 0.5). If the success probability doesn’t equal the failure probability, the distribution is skewed: to the right when p < 0.5 and to the left when p > 0.5. Strong skewness (p close to 0 or 1) can complicate estimating the success probability from a small sample size.
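Both considerations can be illustrated with a short sketch. The data below are hypothetical, and the helper names `estimate_p` and `bernoulli_skewness` are assumptions; the formulas are the standard ones for the standard error of a sample proportion, √(p(1 - p)/n), and for the skewness of a Bernoulli variable, (1 - 2p)/√(p(1 - p)).

```python
from math import sqrt

def estimate_p(outcomes):
    """Sample proportion of successes and its estimated standard error."""
    n = len(outcomes)
    p_hat = sum(outcomes) / n
    std_err = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, std_err

def bernoulli_skewness(p):
    """Skewness of a Bernoulli(p) variable; zero only when p = 0.5."""
    return (1 - 2 * p) / sqrt(p * (1 - p))

# Hypothetical data: 1 = success, 0 = failure.
small_sample = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
large_sample = small_sample * 100  # same proportion, 100 times as many trials

for name, data in [("small", small_sample), ("large", large_sample)]:
    p_hat, std_err = estimate_p(data)
    print(f"{name}: n = {len(data)}, p_hat = {p_hat:.2f}, std. error = {std_err:.3f}")

print(bernoulli_skewness(0.5))  # symmetric case
print(bernoulli_skewness(0.4))  # right-skewed: success is the rarer outcome
```

Both samples give the same estimate of p, but the larger sample has a standard error ten times smaller, since the standard error scales with 1/√n.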

In conclusion, the Bernoulli distribution is a flexible tool capable of modeling a broad range of phenomena. However, understanding its limitations is essential before applying it.



[1] Bernoulli, J. (1713). Ars Conjectandi.

[2] Empirical Distributions

[3] 1. Normal distribution.
