Birthday Distribution (aka The Birthday Paradox)


The birthday distribution describes the probability that two or more people in a group share the same birthday. It is a classic example in probability and statistics of how intuition about chance can mislead. It may surprise you to know that in a class of 23 people, there is a slightly better than 50% chance that at least two of them will share a birthday!

The birthday distribution explained

In the birthday problem, the objective is to determine the probability of at least two people in a group of n individuals having the same birthday. The probability depends on the number of people in the group and the number of days in a year. For a group of n people, the probability of two or more individuals sharing the same birthday can be calculated using the following formula:

$$P(n) = 1 - \frac{365!}{(365 - n)! \, 365^{n}}$$

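Here n is the number of people in the group (with n no larger than 365), 365! is the factorial of 365, and each of the 365 days is assumed to be equally likely as a birthday. The fraction is the probability that all n birthdays are different, so subtracting it from 1 gives the probability of at least one shared birthday.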
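The factorials in this formula are enormous (365! has well over 700 digits), so it is usually easier to evaluate the equivalent product (365/365) × (364/365) × ... × ((365 - n + 1)/365) for the "all different" case. The following is a minimal Python sketch under the same assumption of 365 equally likely birthdays; the function name `birthday_probability` and the `days` parameter are illustrative choices, not part of the original formula.

```python
def birthday_probability(n, days=365):
    """Probability that at least two of n people share a birthday.

    Assumes each of `days` birthdays is equally likely (an idealization).
    """
    if n > days:
        return 1.0  # more people than days forces at least one match
    p_all_distinct = 1.0
    for k in range(n):
        # The (k + 1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


print(round(birthday_probability(23), 3))
print(round(birthday_probability(57), 3))
```

For 23 and 57 people this gives values of roughly 0.507 and 0.99, matching the figures usually quoted for the birthday problem.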

The probability reaches 100% only when the group contains more people than there are possible birthdays: n = 366 under the 365-day formula above, or n = 367 if February 29 is counted as a 366th possible birthday. But with just 57 people in a group, the probability is already about 99%, and with only 23 people (the size of an average classroom) it is just over 50%. This is also why it is so common to see several of your social media contacts celebrating a birthday on the same day!
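These thresholds can be checked numerically. The sketch below searches for the smallest group size that reaches a target probability, again assuming equally likely birthdays; `smallest_group` is a hypothetical helper name introduced only for this illustration.

```python
def smallest_group(target, days=365):
    """Smallest n for which P(shared birthday) is at least `target`.

    Assumes `days` equally likely birthdays and 0 < target <= 1.
    """
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in the interval (0, 1]")
    p_all_distinct = 1.0  # probability that n birthdays are all different
    n = 0
    # Comparing the "all distinct" probability directly avoids rounding
    # problems when the shared-birthday probability is very close to 1.
    while p_all_distinct > 1.0 - target:
        p_all_distinct *= (days - n) / days
        n += 1
    return n


print(smallest_group(0.5))
print(smallest_group(0.99))
print(smallest_group(1.0, days=366))
```

With the default of 365 days, the first two calls give 23 and 57. The last call counts February 29 as a possible birthday and gives 367, the point at which a shared birthday is guaranteed.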

[Figure: birthday distribution [1]]

The birthday paradox is a veridical paradox: a result that seems absurd at first glance but is nonetheless correct. A veridical paradox arises when something is true yet contradicts our expectations or beliefs about reality.

Apart from the birthday paradox, other well-known results and phenomena that defy expectations include the following. Strictly speaking, only the last is a veridical paradox in the mathematical sense; the first two are a perceptual illusion and a psychiatric condition, listed here because they also challenge what we believe to be true.

  1. The McGurk Effect: This is an auditory-visual illusion that occurs when we see a person’s lips move in a way that contradicts the sound we hear. For example, if we see a person say “ga” while we hear “ba,” our brain may perceive a third sound, such as “da.” This situation challenges our belief that our senses are reliable, and that what we see and hear should match.
  2. The Capgras Delusion: This is a psychiatric condition in which a person believes that their loved ones have been replaced by identical imposters. This situation challenges our belief that our perception of people and objects is objective and accurate.
  3. The Monty Hall Problem: This is a probability puzzle set in a game show where a contestant is given a choice between three doors. Behind one door is a prize, while the other two hide nothing of value. After the contestant makes an initial choice, the host, who knows where the prize is, opens one of the other two doors to reveal no prize. The contestant is then given the option to switch to the remaining unopened door. The veridical paradox arises because it seems counterintuitive that switching would increase the probability of winning, yet it does: switching wins with probability 2/3, while staying wins with probability 1/3.
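The Monty Hall result is easy to doubt, so a quick simulation can help. The sketch below assumes the standard rules (the host always opens a door that is neither the contestant's choice nor the prize); the function name `monty_hall_win_rate`, the trial count, and the fixed seed are illustrative assumptions.

```python
import random


def monty_hall_win_rate(switch, trials=100_000, seed=0):
    """Estimate the win rate when the contestant always switches or always stays."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    wins = 0
    for _ in range(trials):
        prize = rng.randrange(3)
        choice = rng.randrange(3)
        # The host opens a door that is neither the chosen door nor the prize.
        # When two such doors exist, the lowest-numbered one is opened here;
        # this simplification does not change the win rates.
        opened = next(d for d in range(3) if d != choice and d != prize)
        if switch:
            # Move to the one remaining unopened door.
            choice = next(d for d in range(3) if d != choice and d != opened)
        wins += choice == prize
    return wins / trials


print(monty_hall_win_rate(switch=True))
print(monty_hall_win_rate(switch=False))
```

The estimated win rates should come out close to 2/3 when switching and close to 1/3 when staying.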

Incorrect Reasoning

The probability of two people in the same class having the same birthday is fairly high, which can seem counterintuitive. There are a couple of common misconceptions about calculating the probabilities. First, the chance of being born on any particular day is not exactly 1/365, because leap years add February 29 as a 366th possible birthday. However, this makes only a small difference. What makes the biggest difference is that the question is not whether someone shares your birthday, but whether any two people in the group share one. In a group of, say, 30 people, every person has to be compared with every other person, which gives 30 × 29 / 2 = 435 pairs, each with a small chance of matching. It’s like a raffle in which each ticket has a 1/100 chance of winning: if 100 people each buy a ticket with those odds, it’s more likely than not that the group will produce a winner.
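The effect of comparing everyone with everyone can be made concrete by counting pairs. The sketch below uses a common approximation that treats each pair as an independent 1/365 chance of a match; this is not exact, because the pairs overlap, but it lands close to the true value. The name `pair_approximation` is illustrative.

```python
from math import comb


def pair_approximation(n, days=365):
    """Approximate P(shared birthday), treating every pair as independent.

    This is only an approximation: pairs that share a person are not
    truly independent.
    """
    pairs = comb(n, 2)  # number of distinct pairs, n * (n - 1) / 2
    p_no_match_per_pair = (days - 1) / days
    return 1.0 - p_no_match_per_pair ** pairs


print(comb(23, 2), round(pair_approximation(23), 3))
print(comb(30, 2), round(pair_approximation(30), 3))
```

A group of 23 people already contains 253 pairs, which is why the chance of at least one match is about one half even though any single pair matches with probability of only about 0.3%.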

Applications of the birthday distribution

The birthday distribution has many applications. In cryptography, it underlies the birthday attack, which exploits the fact that collisions in a hash function's output become likely long before the number of hashed inputs approaches the number of possible hash values. In the design of computer networks, it relates to the probability of congestion or packet collision. It is also used in social science research to study patterns of group behavior and interaction.
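For the hash-collision case, a standard approximation says that in a space of N equally likely values, roughly sqrt(2 N ln(1 / (1 - p))) random items are needed before a collision occurs with probability p. The sketch below applies that approximation; it assumes an idealized hash whose outputs are uniformly distributed, and `items_for_collision` is a hypothetical helper name.

```python
import math


def items_for_collision(space_size, probability=0.5):
    """Approximate number of random items needed for a collision.

    Uses n ~ sqrt(2 * N * ln(1 / (1 - p))), which assumes uniformly
    distributed values and a space much larger than the item count.
    """
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return math.sqrt(2.0 * space_size * math.log(1.0 / (1.0 - probability)))


print(round(items_for_collision(365), 1))  # birthdays as a 365-value "hash"
print(round(items_for_collision(2 ** 32)))  # idealized 32-bit hash
```

With 365 values the estimate is about 22.5, consistent with the 23-person threshold above. For an idealized 32-bit hash it is on the order of 77,000 items, far fewer than the roughly 4.3 billion possible outputs.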


[1] Guillaume Jacquenot, CC BY-SA 3.0, via Wikimedia Commons
