
A joint probability distribution gives the probability of two events occurring together. In other words, it is the probability that event A and event B both happen at the same time.
In mathematical terminology, that can be written as:
p(A and B)
or in set notation:
p(A ∩ B)
Joint probability distribution examples
Example 1: You draw one card from a standard deck. What is the probability that the card is a three and red?
Solution: p(three and red) is 1/26.
Explanation: The probability that a card is a three and red, which we can write as p(three and red), is the probability of two events happening together:
- The card is a three. The probability of this event is the number of threes divided by the total number of cards. There are four threes in a deck of 52 cards, so the probability of the first event is 4/52.
- The card is red. The probability of this event is the number of red cards divided by the total number of cards. There are 26 red cards in a deck of 52 cards, so the probability of the second event is 26/52.
Because the events are independent (a card’s rank does not affect its color), the probability of both happening at the same time is the product of the individual probabilities: 4/52 × 26/52 = 1/26. You can check this by counting directly: there are two red threes among 52 cards, and 2/52 = 1/26.
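To make the counting concrete, here is a minimal Python sketch (the deck representation is just illustrative) that verifies the answer both by counting red threes directly and by the product rule for independent events:

```python
from fractions import Fraction

# Build a standard 52-card deck as (rank, color) pairs:
# 13 ranks, each appearing in two red and two black suits.
ranks = list(range(2, 11)) + ["J", "Q", "K", "A"]
deck = [(rank, color) for rank in ranks
        for color in ("red", "red", "black", "black")]

# Direct count: red threes out of 52 cards.
p_joint = Fraction(sum(1 for r, c in deck if r == 3 and c == "red"), len(deck))

# Product rule for independent events: p(three) * p(red).
p_three = Fraction(sum(1 for r, c in deck if r == 3), len(deck))    # 4/52
p_red = Fraction(sum(1 for r, c in deck if c == "red"), len(deck))  # 26/52

assert p_joint == p_three * p_red == Fraction(1, 26)
print(p_joint)  # 1/26
```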
You can also use a joint probability table, which lists p(X = x, Y = y) for every combination of values, to answer such questions. For example, to find the probability that Y = 2 and X = 3, locate the cell at the intersection of the Y = 2 row and the X = 3 column. In the example table, that entry is 1/6.
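Here is a sketch of the same lookup in Python. The table values below are invented for illustration; the only entry taken from the example is p(X = 3, Y = 2) = 1/6:

```python
from fractions import Fraction

# Hypothetical joint probability table p(X = x, Y = y); all entries
# are illustrative except (3, 2), which matches the stated answer.
joint = {
    (1, 1): Fraction(1, 6),  (2, 1): Fraction(1, 12), (3, 1): Fraction(1, 12),
    (1, 2): Fraction(1, 12), (2, 2): Fraction(1, 12), (3, 2): Fraction(1, 6),
    (1, 3): Fraction(1, 12), (2, 3): Fraction(1, 6),  (3, 3): Fraction(1, 12),
}
assert sum(joint.values()) == 1  # a valid joint distribution sums to 1

# "Find the intersection of Y = 2 and X = 3" is just a table lookup.
print(joint[(3, 2)])  # 1/6
```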
Why are joint probability distributions important?
Joint probability distributions are important because they allow statisticians to better understand the relationship between two random variables. A random variable is a numerical representation of the possible outcomes of a random experiment or process, often used to describe uncertain events in probability and statistics.
Joint probability distributions serve various purposes such as:
- Predicting one variable’s value based on another,
- Identifying dependencies between variables,
- Estimating a model’s parameters (the term “parameters” refers to the unknown values that define a model, such as the slope and intercept of a linear equation).
For example, knowing the joint probability distribution of height and weight in a population makes it possible to predict an individual’s weight from their height. Joint probability distributions also help determine whether variables are dependent or independent, which is crucial for making inferences about a population.
In addition to these applications, joint probability distributions enable the calculation of conditional probabilities and marginal probabilities. Conditional probability refers to the likelihood of an event occurring given that another event has already happened, while marginal probability is the likelihood of an event happening irrespective of the value of another event. For example, the probability of flipping a coin and getting heads is 50%. This is the marginal probability of getting heads, because it ignores the outcome of any other flip (the probability of getting heads after flipping tails is still 50%). Joint probability distributions allow statisticians to calculate both types of probabilities, offering a deeper understanding of the relationship between variables.
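As a small illustration of how marginal probabilities fall out of a joint distribution, the Python sketch below uses an invented joint table for two binary variables and sums over one variable to recover the marginal of the other:

```python
from fractions import Fraction

# Invented joint distribution of two binary variables X and Y.
joint = {(0, 0): Fraction(3, 8), (0, 1): Fraction(1, 8),
         (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8)}

# Marginal distribution of X: sum the joint probabilities over Y.
p_x = {x: sum(p for (xv, _), p in joint.items() if xv == x) for x in (0, 1)}
print(p_x)  # X is 0 with probability 1/2 and 1 with probability 1/2
```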
Lastly, joint probability distributions can help visualize the connection between variables through methods like heatmaps, contour plots, and 3D scatterplots, making the relationship more intuitive and easier to understand.

Overall, joint probability distributions serve as an important tool for statisticians, enabling them to analyze the association between two random variables, make predictions, identify dependencies, and estimate model parameters.
How is joint probability used?
Joint probability can be used in a variety of ways in statistical analysis. For example, it can be used to calculate the likelihood of two events occurring simultaneously. For independent events, this is known as the conjunction rule: the probability of two independent events occurring simultaneously is equal to the product of their individual probabilities. So, if event A has a probability of 0.2, event B has a probability of 0.5, and the two events are independent, then the probability of them occurring simultaneously is 0.2 × 0.5 = 0.1.
Another way that joint probability can be used in statistical analysis is to calculate the likelihood of one event occurring given that another event has already occurred. This is known as conditional probability. The formula for conditional probability is as follows:
P(A|B) = P(A and B)/P(B),
where
- P(A|B) represents the conditional probability of event A given that event B has already occurred,
- P(A and B) represents the joint probability of events A and B occurring simultaneously, and
- P(B) represents the marginal probability of event B occurring (i.e. the probability of event B regardless of whether or not event A occurs).
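As a short sketch of this formula in Python, reusing the invented binary joint table from the earlier marginal example (all numbers are illustrative):

```python
from fractions import Fraction

# Same invented joint table as above.
joint = {(0, 0): Fraction(3, 8), (0, 1): Fraction(1, 8),
         (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8)}

# Let B be the event X = 1 and A the event Y = 1.
p_b = sum(p for (x, _), p in joint.items() if x == 1)  # marginal P(B) = 1/2
p_a_and_b = joint[(1, 1)]                              # joint P(A and B) = 3/8
p_a_given_b = p_a_and_b / p_b                          # (3/8) / (1/2) = 3/4
print(p_a_given_b)  # 3/4
```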
Joint probability density and mass functions
The joint probability mass function (PMF) gives the probability of two discrete random variables simultaneously taking specific values. It can be represented as a table or formula, showing the probability for each possible combination of values of the two variables. For instance, the joint PMF of the number of heads and the number of tails in two coin tosses would give the probability of obtaining 0 heads and 2 tails, 1 head and 1 tail, 2 heads and 0 tails, and so on.
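A quick Python sketch can enumerate the four equally likely outcomes of two fair coin tosses and tabulate this joint PMF directly:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all outcomes of two fair coin tosses and tabulate the
# joint PMF of (number of heads, number of tails).
outcomes = list(product("HT", repeat=2))
counts = Counter((o.count("H"), o.count("T")) for o in outcomes)
pmf = {ht: Fraction(n, len(outcomes)) for ht, n in counts.items()}

# Probabilities: 1/4 for (2, 0), 1/2 for (1, 1), 1/4 for (0, 2);
# heads + tails always equals 2, so every other cell has probability 0.
print(pmf)
```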
The joint probability density function (PDF) plays the same role for two continuous random variables: integrating it over a range of values gives the probability that both variables fall within that range. For example, integrating the joint PDF of a population’s height and weight over the appropriate region gives the probability that an individual is between 5 feet and 5 feet 1 inch tall and weighs between 100 and 105 pounds.
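As a sketch of how such a probability is computed in practice, the snippet below assumes an illustrative bivariate normal model for height (in inches) and weight (in pounds); the mean and covariance are made up, not fitted to any data. It integrates the joint PDF over the rectangle by applying inclusion-exclusion to the joint CDF from scipy.stats.multivariate_normal:

```python
from scipy.stats import multivariate_normal

# Illustrative joint distribution of (height, weight): a bivariate
# normal with invented mean and covariance.
dist = multivariate_normal(mean=[66, 150], cov=[[9, 12], [12, 400]])

# P(60 < height < 61, 100 < weight < 105), i.e. between 5 ft and
# 5 ft 1 in tall and between 100 and 105 lb, via the joint CDF F:
# P = F(61, 105) - F(60, 105) - F(61, 100) + F(60, 100).
F = dist.cdf
p = F([61, 105]) - F([60, 105]) - F([61, 100]) + F([60, 100])
print(p)
```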
The primary distinction between the joint PMF and the joint PDF lies in their applicability: the joint PMF applies only to discrete random variables, while the joint PDF applies to continuous random variables.
Another key difference is that the joint PMF assigns a probability directly to each potential combination of values, whereas the joint PDF must be integrated to obtain the probability of a range of values; the density at a single point is not itself a probability.