Probability mass function (PMF)


A probability mass function (PMF) gives the probability that a discrete random variable takes a specific value. It assigns probabilities to discrete random variables, which are random variables that can take only a countable number of values, such as the result of a die roll, a number drawn out of a hat, or a score on a test. The “discrete” part means that the possible outcomes can be listed one by one. For example, you can only roll a 1, 2, 3, 4, 5, or 6 on a die.

The PMF of a random variable X is usually denoted P(X = x), where x is a possible value of X.

Occasionally, the PMF is called the discrete density function or frequency function.

Probability mass function definition

Formally, a probability mass function is the probability distribution of a discrete random variable; it gives the possible values and their associated probabilities.

It is the function p: ℝ → [0, 1] defined as [2]

p(x) = P(X = x).

Associated with each PMF is a cumulative distribution function (CDF), defined as the probability that the random variable takes a value less than or equal to a specified amount. The CDF associated with a PMF is a step function: it is flat between possible values and jumps at each possible value by the probability of that value.
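To make the step-function behaviour concrete, here is a minimal sketch that builds a CDF from a PMF by accumulating probabilities. The fair-die PMF and the helper name cdf_from_pmf are assumptions chosen for illustration; they are not part of any library.

```python
from fractions import Fraction
from itertools import accumulate

# Assumed example: PMF of a fair six-sided die (each face has probability 1/6).
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def cdf_from_pmf(pmf):
    """Return {x: P(X <= x)} by accumulating the PMF over the sorted values."""
    values = sorted(pmf)
    running_totals = accumulate(pmf[v] for v in values)
    return dict(zip(values, running_totals))

die_cdf = cdf_from_pmf(die_pmf)
print(die_cdf[3])  # P(X <= 3) = 1/2
print(die_cdf[6])  # P(X <= 6) = 1
```

Each entry of die_cdf is larger than the previous one by exactly the probability of that face, which is the jump in the step function.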

How does the PMF work?

The probability mass function gives the probability of each outcome represented by the random variable. It shows the probability of each specific value within the probability distribution. The probabilities always sum to 1 (100%).

The graph of a probability mass function. All the values must be non-negative and sum up to 1.
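The two conditions (non-negative values that sum to 1) can be checked mechanically. The sketch below assumes the PMF is stored as a Python dictionary mapping values to probabilities; the function name is_valid_pmf and the example dictionaries are hypothetical.

```python
import math

def is_valid_pmf(pmf, tol=1e-9):
    """Check the two PMF conditions: no negative values and a total of 1."""
    probabilities = list(pmf.values())
    non_negative = all(p >= 0 for p in probabilities)
    # Compare with a small tolerance because floats rarely sum to exactly 1.
    sums_to_one = math.isclose(sum(probabilities), 1.0, abs_tol=tol)
    return non_negative and sums_to_one

fair_coin = {"heads": 0.5, "tails": 0.5}
not_a_pmf = {0: 0.7, 1: 0.5}  # the probabilities total 1.2

print(is_valid_pmf(fair_coin))  # True
print(is_valid_pmf(not_a_pmf))  # False
```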

The PMF has multiple applications in statistics. One of the most common applications is in computing descriptive statistics such as mean, median, and mode. It is also useful in calculating conditional probabilities, which involve calculating the probability of one event occurring given that another event has already occurred. Additionally, it is used in the calculation of other statistical measures such as variance and standard deviation.
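As a rough illustration of these applications, the sketch below computes the mean, variance, standard deviation, and mode directly from a PMF. The PMF used (the number of heads in two fair coin flips) is an assumed example, and the variable names are illustrative only.

```python
import math
from fractions import Fraction

# Assumed example: number of heads in two fair coin flips.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# Mean: each value weighted by its probability.
mean = sum(x * p for x, p in pmf.items())

# Variance: probability-weighted squared distance from the mean.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

# Mode: the value with the largest probability.
mode = max(pmf, key=pmf.get)

print(mean, variance, mode)  # 1 1/2 1
```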

How to calculate the PMF

Unlike a probability density function (PDF), which is associated with continuous random variables, a PMF is associated with discrete random variables. While a PDF must be integrated over an interval to find a probability [1], probabilities from a PMF are found by summation. In particular, the sum of all the probabilities is 100% (i.e. 1 as a decimal):

Σ p(x) = 1.
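A short sketch of the summation idea: the probability of an event is the sum of the PMF over the values that belong to the event. The fair-die PMF and the helper event_probability are assumptions made for this example.

```python
from fractions import Fraction

# Assumed example: PMF of a fair six-sided die.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def event_probability(pmf, event):
    """Sum the PMF over every value x for which event(x) is true."""
    return sum(p for x, p in pmf.items() if event(x))

p_even = event_probability(die_pmf, lambda x: x % 2 == 0)
p_at_least_5 = event_probability(die_pmf, lambda x: x >= 5)
p_total = event_probability(die_pmf, lambda x: True)

print(p_even)        # 1/2
print(p_at_least_5)  # 1/3
print(p_total)       # 1
```

Summing over every possible value recovers the total of 1, while summing over a subset gives the probability of that subset.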

The formal definition is not very useful on its own; what’s more useful is an equation that tells you the probability of some individual event happening. For example:

P(X = 2) = 0.1 * 0.2.

How the PMF is calculated depends on the type of probability distribution the random variable follows.

For instance, if a random variable X follows a Bernoulli distribution, with a probability p of success, then the PMF can be defined by:

P(X = 1) = p and P(X = 0) = 1 – p

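The binomial distribution, with n independent trials and probability p of success on each trial, has the PMF

P(X = k) = C(n, k) p^k (1 – p)^(n – k), for k = 0, 1, ..., n,

where C(n, k) = n! / (k! (n – k)!) is the number of ways to choose k successes out of n trials.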
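The binomial PMF can be evaluated with the standard library alone. This is a minimal sketch assuming the usual parameterization (n trials, success probability p, k successes); the function name binomial_pmf is illustrative, and math.comb requires Python 3.8 or later.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials with success probability p.

    Standard form: C(n, k) * p**k * (1 - p)**(n - k).
    """
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of exactly 2 heads in 4 fair coin flips.
print(binomial_pmf(2, 4, 0.5))  # 0.375

# With a single trial (n = 1) the binomial PMF reduces to the Bernoulli PMF.
print(binomial_pmf(1, 1, 0.3), binomial_pmf(0, 1, 0.3))
```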

For a random variable that follows the Poisson distribution with a given parameter μ, the PMF can be written as:

P(X = k) = μ^k e^(–μ) / k!, for k = 0, 1, 2, ...
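The Poisson PMF can be sketched in the same way. The code below assumes the standard form with parameter μ (written mu in the code); the function name poisson_pmf and the chosen value of mu are illustrative.

```python
import math

def poisson_pmf(k, mu):
    """P(X = k) for a Poisson random variable with parameter mu.

    Standard form: mu**k * exp(-mu) / k!.
    """
    return mu**k * math.exp(-mu) / math.factorial(k)

mu = 3.0  # assumed parameter value for the example
for k in range(5):
    print(k, poisson_pmf(k, mu))

# The probabilities over k = 0, 1, 2, ... approach a total of 1.
print(sum(poisson_pmf(k, mu) for k in range(50)))
```

Because the Poisson distribution has infinitely many possible values, the sum is truncated at 50 terms here; the remaining tail is negligible for this value of mu.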

In short, the probability mass function is an essential tool in probability and statistics for describing the likelihood of the different outcomes of a discrete random variable. It underlies the calculation of measures of central tendency, including the mean, median, and mode, as well as measures of spread such as the variance and standard deviation.

Relationship to the histogram

A histogram can be used to graph a PMF. Probabilities are plotted on the y-axis and the values of the discrete random variable on the x-axis.

A histogram plotting a probability mass function.

The above histogram shows that the random variable took the value 0 in 27% of cases, while the value 3 occurred just twice.
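The link between observed counts and a PMF can be sketched by turning a sample into relative frequencies, which are the bar heights of such a histogram. The sample below is made-up data for illustration only and is not the data behind the histogram above.

```python
from collections import Counter

# Hypothetical sample of observed values (not taken from the figure).
sample = [0, 0, 1, 2, 0, 1, 3, 1, 0, 2]

counts = Counter(sample)
n = len(sample)

# Empirical PMF: the relative frequency of each observed value.
empirical_pmf = {value: count / n for value, count in sorted(counts.items())}

for value, probability in empirical_pmf.items():
    print(value, probability)
```

Dividing each count by the sample size guarantees that the bar heights are non-negative and sum to 1, as a PMF requires.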

References

[1] Dekking, Michel (2005). A Modern Introduction to Probability and Statistics: Understanding Why and How. London: Springer. ISBN 978-1-85233-896-1. OCLC 262680588.

[2] 2.4 PROBABILITY MASS FUNCTION Retrieved April 10, 2023 from: https://web.mit.edu/urban_or_book/www/book/chapter2/2.4.html
