What is a probability density function?
A probability density function (PDF), also called a probability density or a probability function, describes the probability distribution of a continuous random variable. It can be used to find the probability that the variable takes a value within a given range.
A continuous random variable (such as one describing height, temperature, or weight) can take values in either:
- One interval on the number line, such as from X = 1 to X = 2, or
- A union of disjoint intervals (“disjoint” means no points in common). For example, X = 1 to X = 2 and X = 5 to X = 6.
Probability density function examples
Suppose that the following graph represents the probability distribution of depths in a local sea, and you want to know the probability that a randomly dropped weight will fall to a depth between 40 and 60 feet. To find the probability, calculate the area under the graph between 40 and 60.
In this example, the area is about 0.5, which means that the probability the weight will fall to a depth between 40 and 60 feet is about 50%.
In practical terms, PDFs help us to quantify how likely something is to happen given certain conditions. As an example, if we had data on precipitation and wanted to know when it was more likely to rain, we could fit the data to a PDF to calculate those probabilities. It isn’t always obvious which probability distribution gives us the best fit, so we often choose two or more distributions to overlay on the data and see which one fits best.
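One way to compare candidate distributions is sketched below. It fits an exponential and a normal distribution to a small data set by maximum likelihood and compares their log-likelihoods (higher means a better fit). The data values and the names `rain_mm`, `exponential_log_likelihood` and `normal_log_likelihood` are made up for illustration, and log-likelihood is only one of several possible fit criteria.

```python
import math

# Hypothetical daily precipitation totals in mm (made-up values for illustration).
rain_mm = [0.1, 0.2, 0.3, 0.5, 0.8, 1.2, 2.0, 3.5, 6.0, 10.0]


def exponential_log_likelihood(data):
    """Log-likelihood under an exponential PDF fitted by maximum likelihood."""
    rate = len(data) / sum(data)  # MLE of the rate is 1 / sample mean
    # Exponential PDF: f(x) = rate * exp(-rate * x) for x >= 0
    return sum(math.log(rate) - rate * x for x in data)


def normal_log_likelihood(data):
    """Log-likelihood under a normal PDF fitted by maximum likelihood."""
    n = len(data)
    mean = sum(data) / n
    variance = sum((x - mean) ** 2 for x in data) / n  # MLE uses n, not n - 1
    return sum(
        -0.5 * math.log(2 * math.pi * variance) - (x - mean) ** 2 / (2 * variance)
        for x in data
    )


ll_exponential = exponential_log_likelihood(rain_mm)
ll_normal = normal_log_likelihood(rain_mm)
best_fit = "exponential" if ll_exponential > ll_normal else "normal"
print(f"exponential: {ll_exponential:.2f}, normal: {ll_normal:.2f}, best: {best_fit}")
```

For this strongly right-skewed sample, the exponential distribution has the higher log-likelihood, so it would be preferred over the normal distribution under this criterion.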
How Does It Work?
A PDF is a function whose integral (the area under the curve) over an interval gives the probability that a value falls in that interval. For example, if IQ scores follow a known PDF, you can integrate it from 120 to 140 to find the probability that the next IQ score you measure will fall between those two numbers. The larger the integral over an interval, the greater the chance that the measured score falls within that range.
In elementary statistics, we don’t usually integrate by hand, as this involves calculus. Thankfully, most common distributions, such as the normal distribution, gamma distribution and cosine distribution, have known formulas for their PDFs, and probabilities over a range can be read from tables or computed with software.
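The IQ example can be sketched numerically. The code below assumes, purely for illustration, that IQ scores follow a normal distribution with mean 100 and standard deviation 15 (a common convention, not something stated above). It approximates the area under the PDF between 120 and 140 with the trapezoid rule and compares it with the closed-form answer from the error function.

```python
import math

IQ_MEAN = 100.0  # assumed mean, for illustration only
IQ_SD = 15.0     # assumed standard deviation, for illustration only


def normal_pdf(x, mean, sd):
    """Height of the normal density curve at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def area_under_pdf(lower, upper, steps=10_000):
    """Approximate the integral of the PDF from lower to upper (trapezoid rule)."""
    width = (upper - lower) / steps
    total = 0.5 * (normal_pdf(lower, IQ_MEAN, IQ_SD) + normal_pdf(upper, IQ_MEAN, IQ_SD))
    for i in range(1, steps):
        total += normal_pdf(lower + i * width, IQ_MEAN, IQ_SD)
    return total * width


probability = area_under_pdf(120, 140)

# Closed-form check using the normal CDF written in terms of math.erf.
scale = IQ_SD * math.sqrt(2)
exact = 0.5 * (math.erf((140 - IQ_MEAN) / scale) - math.erf((120 - IQ_MEAN) / scale))

print(f"numerical: {probability:.4f}, closed form: {exact:.4f}")
```

Under these assumptions the area comes out a little under 0.09, so a score between 120 and 140 would be expected slightly less than 9% of the time.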
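As an illustration of such a known formula, the PDF of the normal distribution with mean \mu and standard deviation \sigma is shown below, together with the general relationship between a PDF f and the probability over a range from a to b. The symbols a, b, \mu and \sigma are notation introduced here for the example.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
\qquad
P(a \le X \le b) = \int_a^b f(x)\,dx
```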
Probability Density Function vs. Probability Mass Function
A PDF defines the probability that a continuous random variable lands within a range of values, while a probability mass function (PMF) gives the probability of each single value of a discrete random variable.
If a random variable can only have certain values (such as drawing cards from a standard deck), a PMF describes the probabilities of the outcomes. On the other hand, you would use a PDF for continuous random variables that are not restricted to a set range of distinct values: they can take on any number — including decimals and fractions — within a range. Some examples include weight, height and time.
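A PMF can be written out value by value, which is not possible for a continuous variable. The sketch below builds the PMF for the suit of one card drawn at random from a standard 52-card deck. The names `suit_counts` and `suit_pmf` are chosen here for illustration.

```python
from fractions import Fraction

# A standard 52-card deck has 13 cards in each of 4 suits.
suit_counts = {"clubs": 13, "diamonds": 13, "hearts": 13, "spades": 13}
deck_size = sum(suit_counts.values())

# The PMF assigns a probability to every distinct outcome.
suit_pmf = {suit: Fraction(count, deck_size) for suit, count in suit_counts.items()}

print(suit_pmf["hearts"])       # probability of drawing a heart
print(sum(suit_pmf.values()))   # the probabilities of all outcomes add up to 1
```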
If you have continuous variables, you can’t write out every possible value because there are infinitely many of them. This fact is important because it tells us that the probability a continuous random variable takes on any specific value of x is zero: a single point has no width, so the area under the curve above it is zero. In other words, P(X = x) is always 0 for a continuous random variable, so it tells us nothing useful. What you can do is create a PDF: a formula that describes probabilities over ranges of values.
A common confusion happens because although the PDF is defined for a range of values and the PMF is defined for distinct values, a calculator for a PDF will return a number for a single value as well. For example, you might type in X = 5 and get an answer of 0.02345. That number is not the probability that X is exactly 5. It is either the density at that point (the height of the curve, which becomes a probability only when multiplied by the width of a range) or the probability for a very tiny range around that number, say from X = 4.99999 to X = 5.00001.
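The relationship between a single-point density and the probability over a tiny range can be checked numerically. The sketch below uses Python's `statistics.NormalDist` with a hypothetical mean of 4 and standard deviation of 2 (values chosen only for illustration). It compares the exact probability of a shrinking range around X = 5 with the density at 5 multiplied by the width of the range.

```python
from statistics import NormalDist

dist = NormalDist(mu=4.0, sigma=2.0)  # hypothetical distribution for illustration
point = 5.0
density = dist.pdf(point)  # height of the curve at X = 5, not a probability

results = []
for half_width in (0.5, 0.05, 0.00001):
    # Exact probability that X falls within the range around the point.
    probability = dist.cdf(point + half_width) - dist.cdf(point - half_width)
    # Approximation: rectangle with the density as height and the range as width.
    approximation = density * 2 * half_width
    results.append((half_width, probability, approximation))
    print(f"±{half_width}: probability={probability:.8f}, density*width={approximation:.8f}")
```

As the range narrows, the two numbers agree more and more closely, and both shrink toward zero, which is consistent with P(X = x) being zero at any single point.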
PDF vs CDF
The area under the PDF gives us the probability that a random variable falls within a specific range of values, while the cumulative distribution function (CDF) gives the probability that the random variable is less than or equal to a specific value.
For example, we could use the PDF to tell us the probability that a baby will weigh between 8 and 9 pounds. We can use the CDF to tell us the probability that a baby will weigh 8 pounds or less and, by subtracting that result from 1, the probability that the baby will weigh over 8 pounds.
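These three calculations can be sketched with `statistics.NormalDist` from the Python standard library. The mean of 7.5 pounds and standard deviation of 1.25 pounds are hypothetical values picked for illustration, not real birth-weight statistics.

```python
from statistics import NormalDist

# Hypothetical birth-weight distribution in pounds (illustrative parameters only).
birth_weight = NormalDist(mu=7.5, sigma=1.25)

# CDF: probability of weighing 8 pounds or less.
p_below_8 = birth_weight.cdf(8)

# Complement of the CDF: probability of weighing more than 8 pounds.
p_above_8 = 1 - p_below_8

# Area under the PDF between 8 and 9, computed as a difference of two CDF values.
p_between_8_and_9 = birth_weight.cdf(9) - birth_weight.cdf(8)

print(f"P(X <= 8) = {p_below_8:.4f}")
print(f"P(X > 8) = {p_above_8:.4f}")
print(f"P(8 < X <= 9) = {p_between_8_and_9:.4f}")
```

Writing the range probability as a difference of CDF values avoids integrating the PDF directly, which is how tables and most software handle it.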