Cumulative distribution function (CDF)

The cumulative distribution function (CDF) of a random variable X gives, for each value x, the probability that X takes a value less than or equal to x. It is one way to describe the distribution of continuous random variables, and can be thought of as an extension of a cumulative frequency table, which tallies discrete counts. One advantage of the CDF is that it can be defined for every type of random variable: discrete, continuous, or mixed.

The CDF can be used to calculate the probability of a given event occurring, and it is often used to analyze the behavior of random variables.

What Is the Cumulative Distribution Function?

Cumulative distribution functions for various normal distributions. The horizontal axis is the domain of the probability function. The vertical axis represents a probability, so it must fall between zero and one.

The cumulative distribution function (CDF), also called the distribution function, gives the cumulative (additive) probability that a random variable X has reached by a given value x: F(x) = P(X ≤ x). The formula differs depending on the type of random variable.

For example, the CDF of a continuous random variable is given by the integral:

$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$

where f(t) is the probability density function evaluated at point t on the x-axis.
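As a rough illustration of the CDF as an integral of the density, the sketch below approximates F(1) for a standard normal distribution with the trapezoidal rule and compares it with the closed-form value from Python's standard library. The helper name `cdf_by_integration`, the cutoff of -10 standing in for minus infinity, and the step count are assumptions chosen for this example.

```python
from statistics import NormalDist


def cdf_by_integration(dist, x, lower=-10.0, steps=20_000):
    """Approximate F(x) by integrating the density f(t) from `lower` to x.

    `lower` is an assumed stand-in for minus infinity; for a standard
    normal distribution the density below -10 is negligible.
    """
    if x <= lower:
        return 0.0
    width = (x - lower) / steps
    # Trapezoidal rule: half weight on the two end points, full weight inside.
    total = 0.5 * (dist.pdf(lower) + dist.pdf(x))
    for i in range(1, steps):
        total += dist.pdf(lower + i * width)
    return total * width


standard_normal = NormalDist(mu=0.0, sigma=1.0)
approx = cdf_by_integration(standard_normal, 1.0)
exact = standard_normal.cdf(1.0)

print(f"numerical integral of the density: {approx:.6f}")
print(f"closed-form CDF:                   {exact:.6f}")
```

The two printed values should agree to several decimal places, which is the sense in which the CDF is the accumulated area under the density curve.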

You can use the CDF to figure out probabilities above a certain value, below a certain value, or between two values.

This means that for any given value of x on the x-axis, F(x) gives us the total area under the curve up to that point on the graph. In other words, it tells us how much “probability mass” has been collected up to that point on the x-axis.
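The three kinds of probability mentioned above (below a value, above a value, and between two values) can each be written in terms of F. The sketch below uses a hypothetical variable modelled as normal with mean 100 and standard deviation 15; those numbers are assumptions for illustration only.

```python
from statistics import NormalDist

# Hypothetical model: a quantity that is normal with mean 100 and sd 15.
scores = NormalDist(mu=100, sigma=15)

# Below a value: P(X <= 85) is the CDF itself.
p_below = scores.cdf(85)

# Above a value: P(X > 115) is one minus the CDF.
p_above = 1 - scores.cdf(115)

# Between two values: P(85 < X <= 115) is a difference of two CDF values.
p_between = scores.cdf(115) - scores.cdf(85)

print(f"P(X <= 85)       = {p_below:.4f}")
print(f"P(X > 115)       = {p_above:.4f}")
print(f"P(85 < X <= 115) = {p_between:.4f}")
```

Because the three regions cover the whole axis without overlapping, the three probabilities add up to 1.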

The CDF is an extension of the cumulative frequency table: instead of discrete counts up to a certain number, it accumulates probability over what may be an infinite number of possible values.
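To make the link with cumulative frequency tables concrete, here is a minimal sketch of an empirical CDF: the cumulative count of observations at or below x, divided by the sample size. The function name `empirical_cdf` and the data are made up for this example.

```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return a function F where F(x) is the fraction of the sample <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def F(x):
        # bisect_right counts how many sorted observations are <= x,
        # which is exactly the cumulative frequency at x.
        return bisect_right(ordered, x) / n

    return F


observations = [2, 3, 3, 5, 7, 7, 7, 9]  # made-up data
F = empirical_cdf(observations)

for x in (1, 3, 7, 10):
    print(f"F({x}) = {F(x)}")
```

Dividing the cumulative counts by the sample size rescales the table so that it runs from 0 to 1, like any other CDF.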

Why Is the Cumulative Distribution Function Important?

Cumulative distribution functions are used to analyze data in many fields, such as economics, finance, and engineering. They can also be used to estimate the probability of future outcomes from past data. For example, if you wanted to estimate how likely your business is to succeed over five years based on market trends from past years, you could fit a distribution to your company's historical data and use its CDF to read off the probability of the outcome you define as success.

Exceedance distribution function (EDF)

The exceedance distribution tells us how often we can expect rare events.

The exceedance distribution function (EDF) is defined as 1 minus the cumulative distribution function [1]:

EDF = 1 – CDF

The EDF, like the CDF, is bounded between 0 and 1. The CDF tells us what fraction of events are at or below a specified value. For example, if the CDF at a value of P = 100 m is 0.4, then 40% of the values are at or below 100 m; the EDF is 1 – 0.4 = 0.6, so 60% of values are above 100 m. Because of this complementary relationship, the EDF is sometimes called the complementary (cumulative) distribution function.
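The relationship EDF = 1 – CDF can be sketched in a few lines. The example below assumes an exponential distribution with a mean of 200, a choice made only so that the numbers land near the 0.4 and 0.6 of the example above; the function names are hypothetical.

```python
import math


def exponential_cdf(x, mean):
    """P(X <= x) for an exponential distribution with the given mean."""
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-x / mean)


def exceedance(x, mean):
    """P(X > x), computed as one minus the CDF."""
    return 1.0 - exponential_cdf(x, mean)


cdf_at_100 = exponential_cdf(100.0, mean=200.0)
edf_at_100 = exceedance(100.0, mean=200.0)

print(f"CDF at 100: {cdf_at_100:.3f}")  # fraction of values at or below 100
print(f"EDF at 100: {edf_at_100:.3f}")  # fraction of values above 100
```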

Heavy tails in an exceedance distribution tell us that extreme events occur more often; a given extreme value therefore sits at a lower quantile than it would in a light-tailed distribution [2].
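One way to see the effect of tail weight is to compare the exceedance probabilities of a light-tailed and a heavy-tailed distribution at the same values. The sketch below uses the standard normal and the standard Cauchy as stand-ins; that pairing is an assumption made for illustration, and the two are not matched in scale, so only the qualitative pattern matters.

```python
import math


def normal_exceedance(x):
    """P(X > x) for a standard normal (light-tailed) distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def cauchy_exceedance(x):
    """P(X > x) for a standard Cauchy (heavy-tailed) distribution."""
    return 0.5 - math.atan(x) / math.pi


for x in (2.0, 4.0, 6.0):
    light = normal_exceedance(x)
    heavy = cauchy_exceedance(x)
    # The quantile of x is its CDF value, i.e. one minus the exceedance.
    print(f"x = {x}: normal EDF = {light:.2e} (quantile {1 - light:.6f}), "
          f"Cauchy EDF = {heavy:.2e} (quantile {1 - heavy:.6f})")
```

At each value the heavy-tailed exceedance probability is larger, so the same value corresponds to a lower quantile under the heavy-tailed model.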

Sometimes the exceedance curve – the graph of a continuous probability exceedance distribution – is called a risk curve or expectation curve.

Exceedance Distribution Formula

Various formulas have been proposed in the literature.

For example, Sarkadi [3] provided the following formula for the “number of exceedances”,

Where

  • n and N are the sizes of two independent samples
  • ξ is the number of exceedances, that is, the number of elements of the sample of size N which surpass (are larger than) at least n − m + 1 elements of the sample of size n (1 ≤ m ≤ n).

Several other authors reference the same formula (e.g., [4]). These formulas have largely been made obsolete by the availability of computer software.
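As an example of what computer software can do in place of a closed-form expression, the sketch below estimates the distribution of the number of exceedances by simulation. It is not Sarkadi's formula. It assumes the reading that an element of the second sample (size N) counts as an exceedance when it is larger than at least n − m + 1 elements of the first sample (size n), which for continuous data means it is larger than the m-th largest element of the first sample. The function names, sample sizes, uniform data, and seed are all choices made for this illustration.

```python
import random


def count_exceedances(first, second, m):
    """Count elements of `second` larger than the m-th largest of `first`."""
    n = len(first)
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= n")
    # In ascending order, the m-th largest element sits at index n - m.
    threshold = sorted(first)[n - m]
    return sum(1 for y in second if y > threshold)


def simulate(n, N, m, trials=20_000, seed=0):
    """Estimate P(xi = x) for x = 0..N by repeated sampling."""
    rng = random.Random(seed)
    counts = [0] * (N + 1)
    for _ in range(trials):
        first = [rng.random() for _ in range(n)]
        second = [rng.random() for _ in range(N)]
        counts[count_exceedances(first, second, m)] += 1
    return [c / trials for c in counts]


probs = simulate(n=10, N=20, m=3)
mean_exceedances = sum(x * p for x, p in enumerate(probs))

print(f"estimated mean number of exceedances: {mean_exceedances:.2f}")
print(f"estimated P(no exceedances):          {probs[0]:.3f}")
```

Under these assumptions the estimated mean should come out close to N·m/(n + 1), which is about 5.45 for the values used here. Uniform data are used only for convenience, since the count depends on the ranks of the observations rather than on their actual values.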

References

[1] Katul, G. Intensity-Duration-Frequency Analysis. Online: https://nicholas.duke.edu/people/faculty/katul/ENV234_Lecture_7.pdf

[2] do Nascimento & Pereira (2017). A Bayesian approach to extended models for exceedance. Brazilian Journal of Probability and Statistics. Vol. 31, No. 4, 801-820.

[3] Sarkadi, K. (1957). On the Distribution of the Number of Exceedances. The Annals of Mathematical Statistics. Vol. 28, No. 4, p. 1021.

[4] Haight, F. (1958). Index to the Distributions of Mathematical Statistics. National Bureau of Standards Report.

[5] Lundberg, O. (1940). On random processes and their applications to sickness and accident statistics. Almqvist, Uppsala.
