

## What is a Triangular Distribution?

The **triangular distribution**, so named because of its triangular shape, is typically used when not much is known about the distribution of data, but the minimum, mode, and maximum can be estimated [1]. It is also used in discrete-event and Monte Carlo simulation to model randomness.

A triangular distribution (or *triangle* distribution) is a continuous probability distribution defined by three parameters:

- *a*: the **minimum or lower limit** (*a* ≤ *c*),
- *c*: the **mode (location of the peak)** (*a* ≤ *c* ≤ *b*),
- *b*: the **maximum or upper limit** (*b* ≥ *c*).

When the mode *c* lies exactly midway between *a* and *b*, the distribution is a special case called a **symmetric triangular distribution**. A common example has *a* and *b* equal but opposite in sign (e.g., -1, 1) with the mode at 0.

The parameters *a*, *b*, and *c* determine the triangle’s shape, and they can be estimated from sample data:

- *a*: Use the sample minimum.
- *b*: Use the sample maximum.
- *c*: Use the sample mean, mode or median.
  - The mode is the value that appears most often in the sample data. This is often a “best guess.” With samples, it is necessary to use a histogram to estimate the mode of the underlying PDF, which can be tricky [1].
  - The median is the value that divides the sample data into two equal halves: one on the right and one on the left.
  - The mean is the average of the sample data.

You’ll want to remove or guard against outliers because they can skew the parameter estimates. For example, a single unusually small value will make the estimate for parameter *a* too small.
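The sample-based estimates described above can be sketched in a few lines of Python. This is a minimal illustration, not a recommended estimator: the function name `estimate_triangular`, the bin count, and the sample values are all made up for the example, and the mode is taken as the midpoint of the fullest histogram bin.

```python
from statistics import median

def estimate_triangular(sample, bins=10):
    """Rough estimates of a, b and two candidate values for c (hypothetical helper)."""
    a, b = min(sample), max(sample)
    if a == b:
        raise ValueError("sample needs at least two distinct values")
    width = (b - a) / bins
    counts = [0] * bins
    for x in sample:
        # Clamp the index so that x == b falls into the last bin.
        i = min(int((x - a) / width), bins - 1)
        counts[i] += 1
    peak = counts.index(max(counts))
    c_mode = a + (peak + 0.5) * width  # midpoint of the fullest bin
    return {"a": a, "b": b, "c_mode": c_mode, "c_median": median(sample)}

# Illustrative data only.
sample = [2.1, 3.4, 3.9, 4.2, 4.4, 4.5, 4.7, 5.0, 5.8, 7.3]
est = estimate_triangular(sample, bins=5)
print(est)
```

The histogram-based mode depends heavily on the number of bins, which is one reason the mode is described as tricky to estimate from samples.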

The parameters can also be estimated by expert knowledge of likely values. For example, literature in your field might yield some estimated values.

Like all probability distributions, the total probability (aka *the area under the curve*) equals 100% (1.0). This means that wider ranges will have shorter peaks and more compact ranges will have higher peaks.
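The trade-off between width and peak height follows from the triangle-area formula. In the short derivation below, *h* is notation introduced here for the height of the PDF at the mode:

```latex
\tfrac{1}{2}\,(b - a)\,h = 1
\quad\Longrightarrow\quad
h = f(c) = \frac{2}{b - a}
```

For instance, a distribution on [0, 4] peaks at a height of 0.5, while one on [0, 2] peaks at 1, regardless of where the mode sits inside the interval.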

## Properties and relationship to the two-sided power distribution

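The triangular distribution is a special case of the *two-sided power (TSP) distribution* when *n* = 2. The probability density function (PDF) of the TSP distribution is

$$
f(x) =
\begin{cases}
\dfrac{n}{b-a}\left(\dfrac{x-a}{c-a}\right)^{n-1}, & a \le x \le c, \\[2ex]
\dfrac{n}{b-a}\left(\dfrac{b-x}{b-c}\right)^{n-1}, & c < x \le b, \\[2ex]
0, & \text{otherwise.}
\end{cases}
$$

When *n* = 2, the PDF becomes:

$$
f(x) =
\begin{cases}
\dfrac{2(x-a)}{(b-a)(c-a)}, & a \le x \le c, \\[2ex]
\dfrac{2(b-x)}{(b-a)(b-c)}, & c < x \le b, \\[2ex]
0, & \text{otherwise.}
\end{cases}
$$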
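As a quick check of the *n* = 2 relationship, the sketch below codes the standard TSP density, which is (n/(b−a))·((x−a)/(c−a))^(n−1) to the left of the mode and (n/(b−a))·((b−x)/(b−c))^(n−1) to the right, next to the usual triangular density. The function names and the test parameters are illustrative choices, and the code assumes *a* < *c* < *b*.

```python
def tsp_pdf(x, a, c, b, n):
    """Two-sided power density; assumes a < c < b."""
    if x < a or x > b:
        return 0.0
    if x <= c:
        return n / (b - a) * ((x - a) / (c - a)) ** (n - 1)
    return n / (b - a) * ((b - x) / (b - c)) ** (n - 1)

def triangular_pdf(x, a, c, b):
    """Triangular density written out directly; assumes a < c < b."""
    if x < a or x > b:
        return 0.0
    if x <= c:
        return 2 * (x - a) / ((b - a) * (c - a))
    return 2 * (b - x) / ((b - a) * (b - c))

# With n = 2 the two densities agree at every test point.
for x in (250, 450, 550, 700):
    diff = tsp_pdf(x, 200, 550, 900, 2) - triangular_pdf(x, 200, 550, 900)
    assert abs(diff) < 1e-12
```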

The **mean** for the triangular distribution is:

μ = 1/3 (*a* + *b* + *c*).

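The **standard deviation**, *s*, of a symmetric triangular distribution is:

*s* = (1/√6) *a*,

where *a* here denotes the distance from the center to each endpoint, provided:

- The distribution is centered at zero,
- Endpoints (the boundary of the closed interval [*a*, *b*]) are known.

Support: *a* ≤ *x* ≤ *b*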
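The mean and standard deviation can be computed for any valid (*a*, *c*, *b*). The sketch below uses the general variance formula (a² + b² + c² − ab − ac − bc)/18, a standard result that is not stated in the text above, and checks that it reduces to the 1/√6 expression in the symmetric, zero-centered case. The function names and the half-width `h` are illustrative.

```python
from math import sqrt

def triangular_mean(a, c, b):
    """Mean of the triangular distribution: (a + b + c) / 3."""
    return (a + b + c) / 3

def triangular_sd(a, c, b):
    """Standard deviation from the general variance formula (assumed standard result)."""
    var = (a * a + b * b + c * c - a * b - a * c - b * c) / 18
    return sqrt(var)

# Symmetric case centered at zero: endpoints -h and h, mode at 0.
h = 2.0
sym_sd = triangular_sd(-h, 0.0, h)
assert abs(sym_sd - h / sqrt(6)) < 1e-12
assert triangular_mean(-h, 0.0, h) == 0.0
```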

## Use Cases

In real life, this distribution can be used to model outcomes when only the minimum, maximum, and most likely values can be estimated, even if the mean and standard deviation are unknown. It can model skewed distributions (when the mode is off-center), and it also approximates triangle-shaped discrete distributions such as the sum of two dice, which is symmetric: the minimum roll (*a*) is 2, the maximum (*b*) is 12 and the peak (*c*) is at 7.
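The dice example can be made concrete by enumerating all 36 equally likely outcomes. This is only an illustration of the triangle shape: the dice sum is discrete, so the continuous triangular distribution is an approximation to it rather than an exact model.

```python
from collections import Counter
from itertools import product

# Count how many of the 36 outcomes produce each total.
counts = Counter(d1 + d2 for d1, d2 in product(range(1, 7), repeat=2))

for total in range(2, 13):
    prob = counts[total] / 36
    print(f"{total:2d}  {prob:.4f}  {'#' * counts[total]}")
```

The bar lengths rise by one from a total of 2 up to 7 and then fall by one down to 12, which is the triangular outline described above.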

**Example question**:

Voting for an election has closed but the votes have not been counted. One candidate wants to find the probability that they received *fewer than* 450 votes. In other words, they want to find P(*X* < 450), where *X* is their vote count. They guess that the minimum number of votes they received is 200, the maximum is 900 and the most likely scenario (the peak) is 550.

- Draw a triangular distribution with min = 200, max = 900 and peak at 550. Cut the distribution into two segments at 450 votes.
- Calculate the area of the left segment, which is itself a triangle: the area of a triangle is 1/2 × base × height.
  - base = 450 - 200 = 250.
  - height = f(450). Since 450 is between a = 200 and c = 550, this gives:

$$
f(450) = \frac{2(450 - 200)}{(900 - 200)(550 - 200)} = \frac{500}{245{,}000} \approx 0.0020408
$$

This means the area is 1/2 × base × height = 1/2 × 250 × 0.0020408 = 0.2551, or 25.51%.
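The same answer can be obtained from the cumulative distribution function, which for a ≤ x ≤ c is (x − a)² / ((b − a)(c − a)). The sketch below is a hand-written helper (the name `triangular_cdf` is illustrative, not a library function) and assumes *a* < *c* < *b*.

```python
def triangular_cdf(x, a, c, b):
    """P(X <= x) for a triangular distribution; assumes a < c < b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= c:
        return (x - a) ** 2 / ((b - a) * (c - a))
    return 1.0 - (b - x) ** 2 / ((b - a) * (b - c))

p = triangular_cdf(450, a=200, c=550, b=900)
print(round(p, 4))  # 0.2551
```

Because the left segment is a triangle, the CDF expression is just 1/2 × base × height with the height written out, so the two approaches agree.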

## Semi Triangular Distribution

Haight’s 1958 *Index to the Distributions of Mathematical Statistics* [2] lists the semi-triangular distribution (p. 106). Consistent with the shape and the moments given below, its density can be written as

$$
f(x) = \frac{2(a - x)}{a^{2}}
$$

for 0 ≤ *x* ≤ *a* [3].

With grouping corrections:

The formula originated in an article by Kupperman titled **On Exact Grouping Corrections to Moments and Cumulants** (pp. 429-434). Kupperman considered the effect of grouping on the mean and variance of a new twist on the rectangular distribution that he dubbed the “semi-triangular distribution”. This new distribution has a frequency curve shaped like the right half of the “regular” triangular distribution’s frequency curve.

Kupperman gave the following properties for the semi-triangular distribution:

- Mean = (1/3)*a*.
- Variance = (1/18)*a*².
- Second moment about the origin = (1/6)*a*².
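These properties can be checked numerically. The sketch below assumes the semi-triangular density is f(x) = 2(a − x)/a² on [0, *a*] (the right half of a triangle with its peak at 0, which is an assumption consistent with the moments listed above) and approximates the moments with a simple midpoint rule. The function names, the value of `a`, and the step count are arbitrary illustrative choices.

```python
def semi_triangular_pdf(x, a):
    """Assumed density: decreasing straight line from 2/a at x = 0 to 0 at x = a."""
    return 2 * (a - x) / a**2 if 0 <= x <= a else 0.0

def moment(k, a, steps=10_000):
    """Midpoint-rule approximation of E[X^k] over [0, a]."""
    dx = a / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += x**k * semi_triangular_pdf(x, a) * dx
    return total

a = 3.0
mean = moment(1, a)              # expected: a / 3
second = moment(2, a)            # expected: a^2 / 6
variance = second - mean**2      # expected: a^2 / 18

print(mean, second, variance)
```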

## The Rarity of the Semi-Triangular Distribution

Following Kupperman’s publication in Biometrika, the distribution appears only sparsely in the literature. Entries tend to be of a **bibliographical nature** rather than a discussion of the formula or properties. For example, Kupperman’s article is mentioned in another publication by the National Bureau of Standards in 1970 [4].

Kupperman noted that grouping the range into a number of equal intervals “overstates the mean and understates the variance”, which may be one reason why **the distribution never took off in a practical sense.** Another reason may be that the “full” triangular distribution, from which the semi-triangular distribution was developed, has limited practical use: it is usually used when little is known about the outcomes except the minimum, the maximum, and the most likely value (which creates the peak of the distribution).

## References:

Image: Dr Nicola Ward Petty and Dr Shane Dye of Statistics Learning Centre: StatsLC.com

[1] Triangular, gamma, Erlang, Weibull distributions

[2] Haight, F. (1958). Index to the Distributions of Mathematical Statistics. National Bureau of Standards Report.

[3] Kupperman, M. On Exact Grouping Corrections to Moments and Cumulants. Biometrika 39, pp. 429-434.

[4] National Bureau of Standards. An Author and Permuted Title Index to Selected Statistical Journals (1970). Special Publication 321. Online: https://books.google.com/books?id=uyIBpRKalmsC