Multivariate distribution

Multivariate distributions describe two or more random variables jointly, which makes it possible to compare measurements and study the relationships between them. Many univariate distributions, which involve a single random variable, have a multivariate generalization.

The multivariate normal distribution.

The multivariate normal distribution is the most commonly used model for analyzing multivariate data [1], though one can also work with the multivariate lognormal distribution, the multivariate binomial distribution, and more. The simplest case is the bivariate distribution, which describes one pair of random variables. In principle, a multivariate distribution can involve any number of random variables, and results obtained for the bivariate case can generally be extended to n random variables.
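
As a minimal sketch of the bivariate case, the snippet below draws samples from a bivariate normal distribution with NumPy and checks that the sample mean vector and sample variance-covariance matrix land close to the values used to generate the data. The mean vector, covariance matrix, seed, and sample size are assumptions chosen only for illustration; they do not come from the references.

```python
import numpy as np

# Hypothetical parameters, chosen only for illustration.
mean = np.array([0.0, 2.0])        # one mean per random variable
cov = np.array([[1.0, 0.8],
                [0.8, 2.0]])       # variances on the diagonal, covariance off it

rng = np.random.default_rng(seed=0)
samples = rng.multivariate_normal(mean, cov, size=10_000)

# Each row is one observation of the pair (X1, X2).
sample_mean = samples.mean(axis=0)
sample_cov = np.cov(samples, rowvar=False)

print(samples.shape)   # (10000, 2)
print(sample_mean)     # close to `mean`
print(sample_cov)      # close to `cov`
```

Moving from two variables to n variables only changes the sizes involved: the mean becomes a length-n vector and the covariance becomes an n-by-n symmetric matrix.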

The terms multinomial and multivariate are easy to confuse, but there is one important difference. In a modeling context, a multinomial model has a single dependent variable with more than two possible outcomes (i.e., the dependent variable has several levels), while a multivariate model has more than one dependent variable. The multinomial distribution itself counts how many trials fall into each category, so it is one example of a multivariate distribution.
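
To make the multinomial side of this comparison concrete, here is a small sketch using NumPy. Each trial has a single outcome with three possible levels, and the draw reports how many of the trials landed on each level. The category probabilities, number of trials, and seed are hypothetical values picked for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical setup: 10 trials, each with one outcome out of 3 levels.
probs = [0.2, 0.3, 0.5]
counts = rng.multinomial(10, probs)

print(counts)         # a length-3 vector: number of trials per level
print(counts.sum())   # 10, because every trial lands on exactly one level
```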

Properties of multivariate distributions

Multivariate distributions can be a bit complicated because, unlike in the univariate case, a probability density function of a single variable is not enough: the variables must be described together by a joint distribution. Each variable in a multivariate distribution has its own mean and variance, and each pair of variables has a covariance. A joint probability mass function describes a multivariate distribution of discrete random variables; a joint probability density function plays the same role in the continuous case. To fully grasp multivariate analysis, it’s important to understand the variance-covariance matrix, especially when dealing with vector observations and multivariate normal distributions. [2]
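
As a rough illustration of these quantities, the sketch below computes the per-variable means and the sample variance-covariance matrix for a small data set with NumPy. The numbers are made up for the example; rows are treated as observations and columns as variables, which is why `rowvar=False` is passed.

```python
import numpy as np

# Hypothetical data: 5 observations (rows) of 3 variables (columns).
data = np.array([
    [2.1,  8.0, 1.0],
    [2.5, 10.0, 0.8],
    [3.6, 12.0, 0.7],
    [4.0, 14.0, 0.3],
    [4.8, 15.5, 0.1],
])

means = data.mean(axis=0)                 # one mean per variable
cov_matrix = np.cov(data, rowvar=False)   # 3 x 3 sample variance-covariance matrix
variances = np.diag(cov_matrix)           # diagonal entries are the variances

print(means)
print(cov_matrix)
```

The matrix is symmetric, its diagonal holds each variable's variance, and the off-diagonal entries hold the covariances between pairs of variables.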

A variance-covariance matrix.
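
For reference, the general form of a variance-covariance matrix can be written as follows. The notation is an assumption made for this sketch: p random variables X_1, ..., X_p with means \mu_1, ..., \mu_p, and \Sigma for the matrix itself.

```latex
\Sigma =
\begin{pmatrix}
\operatorname{Var}(X_1) & \operatorname{Cov}(X_1, X_2) & \cdots & \operatorname{Cov}(X_1, X_p) \\
\operatorname{Cov}(X_2, X_1) & \operatorname{Var}(X_2) & \cdots & \operatorname{Cov}(X_2, X_p) \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{Cov}(X_p, X_1) & \operatorname{Cov}(X_p, X_2) & \cdots & \operatorname{Var}(X_p)
\end{pmatrix},
\qquad
\operatorname{Cov}(X_i, X_j) = E\big[(X_i - \mu_i)(X_j - \mu_j)\big].
```

Because \operatorname{Cov}(X_i, X_j) = \operatorname{Cov}(X_j, X_i), the matrix is symmetric, and setting i = j in the covariance formula gives the variances on the diagonal.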

History of multivariate distributions

The history of multivariate probability distributions can be traced back to the early developments in probability theory and statistics. Multivariate probability distributions extend univariate distributions to consider multiple variables simultaneously.

  1. Early developments: The first steps toward multivariate probability distributions were made by mathematicians like Pierre-Simon Laplace and Carl Friedrich Gauss in the 18th and early 19th centuries. Laplace worked on the theory of joint probability in his seminal book “Théorie Analytique des Probabilités” (Analytical Theory of Probability), published in 1812, which laid the foundation for studying multivariate probability distributions. Gauss developed the method of least squares and the normal distribution, which later paved the way for the multivariate normal distribution. In “Theoria Motus Corporum Coelestium in Sectionibus Conicis Solem Ambientium” (Theory of the Motion of the Heavenly Bodies Moving in Conic Sections Around the Sun), published in 1809, he applied both to analyze observational errors and improve the accuracy of predictions for the orbits of celestial bodies.
  2. Multivariate normal distribution: The multivariate normal distribution is one of the earliest and most important multivariate probability distributions. It took shape in late 19th-century work on correlation by Francis Galton, Francis Ysidro Edgeworth, and Karl Pearson, among others. In his 1897 paper “On the Theory of Correlation,” the British statistician George Udny Yule explored the relationships between multiple variables and discussed the generalization of the univariate normal distribution to several variables.
  3. Other multivariate distributions: Throughout the 20th century, statisticians extended several other univariate distributions to their multivariate counterparts, such as the multivariate t-distribution, multivariate gamma distribution, and multivariate beta distribution. These multivariate distributions have found applications in various fields, including economics, biology, psychology, and engineering.
  4. Modern developments: With the increasing complexity of data and research problems, more advanced multivariate probability distributions and statistical methods have been developed. These include copulas, which are used to model the dependence structure between random variables; multivariate analysis techniques, such as principal component analysis and canonical correlation analysis; and Bayesian methods, which allow for the incorporation of prior knowledge in the analysis of multivariate data.

In summary, multivariate probability distributions grew out of early developments in probability theory, took shape with the introduction of key distributions such as the multivariate normal, and continue to develop alongside modern statistical techniques. Today, they play a crucial role in the analysis of complex data across numerous disciplines.

References

[1] Engineering Statistics Handbook. The Multivariate Normal Distribution. Retrieved November 11, 2021 from: https://www.itl.nist.gov/div898/handbook/pmc/section5/pmc542.htm

[2] Do, C. (2008). The Multivariate Gaussian Distribution.
