Statistics and Probability Definitions


Probability mass function of sum of two regular dice. Bar graph used to portray discrete density function. Labels on the right correspond to the n/36 results format.

Statistics and Probability: Two Sides of the Same Coin?

Everybody knows that math can be divided into many different branches. There’s algebra, geometry, trigonometry, and calculus, to name a few. But did you know that there’s a branch of math specifically devoted to the study of random events? It’s called probability, and it’s closely related to another branch called statistics. In this blog post, we’ll explore the similarities and differences between these two important areas of mathematics.

Probability vs. Statistics: What’s the Difference?

At first glance, probability and statistics may seem like two sides of the same coin. After all, they both deal with the collection and analysis of numerical data. However, there are some important distinctions between the two fields. Probability is concerned with the likelihood of something happening, while statistics is focused on actual numerical data. Put another way, probability deals with hypothetical situations, while statistics deals with actual observations.

For example, let’s say you’re trying to determine the probability that it will rain tomorrow. To do this, you would look at past weather patterns and make a prediction based on that data. On the other hand, if you were collecting statistical data on rainfall in your area, you would simply go outside and measure how much rain fell over a given period of time. As you can see, probability is concerned with predicting future events, while statistics is concerned with describing past events.
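The same contrast can be sketched with the sum of two dice. This is a minimal illustration, not part of the original discussion: the exact distribution is worked out from the 36 equally likely outcomes (probability), while the simulated rolls are summarized after the fact (statistics). The seed and the number of rolls are arbitrary assumptions.

```python
import random
from collections import Counter
from fractions import Fraction

# Probability: the exact distribution of the sum of two fair dice,
# derived from the 36 equally likely (a, b) outcomes.
pmf = Counter()
for a in range(1, 7):
    for b in range(1, 7):
        pmf[a + b] += Fraction(1, 36)

# Statistics: describe what actually happened in a batch of observed rolls.
random.seed(0)  # arbitrary seed so the run is repeatable
rolls = [random.randint(1, 6) + random.randint(1, 6) for _ in range(10_000)]
observed = Counter(rolls)

# Print each total with its exact probability and its observed frequency.
for total in range(2, 13):
    print(total, pmf[total], observed[total] / len(rolls))
```

The observed frequencies should land close to the exact probabilities, but they will rarely match them exactly.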

A Few Statistics and Probability Definitions


Factorial

The ! symbol after a number indicates a factorial:

  • 6! is “six factorial.”
  • 3! is “three factorial.”

To evaluate n!, multiply n by every positive whole number below it. For example, 3! means that n is 3, so

3! = 3 x 2 x 1 = 6.

Factorials are a compact way of writing the large numbers that such products generate. For example, instead of writing 479001600, you could write 12! (which is 12 x 11 x 10 x 9 x 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1). A few more examples:

  • 1! = 1
  • 2! = 2
  • 3! = 6
  • 4! = 24
  • 5! = 120

The formal definition is that n! is the product of every whole number (the counting numbers 1, 2, 3…) from 1 to n.
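The definition translates directly into a loop. The sketch below is one possible way to write it; the function name `factorial` is chosen here for illustration, and the result is compared against Python's built-in `math.factorial`. Starting the running product at 1 also gives 0! = 1, the usual convention for an empty product.

```python
import math

def factorial(n: int) -> int:
    """Multiply every whole number from 1 to n (an empty product gives 0! = 1)."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result

# Compare the loop against the standard library for a few values.
for n in (1, 2, 3, 4, 5, 12):
    print(n, factorial(n), factorial(n) == math.factorial(n))
```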

Hermite Polynomial

Hermite polynomials are widely used polynomials defined over the interval (-∞, ∞) that are orthogonal with respect to a weight function proportional to w(x) = e^(-x^2).

Multiple definitions exist for “Hermite polynomials,” which can be confusing. There are two distinct sets of polynomials, referred to as the “physicists'” and “probabilists'” polynomials, which differ in the weight function they start from. Many authors assume the reader is working in either physics or probability and simply mention Hermite polynomials without saying which set they mean.

In calculus, the “physicists'” Hermite polynomials, which use the weight e^(-x^2), are commonly used. The first few are [1]:

  • H_0(x) = 1
  • H_1(x) = 2x
  • H_2(x) = 4x^2 – 2
  • H_3(x) = 8x^3 – 12x
  • H_4(x) = 16x^4 – 48x^2 + 12
  • H_5(x) = 32x^5 – 160x^3 + 120x
  • H_6(x) = 64x^6 – 480x^4 + 720x^2 – 120

Another definition, with w(x) = e^(-x^2/2), is occasionally used, particularly in statistics. These “probabilists'” polynomials, often written He_n(x), are sometimes referred to as Chebyshev-Hermite polynomials. The first few are:

  • He_0(x) = 1
  • He_1(x) = x
  • He_2(x) = x^2 – 1
  • He_3(x) = x^3 – 3x
  • He_4(x) = x^4 – 6x^2 + 3
  • He_5(x) = x^5 – 10x^3 + 15x
  • He_6(x) = x^6 – 15x^4 + 45x^2 – 15
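Both lists can be generated from a three-term recurrence. The following is a minimal sketch, assuming polynomials are stored as coefficient lists with the lowest power first. The function name `hermite` and its `scale` parameter are illustrative and do not come from any library. With `scale=2` the recurrence is P_(n+1) = 2x P_n – 2n P_(n-1), which yields the list beginning 1, 2x, 4x^2 – 2; with `scale=1` it is P_(n+1) = x P_n – n P_(n-1), which yields the list beginning 1, x, x^2 – 1.

```python
def hermite(n, scale=2):
    """Coefficients of the degree-n polynomial, lowest power first.

    scale=2 follows P_(n+1) = 2x P_n - 2n P_(n-1);
    scale=1 follows P_(n+1) =  x P_n -  n P_(n-1).
    """
    prev, curr = [1], [0, scale]          # degree 0 and degree 1
    if n == 0:
        return prev
    for k in range(1, n):
        # Multiplying by (scale * x) shifts every coefficient up one power.
        shifted = [0] + [scale * c for c in curr]
        # Pad the older polynomial so both lists have the same length.
        padded = prev + [0] * (len(shifted) - len(prev))
        nxt = [s - scale * k * p for s, p in zip(shifted, padded)]
        prev, curr = curr, nxt
    return curr

for n in range(7):
    print(n, hermite(n, scale=2), hermite(n, scale=1))
```

For example, `hermite(4, scale=1)` returns `[3, 0, -6, 0, 1]`, which reads as x^4 – 6x^2 + 3.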

Hermite polynomials are also useful as interpolation functions, since their values, as well as the values of their derivatives up to order n, are either zero or unity at the endpoints of the closed interval [0, 1] [2]. They provide an alternative method for representing cubic curves, allowing the curve to be defined by its endpoints and the derivatives at those endpoints [3].
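The endpoint-and-derivative description of a cubic curve can be made concrete with the four standard cubic Hermite basis functions on [0, 1]. This is a sketch under stated assumptions: the names `hermite_basis` and `hermite_curve` and the sample endpoint values are hypothetical, chosen only for illustration.

```python
def hermite_basis(t):
    """The four cubic Hermite basis functions evaluated at t in [0, 1]."""
    h00 = 2 * t**3 - 3 * t**2 + 1   # value 1 at t=0, value 0 at t=1
    h10 = t**3 - 2 * t**2 + t       # slope 1 at t=0, slope 0 at t=1
    h01 = -2 * t**3 + 3 * t**2      # value 0 at t=0, value 1 at t=1
    h11 = t**3 - t**2               # slope 0 at t=0, slope 1 at t=1
    return h00, h10, h01, h11

def hermite_curve(p0, m0, p1, m1, t):
    """Cubic through p0 (t=0) and p1 (t=1) with end slopes m0 and m1."""
    h00, h10, h01, h11 = hermite_basis(t)
    return h00 * p0 + h10 * m0 + h01 * p1 + h11 * m1

# Hypothetical endpoint data: start at 2 with slope 1, end at 5 with slope -3.
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, hermite_curve(2.0, 1.0, 5.0, -3.0, t))
```

Because each basis function contributes to exactly one endpoint value or endpoint slope, changing one of the four inputs leaves the other three endpoint conditions untouched.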

Hermite polynomials arise in various areas of physics, including the solution of the quantum harmonic oscillator. They also come up in numerical analysis, where their roots serve as the nodes of Gauss-Hermite quadrature, a form of Gaussian quadrature.


[1] Sawitzki, G. (2009). Computational Statistics: An Introduction to R, CRC Press.

[2] Huebner, K. et al. (2001). The Finite Element Method for Engineers. Wiley.

[3] Buss, S. (2003). 3D Computer Graphics. A Mathematical Introduction with OpenGL. Cambridge University Press.

Linear Transformation

Linear transformation from [0,1] to [-π/2,+π/2] [1]


What is a linear transformation?

A linear transformation is a special case of a vector transformation (a function T that maps vectors to vectors) with two additional properties:

  1. Addition must be preserved: The transformation must be additive, which means that T(u + v) = T(u) + T(v) for any vectors u and v in the domain of the transformation. To check, add u and v together and transform the sum. Then transform each vector individually and add the results. If the transform of the sum equals the sum of the transforms for every choice of u and v, then addition is preserved.
  2. Scalar multiplication must be preserved: The transformation must be homogeneous, which means that T(cu) = c * T(u) for any vector u in the transformation's domain and any scalar c. To check, multiply u by c and transform the product. Then transform u and multiply the result by c. If the two results are equal for every u and c, then scalar multiplication is preserved.

A transformation is linear only if both properties hold.
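These conditions can be spot-checked numerically. The sketch below is illustrative only: the helper `looks_linear` and the two sample maps `double` and `shift` are hypothetical names introduced here. Passing the check on a handful of vectors is evidence rather than proof, but a single failure is enough to show that a map is not linear.

```python
def add(u, v):
    """Component-wise vector addition."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, u):
    """Multiply every component of u by the scalar c."""
    return tuple(c * a for a in u)

def looks_linear(T, samples, scalars):
    """Spot-check T(u + v) = T(u) + T(v) and T(cu) = c * T(u) on sample data."""
    for u in samples:
        for v in samples:
            if T(add(u, v)) != add(T(u), T(v)):
                return False
        for c in scalars:
            if T(scale(c, u)) != scale(c, T(u)):
                return False
    return True

def double(u):
    """Hypothetical example: doubling every component (linear)."""
    return scale(2, u)

def shift(u):
    """Hypothetical example: a translation by 1 in the first component (not linear)."""
    return (u[0] + 1, u[1])

samples = [(1, 2), (-3, 4), (0, 5)]
scalars = [0, 2, -1]
print(looks_linear(double, samples, scalars))  # True
print(looks_linear(shift, samples, scalars))   # False
```

Integer inputs are used so that the equality comparisons are exact; with floating-point data a tolerance would be needed.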

What is the role of linear transformations?

Linear transformations are used in many different areas of mathematics and computer science. For example, they are used in linear algebra, differential geometry, and machine learning:

  • In linear algebra, linear transformations are used to represent geometric transformations, such as rotations, reflections, and scalings. They are also used to solve systems of linear equations.
  • In differential geometry, linear transformations are used to represent smooth mappings between manifolds. They are also used to study the properties of differential operators.
  • In machine learning, linear transformations are used to represent features in data. They are also used to train machine learning models.

Here are some of the roles of linear transformations:

  • Representing a geometric transformation. For example, a rotation can be represented by a linear transformation that rotates vectors by a certain angle.
  • Solving systems of linear equations. A system of linear equations can be written as a matrix equation Ax = b, and when the matrix A is square and invertible, the solution is x = A^(-1)b.
  • Representing features in data. For example, a linear transformation can be used to represent the color of an image as a vector.
  • Training machine learning models. Some models, such as linear regression, apply a linear transformation to their inputs (usually plus a constant offset), and the parameters of the model are learned by minimizing a loss function.
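The first two roles can be sketched in a few lines of plain Python. This is a minimal illustration rather than a recommended numerical method: the helper names `matvec`, `rotation`, and `inverse_2x2`, and the sample system of equations, are assumptions made for the example.

```python
import math

def matvec(A, x):
    """Apply the linear transformation with matrix A to the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def rotation(theta):
    """Matrix of a counterclockwise rotation of the plane by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def inverse_2x2(A):
    """Inverse of a 2x2 matrix; raises if the determinant is zero."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, so there is no unique solution")
    return [[d / det, -b / det], [-c / det, a / det]]

# Geometric role: rotate the vector (1, 0) by 90 degrees.
rotated = matvec(rotation(math.pi / 2), [1, 0])
print(rotated)

# Equation-solving role: 2x + y = 5 and x + 3y = 10, solved as x = A^(-1) b.
A = [[2, 1], [1, 3]]
b = [5, 10]
solution = matvec(inverse_2x2(A), b)
print(solution)
```

The rotated vector comes out as approximately (0, 1) and the solution as approximately (1, 3), up to floating-point rounding. For larger systems, elimination-based solvers are generally preferred over forming the inverse explicitly.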

Overall, linear transformations are a powerful tool that can be used in many different areas of mathematics and computer science.

Linear Transformation Example

Example Question: Is the following transformation a linear transformation?
T(x, y)→ (x – y, x + y, 9x)

Part One: Is Addition Preserved?

Step 1: Give the vectors u and v (from rule 1) some components. The choice of a and b here is arbitrary:

  • u = (a1, a2)
  • v = (b1, b2)

Step 2: Find an expression for the addition part of the left side of the addition preservation equation T(u + v) = T(u) + T(v):
(u + v) = (a1, a2) + (b1, b2)
Add these two vectors together to get:
((a1 + b1), (a2 + b2))

Step 3: Apply the transformation. We’re given the rule T(x,y)→ (x – y, x + y, 9x), so transforming our additive vector from Step 2, we get:

T((a1 + b1), (a2 + b2)) =
((a1 + b1) – (a2 + b2),
(a1 + b1) + (a2 + b2),
9(a1 + b1)).

Simplifying/Distributing using algebra:
(a1 + b1 – a2 – b2,
a1 + b1 + a2 + b2,
9a1 + 9b1).
Set this aside for a moment: we’re going to compare this result to the result from the right hand side of the equation in a later step.

Step 4: Find an expression for the right side of the Rule 1 equation, T(u) + T(v). Using the same a/b variables we used in Steps 1 to 3, we get:
T(a1, a2) + T(b1, b2)

Step 5: Transform the vector u, (a1, a2). We’re given the rule T(x,y)→ (x – y, x + y, 9x), so transforming vector u, we get:

(a1 – a2,
a1 + a2,
9a1)

Step 6: Transform the vector v, (b1, b2). We’re given the rule T(x,y)→ (x – y, x + y, 9x), so transforming vector v, we get:

(b1 – b2,
b1 + b2,
9b1)

Step 7: Add the two vectors from Steps 5 and 6:
(a1 – a2, a1 + a2, 9a1) + (b1 – b2, b1 + b2, 9b1) =
(a1 – a2 + b1 – b2,
a1 + a2 + b1 + b2,
9a1 + 9b1)

Step 8: Compare Step 3 to Step 7. Apart from the order of the terms, they are the same, so condition 1 (the additive condition) is satisfied.

Part Two: Is Scalar Multiplication Preserved?

In other words, in this part we want to know if T(cu) = cT(u) is true for T(x,y)→ (x – y, x + y, 9x). We’re going to use the same vector from Part 1, which is u = (a1, a2).

Step 1: Work the left side of the equation, T(cu). First, multiply the vector by a scalar, c.
c * (a1, a2) = (c(a1), c(a2))

Step 2: Transform Step 1, using the rule T(x,y)→ (x-y,x+y,9x):
(ca1 – ca2,
ca1 + ca2,
9ca1)
Put this aside for a moment. We’ll be comparing it to the right side in a later step.

Step 3: Transform the vector u using the rule T(x,y)→ (x-y,x+y,9x). We’re working the right side of the rule 2 equation here:
T(a1, a2) =
(a1 – a2,
a1 + a2,
9a1)

Step 4: Multiply Step 3 by the scalar, c.
(c(a1 – a2),
c(a1 + a2),
c(9a1))
Distributing c using algebra, we get:
(ca1 – ca2,
ca1 + ca2,
9ca1)

Step 5: Compare Steps 2 and 4. They are the same, so the second rule is true. Both rules hold, so this function is a linear transformation.
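The algebra can be cross-checked with concrete numbers. The sketch below is an illustration added here, not part of the worked solution: it writes the rule T(x, y) → (x – y, x + y, 9x) as a 3 x 2 matrix M whose columns are T(1, 0) and T(0, 1), then confirms on arbitrarily chosen vectors u, v and scalar c that both rules hold. The names `M` and `apply_matrix` and the sample values are assumptions.

```python
def T(x, y):
    """The transformation from the example."""
    return (x - y, x + y, 9 * x)

# Columns of M are T(1, 0) = (1, 1, 9) and T(0, 1) = (-1, 1, 0).
M = [[1, -1],
     [1,  1],
     [9,  0]]

def apply_matrix(M, v):
    """Multiply the matrix M by the vector v."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)

# Arbitrary sample values; integers keep the comparisons exact.
u, v, c = (2, 5), (-1, 3), 4

u_plus_v = (u[0] + v[0], u[1] + v[1])
sum_of_images = tuple(p + q for p, q in zip(T(*u), T(*v)))
c_times_u = (c * u[0], c * u[1])
scaled_image = tuple(c * p for p in T(*u))

print(T(*u) == apply_matrix(M, u))    # the rule and the matrix agree
print(T(*u_plus_v) == sum_of_images)  # rule 1: addition is preserved
print(T(*c_times_u) == scaled_image)  # rule 2: scalar multiplication is preserved
```

A numeric check like this only tests the chosen values; the step-by-step algebra is what establishes the result for every u, v, and c.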


[1] Jochen Burghardt, CC BY-SA 3.0, via Wikimedia Commons
