The Rademacher distribution, denoted Rad½, is a modified version of the Bernoulli distribution. This discrete distribution uses {-1, 1} to denote the two possible values. Similar to the Bernoulli distribution, a Rademacher random variable has a 50-50 chance of success and a failure. However, while the Bernoulli uses 0 to indicate failure and 1 to indicate success, the Rademacher uses -1 and 1 for the same purpose.

Hans Rademacher, a German mathematician (1892-1969) renowned for his contributions to real analysis, Fourier series, summability theory, and probability theory initially introduced the Rademacher distribution in 1922. Since its introduction, the Rademacher distribution has been widely used in statistics and machine learning, as well as in probability theory and number theory research. For example, the analysis of the sum of i.i.d. Rademacher variables has resulted in various findings including concentration inequalities, like the Bernstein inequalities [1], and anti-concentration inequalities, such as Tomaszewski’s conjecture [2].

The variance of the Rademacher distribution is equal to 1, while all other moments are equal to 0. It is a symmetric distribution, which means that the probability of X = +1 is equal to the probability of X = -1.

The probability mass function (PMF) of the Rademacher distribution is given by:

``````f(x) =
{  ½ if x = +1;
½ if x = -1;
0 otherwise}``````

In terms of the Dirac delta function, the PMF can be written as f(k) = ½ ( δ (k – 1) + δ (K + 1).

The cumulative distribution function (CDF) is given by:

``````F(x) =
{ 1 if x ≥ 0;
0 if x < -1;
½ if -1 ≤ k < 1}``````

Rademacher random variables could be defined with Bernoulli random variables. If Y is a Bernoulli random variable, then X equals 2Y − 1, thus becoming a Rademacher random variable [3]. Conversely, if the variable X is a Rademacher random variable, then (X + 1) / 2 is a Bernoulli random variable.

The Laplace distribution could be used to define these variables. Given a Rademacher random variable X and when Y ~ Exp(λ) is independent from X, then XY Laplace (0, 1/λ).

## Uses for the Rademacher distribution

Rad½ is used for statistical proofs, random sampling [5], and bootstrapping, where weights dg = {-1, 1} are called Rademacher weights [6]. In machine learning, it can be used to generate binary data, such as whether or not a customer clicks on an ad. It can also be used to create synthetic data sets for training and testing machine learning models. A sequence of successive sums of Rademacher random variables is a simple symmetrical random walk when the step size equals 1.

In probability theory, the Rademacher distribution can be used to show that  normally distributed and uncorrelated does not imply independent [6].

## References

[1] 4 Concentration Inequalities, Scalar and Matrix Versions. Retrieved May 21, 2023 from: https://ocw.mit.edu/courses/18-s096-topics-in-mathematics-of-data-science-fall-2015/87db5678b0405c1087b0e85181b64c1a_MIT18_S096F15_Ses12_14.pdf

[3] Contreras, D. (2021). Estimation of Flexibility Potentials in Active Distribution Networks. Books on Demand.