
The **Skellam distribution**, also known as the *Poisson difference distribution*, is the distribution of the difference between two Poisson random variables. This two-parameter discrete probability distribution applies when the two variables are independent, and also when they are dependent through a common additive random contribution that cancels when the difference is taken.

J. G. Skellam introduced the distribution in 1946 [1] as a means of modeling the difference between two Poisson variates that belong to different populations. He arrived at it while studying the difference between the number of births and deaths in a population.

## Skellam distribution properties

The Skellam distribution is closed under summation (the sum of independent Skellam variables is again Skellam), is relatively easy to sample from, and approaches a normal distribution as its variance grows.
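As a rough illustration of why sampling is easy, a Skellam variate can be drawn by subtracting two independent Poisson draws. The sketch below uses only the Python standard library; the helper names `sample_poisson` and `sample_skellam`, the seed, and the rates 4.0 and 1.5 are illustrative assumptions, and Knuth's multiplication method is only one of several ways to draw a Poisson variate (it is reasonable for small means only).

```python
import math
import random


def sample_poisson(mu: float, rng: random.Random) -> int:
    """Draw one Poisson(mu) variate with Knuth's multiplication method.

    Suitable for small mu; it becomes slow and imprecise for large mu.
    """
    threshold = math.exp(-mu)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_skellam(mu1: float, mu2: float, rng: random.Random) -> int:
    """Draw one Skellam(mu1, mu2) variate as a difference of Poisson draws."""
    return sample_poisson(mu1, rng) - sample_poisson(mu2, rng)


rng = random.Random(0)  # fixed seed so the run is repeatable
mu1, mu2 = 4.0, 1.5     # hypothetical Poisson means
draws = [sample_skellam(mu1, mu2, rng) for _ in range(20_000)]

sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)

print(f"sample mean     = {sample_mean:.3f} (theory: {mu1 - mu2})")
print(f"sample variance = {sample_var:.3f} (theory: {mu1 + mu2})")
```

The sample mean and variance should land close to μ_{1} – μ_{2} and μ_{1} + μ_{2}, up to sampling noise.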

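The probability mass function (PMF) of the Skellam distribution is [2]:

$$P(K = k) = e^{-(\mu_1 + \mu_2)} \left(\frac{\mu_1}{\mu_2}\right)^{k/2} I_{|k|}\!\left(2\sqrt{\mu_1 \mu_2}\right)$$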

Where:

- *k* is the difference between the two Poisson random variables; *k* ∈ {…, -2, -1, 0, 1, 2, …}.
- μ_{1} and μ_{2} are the expected values (means) of the two Poisson distributions.
- *I*_{k}(*z*) is the modified Bessel function of the first kind. Since *k* is an integer, *I*_{k}(*z*) = *I*_{|k|}(*z*).
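The Bessel-function form of the PMF can be checked numerically against the definition of the distribution as a difference of two independent Poisson variables. The following is a minimal sketch using only the standard library: `bessel_i` is a hand-rolled power series for the modified Bessel function (an assumption made to avoid external dependencies, adequate for moderate positive arguments), and the function names and the rates 4.0 and 1.5 are illustrative.

```python
import math


def bessel_i(order: int, z: float, terms: int = 200) -> float:
    """Modified Bessel function of the first kind, I_|order|(z), for z >= 0.

    Uses the power series sum_m (z/2)^(2m+n) / (m! (m+n)!), built term by term.
    """
    n = abs(order)
    if z == 0.0:
        return 1.0 if n == 0 else 0.0
    half = z / 2.0
    term = math.exp(n * math.log(half) - math.lgamma(n + 1))  # m = 0 term
    total = 0.0
    for m in range(terms):
        total += term
        term *= half * half / ((m + 1) * (m + 1 + n))
    return total


def skellam_pmf(k: int, mu1: float, mu2: float) -> float:
    """Skellam PMF in its closed (Bessel-function) form."""
    z = 2.0 * math.sqrt(mu1 * mu2)
    return math.exp(-(mu1 + mu2)) * (mu1 / mu2) ** (k / 2) * bessel_i(k, z)


def poisson_pmf(n: int, mu: float) -> float:
    """Poisson PMF, computed in log space for numerical stability."""
    return math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))


def skellam_pmf_by_convolution(k: int, mu1: float, mu2: float, terms: int = 200) -> float:
    """P(N1 - N2 = k) summed directly over the two Poisson PMFs (truncated)."""
    start = max(0, -k)
    return sum(
        poisson_pmf(n + k, mu1) * poisson_pmf(n, mu2)
        for n in range(start, start + terms)
    )


mu1, mu2 = 4.0, 1.5  # hypothetical Poisson means
for k in range(-3, 4):
    closed = skellam_pmf(k, mu1, mu2)
    direct = skellam_pmf_by_convolution(k, mu1, mu2)
    print(f"k={k:+d}  closed form={closed:.6f}  convolution={direct:.6f}")
```

The two columns should agree to within floating-point error, which is a quick sanity check on the exponent *k*/2 and the use of |*k*| in the Bessel order.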

Mean: μ_{1} – μ_{2}

Variance: μ_{1} + μ_{2}

Skewness: (μ_{1} – μ_{2}) / (μ_{1} + μ_{2})^{3/2}
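The mean, variance, and skewness formulas can be verified by summing over a truncated PMF. This is a small sketch under stated assumptions: the PMF is obtained by convolving two Poisson PMFs rather than through the Bessel form, the support is cut off at ±60 (ample for the hypothetical rates 4.0 and 1.5 used here), and all names are illustrative.

```python
import math


def poisson_pmf(n: int, mu: float) -> float:
    """Poisson PMF, computed in log space for numerical stability."""
    return math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))


def skellam_pmf(k: int, mu1: float, mu2: float, terms: int = 200) -> float:
    """P(N1 - N2 = k) for independent Poisson N1, N2 (truncated sum)."""
    start = max(0, -k)
    return sum(
        poisson_pmf(n + k, mu1) * poisson_pmf(n, mu2)
        for n in range(start, start + terms)
    )


mu1, mu2 = 4.0, 1.5  # hypothetical Poisson means
probs = {k: skellam_pmf(k, mu1, mu2) for k in range(-60, 61)}

# Moments computed numerically from the PMF
mean = sum(k * p for k, p in probs.items())
variance = sum((k - mean) ** 2 * p for k, p in probs.items())
skewness = sum((k - mean) ** 3 * p for k, p in probs.items()) / variance ** 1.5

# Closed-form expressions from the text
mean_formula = mu1 - mu2
variance_formula = mu1 + mu2
skewness_formula = (mu1 - mu2) / (mu1 + mu2) ** 1.5

print(f"mean:     {mean:.6f} vs {mean_formula:.6f}")
print(f"variance: {variance:.6f} vs {variance_formula:.6f}")
print(f"skewness: {skewness:.6f} vs {skewness_formula:.6f}")
```

Note that the skewness numerator is the full difference μ_{1} – μ_{2}, so the distribution is symmetric exactly when the two means are equal.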

Probability generating function (PGF): G(*t*) = e^{-(μ_{1} + μ_{2}) + μ_{1}t + μ_{2}/t}

## Uses of the Skellam distribution

The Skellam distribution is a good choice for modeling the difference between two counts that are each expected to be approximately Poisson distributed, such as the difference between the number of births and deaths in a population. This was the original use case for the Skellam distribution, and it remains common today. Other use cases include:

- Image detection and denoising [3].
- Modeling the difference between the number of arrivals and departures at a transportation hub [4].
- Modeling noise in PET imaging [5].
- Modeling the point spread in sports where every score is worth the same amount (for example, hockey or soccer, where each goal counts as one point) [6].
- Studying treatment effects [7].
- Creating the Skellam mechanism for Differentially Private Federated Learning.
  *Differential privacy* is a technique that adds noise to datasets to protect individual privacy, while *federated learning* enables model training without sharing data. Differentially private federated learning combines these techniques to train machine learning models on sensitive data from multiple devices. This technique is well-suited to training models on data like medical or financial data while keeping individual data private.
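To make the sports use case from the list above concrete, the sketch below computes win, draw, and loss probabilities from a Skellam-distributed goal difference. It assumes the two teams' goal counts are independent Poisson variables, which is a simplification, and the scoring rates 1.6 and 1.1 are made-up values rather than fitted estimates. The function and variable names are illustrative.

```python
import math


def poisson_pmf(n: int, mu: float) -> float:
    """Poisson PMF, computed in log space for numerical stability."""
    return math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))


def skellam_pmf(k: int, mu1: float, mu2: float, terms: int = 200) -> float:
    """P(N1 - N2 = k) for independent Poisson N1, N2 (truncated sum)."""
    start = max(0, -k)
    return sum(
        poisson_pmf(n + k, mu1) * poisson_pmf(n, mu2)
        for n in range(start, start + terms)
    )


mu_home, mu_away = 1.6, 1.1  # hypothetical expected goals per match
max_margin = 40              # truncation point; larger margins are negligible here

# Goal difference k = home goals - away goals
p_draw = skellam_pmf(0, mu_home, mu_away)
p_home_win = sum(skellam_pmf(k, mu_home, mu_away) for k in range(1, max_margin + 1))
p_away_win = sum(skellam_pmf(k, mu_home, mu_away) for k in range(-max_margin, 0))

print(f"P(home win) = {p_home_win:.4f}")
print(f"P(draw)     = {p_draw:.4f}")
print(f"P(away win) = {p_away_win:.4f}")
```

The three probabilities should sum to one up to truncation error, and the team with the higher scoring rate should come out as the more likely winner.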

## References

[1] Skellam, J. G. (1946) “The frequency distribution of the difference between two Poisson variates belonging to different populations”. Journal of the Royal Statistical Society, Series A, 109 (3), 296.

[2] Hwang, Y. et al. Statistical background subtraction based on the exact per-pixel distributions. MVA2007 IAPR Conference on Machine Vision Applications, May 16-18, 2007, Tokyo, Japan. Retrieved from: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.145.2875&rep=rep1&type=pdf

[3] Hirakawa, K. et al. Wavelet-based Poisson Rate Estimation using the Skellam distribution, in C.A. Bouman et al. (Eds.). Proceedings of SPIE 7246 (2009).

[4] Liu, X. Models for excess demand in urban environments. Doctoral thesis.

[5] Yavuz, M. & Fessler, J. A. Maximum likelihood emission image reconstruction for randoms-precorrected PET scans. In IEEE Nuclear Science Symposium Conference Record, pages 15/229–15/233, 2000.

[6] Karlis, D. & Ntzoufras, I. Bayesian modelling of football outcomes: using the Skellam’s distribution for the goal difference. IMA J. Manag. Math 20(2) (2008), 133-145.

[7] Karlis, D. & Ntzoufras, I. Bayesian analysis of the differences of count data, Stat. Med. 25 (2006), 1885-1905.