A phase-type distribution, also called a PH distribution (not to be confused with pH, the acidity scale used in chemistry and biology), is the distribution of the absorption time, or hitting time, in certain types of Markov jump processes [1, 2].
A phase-type distribution can be described as the distribution of the time until absorption into a designated state (state 0) of a finite-state Markov chain [3]. A Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Informally, this may be thought of as, “What happens next depends only on the state of affairs now.” One example is predictive text in a search engine, where the engine tries to predict what you are searching for before you have finished typing your query.
In their simplest form, PH distributions arise as convolutions and mixtures of exponential distributions.
- A convolution combines two probability distributions into a new one: the distribution of the sum of two independent random values, one drawn from each of the original distributions.
- A mixture distribution is made up of two or more other distributions. It describes the probability of getting a particular value when you randomly select one of the original distributions and then draw a value from that distribution.
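The difference between the two constructions can be seen in a small simulation. The sketch below uses only Python's standard library; the rates `rate_a` and `rate_b` and the mixing probability `p_a` are illustrative values chosen for this example, not figures from the sources. It compares sample means with the textbook means of a sum and of a mixture of two exponentials.

```python
import random

rng = random.Random(0)        # fixed seed so the run is repeatable
rate_a, rate_b = 2.0, 0.5     # illustrative rates (assumed for this example)
p_a = 0.3                     # illustrative probability of picking the first exponential
n = 50_000

# Convolution: add one independent draw from each exponential distribution.
conv_samples = [rng.expovariate(rate_a) + rng.expovariate(rate_b) for _ in range(n)]

# Mixture: pick one of the two distributions at random, then draw from it.
mix_samples = [
    rng.expovariate(rate_a if rng.random() < p_a else rate_b) for _ in range(n)
]

conv_mean = sum(conv_samples) / n
mix_mean = sum(mix_samples) / n

# Theoretical means: 1/rate_a + 1/rate_b for the sum,
# and p_a/rate_a + (1 - p_a)/rate_b for the mixture.
conv_theory = 1 / rate_a + 1 / rate_b
mix_theory = p_a / rate_a + (1 - p_a) / rate_b

print(f"convolution: sample mean {conv_mean:.3f}, theory {conv_theory:.3f}")
print(f"mixture:     sample mean {mix_mean:.3f}, theory {mix_theory:.3f}")
```

A convolution of exponentials with different rates is usually called a hypoexponential distribution, and a mixture of exponentials a hyperexponential distribution; both are phase-type distributions.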
Any such distribution can be represented by a Markov chain consisting of a set of states and a transition matrix, and this representation allows matrix-based computer algorithms to evaluate it very efficiently.
Properties of a Phase Type distribution
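One of the fundamental assumptions in the analysis of continuous-time Markov chains is that the waiting time in each state follows an exponential distribution. The PH distribution is therefore built from exponential pieces: the time to absorption is the sum of the exponentially distributed times spent in the phases visited along the way.
The time X until the process reaches the absorbing state is phase-type distributed, denoted as PH(α,S). Here α is the row vector of initial probabilities over the transient phases, and S is the square matrix of transition rates among those phases.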
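One way to make the definition concrete is to simulate the jump process until it is absorbed. The following is a minimal sketch rather than a reference implementation. It assumes that `alpha` sums to 1 and that `S` is a valid rate matrix (negative diagonal, non-negative off-diagonal entries, row sums at most zero). The function name `sample_ph` and the two-phase example are hypothetical.

```python
import random

def sample_ph(alpha, S, rng):
    """Draw one absorption time from PH(alpha, S) by simulating the jump process."""
    n = len(alpha)
    # Choose the starting transient phase according to alpha.
    state = rng.choices(range(n), weights=alpha)[0]
    t = 0.0
    while True:
        rate = -S[state][state]                # total rate of leaving this phase
        t += rng.expovariate(rate)             # exponential holding time
        # Rates to the other transient phases; what is left over is the
        # rate of jumping to the absorbing state.
        to_phase = [S[state][j] if j != state else 0.0 for j in range(n)]
        exit_rate = max(0.0, rate - sum(to_phase))
        nxt = rng.choices(range(n + 1), weights=to_phase + [exit_rate])[0]
        if nxt == n:                           # index n stands for the absorbing state
            return t
        state = nxt

# Hypothetical two-phase example: phase 1 (rate 2.0) always leads to
# phase 2 (rate 0.5), which always leads to absorption.
alpha = [1.0, 0.0]
S = [[-2.0, 2.0],
     [0.0, -0.5]]

rng = random.Random(1)
samples = [sample_ph(alpha, S, rng) for _ in range(20_000)]
sample_mean = sum(samples) / len(samples)
print(f"sample mean {sample_mean:.3f} (theory: 1/2.0 + 1/0.5 = 2.5)")
```

In this example every path visits both phases, so X is simply the sum of two exponential times and its mean is the sum of the two phase means.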
The cumulative distribution function (CDF) of X is given by
F(x) = 1 – α exp(Sx) 1,
and the probability density function (PDF) is
f(x) = α exp(Sx) S0,
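for all x > 0, where 1 is a column vector of ones, S0 = -S1 is the column vector of exit rates into the absorbing state, and exp(·) is the matrix exponential. The matrix exponential is a function that takes a square matrix A as input and returns a matrix of the same size, defined by the power series exp(A) = I + A + A²/2! + A³/3! + ... It is not, in general, the result of taking the exponential of each element of the input matrix.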
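The two formulas can be evaluated numerically. The sketch below is written in plain Python so that it is self-contained, and it approximates the matrix exponential with a truncated power series plus scaling and squaring. In practice a library routine such as `scipy.linalg.expm` would normally be used instead. The helper names (`mat_mul`, `mat_exp`, `ph_cdf`, `ph_pdf`) and the two-phase example are hypothetical.

```python
import math

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_exp(A, squarings=10, terms=20):
    """Approximate exp(A): truncated power series on A / 2**squarings, then squaring."""
    n = len(A)
    scaled = [[a / 2 ** squarings for a in row] for row in A]
    result = [[float(i == j) for j in range(n)] for i in range(n)]   # identity matrix
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term becomes scaled**k / k!
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def ph_cdf(x, alpha, S):
    """F(x) = 1 - alpha exp(Sx) 1."""
    E = mat_exp([[s * x for s in row] for row in S])
    return 1.0 - sum(a * sum(row) for a, row in zip(alpha, E))

def ph_pdf(x, alpha, S):
    """f(x) = alpha exp(Sx) S0, with exit-rate vector S0 = -S 1."""
    E = mat_exp([[s * x for s in row] for row in S])
    exit_rates = [-sum(row) for row in S]
    return sum(a * sum(e * r for e, r in zip(row, exit_rates)) for a, row in zip(alpha, E))

# Hypothetical example: two phases in series with rates 2.0 and 0.5.
alpha = [1.0, 0.0]
S = [[-2.0, 2.0],
     [0.0, -0.5]]

x = 1.0
cdf_value = ph_cdf(x, alpha, S)
pdf_value = ph_pdf(x, alpha, S)
# Known closed forms for the sum of two exponentials with rates 2.0 and 0.5.
cdf_closed = 1.0 - (2.0 * math.exp(-0.5 * x) - 0.5 * math.exp(-2.0 * x)) / 1.5
pdf_closed = (math.exp(-0.5 * x) - math.exp(-2.0 * x)) / 1.5
print(cdf_value, cdf_closed)
print(pdf_value, pdf_closed)

# The matrix exponential is not element-wise: exp(S) keeps a zero in the
# lower-left corner, whereas exponentiating that entry alone gives exp(0) = 1.
print(mat_exp(S)[1][0], math.exp(S[1][0]))
```

The matrix formulas agree with the closed-form expressions for this simple case, and the last line illustrates why the matrix exponential must not be confused with an element-wise exponential.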
Applications of the Phase Type distribution
Phase-type distributions can model a wide variety of real-world phenomena. They are relatively easy to understand and implement, and they can approximate many other distributions of positive random variables.
Phase-type distributions are used to model positive random variables, primarily random times such as processing times, repair times, or time to failure in manufacturing systems. The phases in the distribution can represent different states of the machine, such as “working,” “broken,” and “being repaired.”
However, their usefulness extends beyond manufacturing. For instance, Coxian phase-type distributions, a sub-type that models the duration until an event occurs as a sequence of latent phases, have been used in healthcare settings [5]. The phases in the distribution can represent different stages of the recovery process, such as “being admitted to the hospital,” “undergoing treatment,” and “being discharged from the hospital.”
Other uses include distances between DNA mutations and service times for agents in queueing systems. In a queueing model, the phases can represent different stages of the customer service process, such as “greeting the customer,” “taking the customer’s information,” and “solving the customer’s problem.”
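To illustrate the Coxian structure mentioned above, the sketch below builds the pair (α, S) for a Coxian distribution: the process starts in the first phase, and after each phase it either moves on to the next phase or is absorbed. The function names, the three-phase layout, and all rates and probabilities are hypothetical values chosen for illustration; they are not taken from the cited hospital study.

```python
def coxian(rates, continue_probs):
    """Return (alpha, S) for a Coxian distribution.

    rates[i] is the rate of leaving phase i; continue_probs[i] is the
    probability of moving from phase i to phase i + 1 rather than being absorbed.
    """
    n = len(rates)
    alpha = [1.0] + [0.0] * (n - 1)            # always start in the first phase
    S = [[0.0] * n for _ in range(n)]
    for i, rate in enumerate(rates):
        S[i][i] = -rate
        if i < n - 1:
            S[i][i + 1] = continue_probs[i] * rate
    return alpha, S

def coxian_mean(rates, continue_probs):
    """Mean duration: sum over phases of P(phase is reached) * mean time in phase."""
    reach, mean = 1.0, 0.0
    for i, rate in enumerate(rates):
        mean += reach / rate
        if i < len(rates) - 1:
            reach *= continue_probs[i]
    return mean

# Hypothetical three-phase length-of-stay model, rates per day:
# a short first phase, a medium second phase, and a long third phase.
rates = [1.0, 0.2, 0.05]
continue_probs = [0.6, 0.25]

alpha, S = coxian(rates, continue_probs)
mean_stay = coxian_mean(rates, continue_probs)
exit_rates = [-sum(row) for row in S]          # rate of absorption from each phase

for row in S:
    print(row)
print("exit rates:", exit_rates)
print(f"mean stay: {mean_stay:.2f} days")
```

With these made-up numbers the mean works out to about 7 days (1 + 0.6 × 5 + 0.6 × 0.25 × 20). The matrix S is upper bidiagonal, which is what keeps the number of parameters in a Coxian model small.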
Advantages and disadvantages
Some of the advantages of phase-type distributions include:
- Versatility: PH distributions can model a wide variety of distributions.
- Ease of use: PH distributions are relatively easy to understand and implement.
- Flexibility: they can be used to model both continuous distributions and discrete distributions. Discrete distributions are modeled by discrete phase-type distributions.
- Ease of manipulation: their distribution functions have closed-form matrix expressions.
However, a considerable drawback in using phase-type distributions is the underlying assumption that the time spent in each phase follows an exponential distribution. In reality, many random variables do not follow an exponential distribution; for example, the Weibull distribution is often a better model for electronic component failure times. Other disadvantages are that phase-type distributions can be computationally expensive to evaluate and difficult to fit to data.
[1] Bladt, M. (2005, May). A review on phase-type distributions and their use in risk theory. ASTIN Bulletin, 35(1), 145-161.
[2] O’Cinneide, C. (2017). Phase-type distributions and invariant polytopes. Retrieved April 8, 2021 from: https://higherlogicdownload.s3.amazonaws.com/INFORMS/50069173-2ea8-479b-baa7-2071ed7cd5dc/UploadedImages/2017-Marcel_Neuts_Lecture_.pdf
[3] Osogami, T. (2005). Definition of a PH Distribution. Retrieved April 8, 2021 from: http://www.cs.cmu.edu/~osogami/thesis/html/node40.html
[4] Commault, C. & Mocanu, S. (2003). Phase-type distributions and representations: Some results and open problems for system theory. International Journal of Control, 76(6), 566-580. doi: 10.1080/0020717031000114986
[5] Marshall, A. & McClean, S. (2004). Using Coxian phase-type distributions to identify patient characteristics for duration of stay in hospital. Health Care Management Science, 7(4), 285-289. doi: 10.1007/s10729-004-7537-z
[6] Komarkova, Z. (2012). Phase-Type Approximation Techniques. Retrieved April 8, 2021 from: https://is.muni.cz/th/ysfsq/thesis.pdf