# Phase-Type distribution (PH Distribution)


In certain types of Markov jump processes, a phase-type distribution, also called a PH distribution (not to be confused with the “pH” distribution used in biology), represents the distribution of absorption times or hitting times [1, 2].

A phase-type distribution can be described as the distribution of the time until absorption into a single absorbing state (labelled 0) of a Markov chain with finitely many states [3]. A Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Informally, this may be thought of as, “What happens next depends only on the state of affairs now.” One example is predictive text in a search engine, where the engine tries to predict what you are searching for before you fully type out your query.

In their simplest form, PH distributions arise as convolutions and mixtures of exponential distributions.

• A convolution combines two probability distributions to create a new probability distribution, which describes the probability of getting a particular value when you add together two independent random variables, one drawn from each of the original distributions.
• A mixture distribution is made up of two or more other distributions. The new distribution describes the probability of getting a particular value when you randomly select one of the original distributions and then get a value from that distribution.
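The two constructions above can be illustrated with a short simulation. This is a minimal sketch using only Python's standard library; the rates (1 and 2), the mixture weights (0.3 and 0.7), and the function names are illustrative assumptions, not values taken from the references.

```python
import random

def sample_convolution(rates, rng):
    """Sum of independent exponential stages, one per rate."""
    return sum(rng.expovariate(rate) for rate in rates)

def sample_mixture(weights, rates, rng):
    """Pick one exponential component at random, then draw from it."""
    rate = rng.choices(rates, weights=weights, k=1)[0]
    return rng.expovariate(rate)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000

conv_mean = sum(sample_convolution([1.0, 2.0], rng) for _ in range(n)) / n
mix_mean = sum(sample_mixture([0.3, 0.7], [1.0, 2.0], rng) for _ in range(n)) / n

# Theoretical means: 1/1 + 1/2 = 1.5 for the convolution,
# 0.3/1 + 0.7/2 = 0.65 for the mixture.
print(f"convolution mean ~ {conv_mean:.3f}")
print(f"mixture mean     ~ {mix_mean:.3f}")
```

The convolution passes through every exponential stage in turn, while the mixture passes through exactly one stage chosen at random; both are special cases of a phase-type distribution.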

These distributions can be broken down into a Markov chain consisting of a set of states and a transition matrix, through which matrix-based computer algorithms allow for very efficient evaluation [4].

## Properties of a phase-type distribution

One of the fundamental assumptions in continuous-time Markov chain analysis is that the waiting time in each state follows an exponential distribution. This is why a PH distribution is built from exponentially distributed phases.

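The time X until the process reaches the absorbing state is phase-type distributed, denoted as PH(α,S), where α is the vector of initial probabilities over the transient phases and S is the matrix of transition rates among those phases.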
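One way to see what PH(α,S) means is to simulate the underlying jump process and record when it is absorbed. The sketch below assumes that α is given as a list of initial probabilities and S as a list of rows whose diagonal entries are negative; the function name `sample_ph` and the two-phase example are hypothetical and chosen only for illustration.

```python
import random

def sample_ph(alpha, S, rng):
    """Draw one absorption time from PH(alpha, S) by simulating the jump process."""
    m = len(alpha)
    # Choose the starting phase; mass missing from alpha means immediate absorption.
    u, acc, state = rng.random(), 0.0, None
    for i, a in enumerate(alpha):
        acc += a
        if u < acc:
            state = i
            break
    t = 0.0
    while state is not None:
        rate_out = -S[state][state]        # total rate of leaving this phase
        t += rng.expovariate(rate_out)     # exponential holding time
        # Jump to phase j with probability S[state][j] / rate_out;
        # the leftover probability corresponds to absorption.
        u, acc, nxt = rng.random() * rate_out, 0.0, None
        for j in range(m):
            if j != state:
                acc += S[state][j]
                if u < acc:
                    nxt = j
                    break
        state = nxt
    return t

rng = random.Random(1)
alpha = [1.0, 0.0]
S = [[-3.0, 2.0],
     [0.0, -1.0]]
n = 50_000
mean = sum(sample_ph(alpha, S, rng) for _ in range(n)) / n
print(f"simulated mean ~ {mean:.3f}")  # the theoretical mean for this example is 1.0
```

In this example the process starts in the first phase, stays there for an average of 1/3 time units, and then either moves to the second phase (probability 2/3) or is absorbed directly (probability 1/3).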

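The cumulative distribution function (CDF) of X is given by

F(x) = 1 − α exp(Sx) 1,

and its probability density function (PDF) by

f(x) = α exp(Sx) S0,

for all x > 0, where 1 is a column vector of ones, S0 = −S1 is the column vector of exit rates from each phase into the absorbing state, and exp(·) is the matrix exponential. The matrix exponential is a function that takes a square matrix A as input and returns a matrix of the same size, defined by the power series exp(A) = I + A + A²/2! + A³/3! + … . In general it is not the same as taking the exponential of each element of the input matrix.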
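The formulas for F(x) and f(x) can be evaluated numerically. The following is a minimal sketch in pure Python that approximates the matrix exponential with scaling and squaring plus a truncated power series; the helper names (`mat_mul`, `mat_exp`, `ph_cdf`, `ph_pdf`) and the Erlang example are assumptions made for illustration. In practice a tested library routine such as `scipy.linalg.expm` would normally be preferred.

```python
import math

def mat_mul(A, B):
    """Plain matrix product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_exp(A, terms=20):
    """Approximate exp(A): scale A down, sum the power series, square back up."""
    n = len(A)
    norm = max(sum(abs(v) for v in row) for row in A)
    s = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[v / 2**s for v in row] for row in A]
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]  # A^k / k!
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(s):
        result = mat_mul(result, result)
    return result

def ph_cdf(x, alpha, S):
    """F(x) = 1 - alpha exp(Sx) 1."""
    E = mat_exp([[v * x for v in row] for row in S])
    return 1.0 - sum(a * sum(row) for a, row in zip(alpha, E))

def ph_pdf(x, alpha, S):
    """f(x) = alpha exp(Sx) s0, with exit-rate vector s0 = -S 1."""
    E = mat_exp([[v * x for v in row] for row in S])
    s0 = [-sum(row) for row in S]
    return sum(a * sum(e * r for e, r in zip(row, s0)) for a, row in zip(alpha, E))

# Example: two exponential phases in series, both with rate 2 (an Erlang distribution).
alpha = [1.0, 0.0]
S = [[-2.0, 2.0],
     [0.0, -2.0]]
print(ph_cdf(1.0, alpha, S), 1 - math.exp(-2.0) * 3.0)  # closed form: 1 - e^{-2x}(1 + 2x)
print(ph_pdf(1.0, alpha, S), 4.0 * math.exp(-2.0))      # closed form: 4x e^{-2x}
```

Note that the lower-left entry of exp(Sx) is 0 in this example, whereas exponentiating the matrix entry by entry would give exp(0) = 1 there.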

## Applications of the phase-type distribution

Phase-type distributions are used to model a wide variety of real-world phenomena, because they are relatively easy to understand and implement and can take many different shapes.

Phase-type distributions can model positive random variables, primarily random times such as processing times, repair times, or time to failure in manufacturing systems [4]. The phases in the distribution can represent different states of the machine, such as “working,” “broken,” and “being repaired.”

However, their usefulness extends beyond manufacturing. For instance, Coxian phase-type distributions, a sub-type of Markov model that describes the duration until an event occurs in terms of a sequence of latent phases, have been used in healthcare settings [5]. Here the phases can represent different stages of the recovery process, such as “being admitted to the hospital,” “undergoing treatment,” and “being discharged from the hospital.” Other uses include distances between DNA mutations and service times for agents in queueing systems. In the queueing case, the phases can represent different stages of the customer service process, such as “greeting the customer,” “taking the customer’s information,” and “solving the customer’s problem.”

Some of the advantages of a phase-type distribution include:

• Versatility: PH distributions can model a wide variety of distributions.
• Ease of use: compared to other methods, PH distributions are easy to understand and implement.
• Flexibility: They can be used to model both continuous distributions and discrete distributions. Discrete distributions are modeled by a discrete phase-type distribution.
• Ease of manipulation, because they are closed form.

However, a considerable drawback in using phase-type distributions is the underlying assumption of exponentially distributed phase durations. In reality, many random variables do not follow an exponential distribution; for example, the Weibull distribution is often a better model for electronic component failure times [6]. Other disadvantages are that phase-type distributions can be computationally expensive to evaluate and difficult to fit to data.

## Discrete phase-type distribution

A discrete phase-type distribution is a general class of distributions that models the number of discrete time steps until an event occurs. The probability of moving from one state to another depends only on the current state and not on past events. These distributions are often used to approximate continuous distributions; one advantage of using discrete, rather than continuous, phase-type distributions is that a lower coefficient of variation can be obtained with the same number of phases [7].
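The remark about the coefficient of variation can be made concrete with a standard result, stated here as background rather than quoted from [7]: among continuous phase-type distributions with m phases, the Erlang distribution is the least variable, whereas a discrete phase-type distribution with m phases can represent a deterministic time of exactly m steps. The symbols cv, X, and N below are notation introduced only for this note.

```latex
\mathrm{cv}^2(X) = \frac{\operatorname{Var}(X)}{\mathbb{E}[X]^2},
\qquad
\mathrm{cv}^2(X) \ge \frac{1}{m} \quad \text{for continuous PH with } m \text{ phases},
\qquad
\mathrm{cv}^2(N) = 0 \quad \text{for the discrete PH with } N \equiv m .
```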

## Phase-type vs discrete phase-type distribution

A (continuous) phase-type (PH) distribution models the time it takes for a system to reach an absorbing state. The system is in one of a finite number of states and moves between these states at specific rates in continuous time. It is widely used for complex systems found in areas such as reliability analysis and queueing theory. A discrete phase-type distribution, on the other hand, counts the number of time steps until absorption; the system moves between states at each step according to transition probabilities rather than rates. In both cases, the “phases” of the distribution are the transient states the system passes through before absorption.

The continuous distribution is typically represented by a matrix, called a transition rate matrix, that specifies the rates of movement between states; the discrete distribution uses a matrix of transition probabilities instead. The matrix can be used to calculate the probability of staying in a certain state or moving from one state to another, and the expected time spent moving from state to state. The matrix form of the phase-type distribution is simpler to use and understand, in particular when it comes to finding the moments of the time before reaching the absorbing state [8].
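As an illustration of how the matrix form yields moments, the standard expressions for a continuous PH(α,S) distribution are given below. Here S is assumed to be the block of rates among the transient phases only, and 1 is a column vector of ones; the one-phase check in the comment uses a hypothetical rate λ.

```latex
\mathbb{E}[X^n] = (-1)^n \, n! \, \boldsymbol{\alpha} \, S^{-n} \, \mathbf{1},
\qquad
\mathbb{E}[X] = -\boldsymbol{\alpha} \, S^{-1} \, \mathbf{1}.
% Check with a single phase: \alpha = (1), S = (-\lambda) gives
% E[X^n] = n! / \lambda^n, the moments of an exponential distribution.
```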

## Discrete phase-type distribution and Markov chains

A discrete phase-type distribution is built on a discrete-time Markov chain, where each transient state represents a phase and movement between phases is described by a set of transition probabilities. Unlike the continuous case, no transition rates are involved: the chain moves once per time step, so the number of steps spent in a phase before leaving it follows a geometric distribution, the discrete counterpart of the exponential distribution.

More specifically, a discrete phase–type distribution is the distribution of time to absorption in a finite discrete time Markov chain with transition matrix P of dimension m + 1 [9].

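The Markov chain has m transient states and 1 absorbing state. The initial probability vector is (α, αₘ₊₁), where α covers the m transient states and αₘ₊₁ is the probability of starting in the absorbing state. The pair (α,S), with S the m × m block of P that holds the transition probabilities among the transient states, is a representation of the phase-type distribution.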

Any distribution that can be described as the time to absorption of a discrete-time Markov chain on a finite state space, with substochastic transition matrix P and initial distribution α, is a discrete phase-type distribution [10]. A substochastic matrix is a square matrix with nonnegative entries where every row adds up to at most 1.
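The definition above translates directly into a formula for the probability mass function: the chain must stay among the transient phases for n − 1 steps and then be absorbed on step n. The sketch below assumes P is the substochastic matrix over the transient phases, given as a list of rows; the function name `dph_pmf` and both examples are illustrative assumptions.

```python
def dph_pmf(n, alpha, P):
    """P(N = n) for n >= 1: n - 1 steps among transient phases, then absorption."""
    exit_prob = [1.0 - sum(row) for row in P]  # per-phase probability of absorbing
    v = list(alpha)                            # distribution over phases
    for _ in range(n - 1):
        v = [sum(v[i] * P[i][j] for i in range(len(v))) for j in range(len(v))]
    return sum(vi * ei for vi, ei in zip(v, exit_prob))

# One phase that is left with probability 0.25 per step: a geometric distribution.
geometric = [dph_pmf(n, [1.0], [[0.75]]) for n in range(1, 4)]
print(geometric)  # 0.25, 0.75 * 0.25, 0.75**2 * 0.25

# Two phases visited in order with certainty: absorption always happens at step 2.
P_det = [[0.0, 1.0],
         [0.0, 0.0]]
print(dph_pmf(1, [1.0, 0.0], P_det), dph_pmf(2, [1.0, 0.0], P_det))
```

The second example shows that a discrete phase-type distribution can represent a deterministic time, something a continuous phase-type distribution with finitely many phases can only approximate.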

## Uses

A phase-type distribution is well suited to approximating general distributions: any positive-valued discrete or continuous distribution can, theoretically, be approximated by a phase-type distribution to arbitrary precision, and the resulting approximation is often analytically tractable.

The phase-type distribution is flexible in that it can approximate a wide range of probability distributions with varying numbers of phases, making it a useful tool in modeling complex systems. It can also be easily combined with other distributional models, such as the exponential distribution or gamma distribution, to provide even greater flexibility in modeling.

Discrete phase-type distributions have applications in fields such as insurance mathematics, queueing theory, population genetics, reliability analysis, and machine learning. They are useful in modeling systems that have multiple possible states and can be used to predict various outcomes, such as the time it takes for a machine to fail or the time it takes for a customer to be served in a queue.

## References

[1] Hakan Lorens Samir Younes (2005). Verification and Planning for Stochastic Processes with Asynchronous Events.

[2] Oregon State University. PhaseTypeR.

[3] Nielsen, B. Lecture notes on phase–type distributions for 02407 Stochastic Processes. October 2022. Retrieved May 7, 2023 from: http://www2.imm.dtu.dk/courses/02407/lectnotes/ftf.pdf

[4] Bean, N. & Nielsen, B. Decay rates of discrete phase-type distributions with infinitely-many phases. Matrix-Analytic Methods: Theory and Applications. Proceedings of the Fourth International Conference, Adelaide, Australia, 14-16 July 2002.

[5] Bladt, M. (2005). Review on Phase-Type Distributions and Their Use in Risk Theory. ASTIN Bulletin 35(1), 145-161.