
The **threshold parameter** defines the minimum value that is theoretically possible for data from a probability distribution. We can either discard all information below the threshold, which would eliminate outliers, or choose to keep the information if outliers are of interest.

The threshold parameter is sometimes called the *location parameter*, but it is not the same as the location parameter used in probability distributions to denote the mean.

## Threshold parameter examples

The threshold sometimes appears in common probability distributions, including:

- Beta distribution
- Exponential distribution
- Gamma distribution
- Lognormal distribution
- Weibull distribution

Depending on which distribution you’re working with, you might see the threshold denoted as µ, θ, or γ.

Suppose we have an exponential distribution with f(x) = e^{−x} for x ≥ 0 and f(x) = 0 for x < 0, and we want to convert it to an exponential location family. To do this, we replace *x* with *x* – *µ*:

f(x | µ) = e^{−(x − µ)}, x ≥ µ, and f(x | µ) = 0, x < µ.

The threshold here, denoted µ, is a lower bound on the range of *x* [1].
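The shifted exponential density, e^{−(x − µ)} for x ≥ µ and 0 otherwise, can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names and the threshold value µ = 2 are assumptions made for the example, and the data are simulated.

```python
import math
import random


def shifted_exp_pdf(x, mu=0.0):
    """Density of the exponential location family: e^{-(x - mu)} for x >= mu, else 0."""
    if x < mu:
        return 0.0
    return math.exp(-(x - mu))


def sample_shifted_exp(n, mu, seed=0):
    """Draw n values by adding the threshold mu to standard exponential draws."""
    rng = random.Random(seed)
    return [mu + rng.expovariate(1.0) for _ in range(n)]


mu = 2.0  # hypothetical threshold chosen for illustration
data = sample_shifted_exp(1000, mu)

print(shifted_exp_pdf(1.5, mu))   # below the threshold: density is 0.0
print(shifted_exp_pdf(2.0, mu))   # at the threshold: density is 1.0
print(min(data) >= mu)            # True: no draw falls below the threshold
print(sum(data) / len(data))      # sample mean is close to mu + 1, not mu
```

The last line shows why the threshold should not be confused with the mean: shifting a unit exponential by µ gives a mean of µ + 1, while µ itself only marks where the distribution starts.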

## Drawbacks with using a threshold parameter

It can be challenging to choose an appropriate threshold parameter because traditional methods, such as referring to a mean excess plot or choosing a certain percentile, do not guarantee that an appropriate threshold is selected. A poor choice can lead to model bias or violate the independence condition of the excesses.
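To make the mean excess idea concrete, the sketch below computes the empirical mean excess (the average amount by which observations above a candidate threshold u exceed u) at a few sample percentiles. The data are simulated from a unit exponential purely for illustration, and the function name and the choice of percentiles are assumptions, not a recommended procedure.

```python
import random


def mean_excess(data, u):
    """Average amount by which observations above u exceed u (None if there are none)."""
    excesses = [x - u for x in data if x > u]
    if not excesses:
        return None
    return sum(excesses) / len(excesses)


rng = random.Random(1)
data = [rng.expovariate(1.0) for _ in range(5000)]  # simulated data, illustration only

# Candidate thresholds taken at a few sample percentiles (an arbitrary choice).
ordered = sorted(data)
for pct in (50, 75, 90, 95):
    u = ordered[int(len(ordered) * pct / 100)]
    n_above = sum(1 for x in data if x > u)
    print(f"{pct}th percentile  u={u:.3f}  exceedances={n_above}  "
          f"mean excess={mean_excess(data, u):.3f}")
```

For exponential data the mean excess stays near 1 at every threshold, but the number of exceedances drops quickly as u rises, so the estimate becomes noisier. Reading a threshold off such a plot therefore involves judgment, which is the difficulty described above.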

In some cases, setting a threshold parameter is not a good choice at all. For example, if we are interested in the distribution of very rare events, then we wouldn’t want to discard all data points below a certain threshold [2].

Including a threshold parameter can also create problems with analysis. For example, in the three-parameter lognormal distribution, the maximum likelihood estimator of the threshold parameter γ is the smallest order statistic, which is always greater than γ, and including this third parameter may create trouble in estimating the scale parameter [3]. The threshold parameter can artificially inflate the scale parameter (the standard deviation of the distribution on the logarithmic scale) because γ removes the smallest data points from the distribution, which are often the most variable.
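The sensitivity of the scale estimate to the threshold can be seen in a small simulation. If γ is treated as known, log(x − γ) is normally distributed, so the maximum likelihood estimates of the log-scale mean and standard deviation are the sample mean and standard deviation of log(x − γ). The sketch below, with a hypothetical function name and made-up simulation settings, moves a candidate γ toward the smallest observation and prints the resulting estimates.

```python
import math
import random


def lognormal_scale_given_threshold(data, gamma):
    """Return (mu_hat, sigma_hat): mean and standard deviation of log(x - gamma).

    These are the maximum likelihood estimates of the log-scale parameters
    when gamma is treated as known; gamma must be below min(data).
    """
    logs = [math.log(x - gamma) for x in data]
    n = len(logs)
    mu_hat = sum(logs) / n
    sigma_hat = math.sqrt(sum((v - mu_hat) ** 2 for v in logs) / n)
    return mu_hat, sigma_hat


rng = random.Random(2)
true_gamma = 10.0  # hypothetical values for the simulation
data = [true_gamma + rng.lognormvariate(0.0, 0.5) for _ in range(200)]
x_min = min(data)

# Move the candidate threshold closer and closer to the smallest observation.
for gap in (1e-1, 1e-3, 1e-6, 1e-9):
    mu_hat, sigma_hat = lognormal_scale_given_threshold(data, x_min - gap)
    print(f"gamma = x_min - {gap:g}: mu_hat={mu_hat:.3f}, sigma_hat={sigma_hat:.3f}")
```

As the candidate γ approaches the smallest observation, log(x_min − γ) heads toward negative infinity and the estimated standard deviation keeps growing, which illustrates how using the smallest order statistic as the threshold estimate can distort the scale estimate.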

## References

[1] Common Families of Distributions

[2] Behrens, C. et al. (2003). Bayesian Analysis of Extreme Events with Threshold Estimation. Retrieved August 18, 2023 from: https://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=19B83FFAE1F4841926EBC771FB0C8C39?doi=10.1.1.71.3689&rep=rep1&type=pdf

[3] Aristizabal, Rodrigo J. (2012). Estimating the Parameters of the Three-Parameter Lognormal Distribution. FIU Electronic Theses and Dissertations. 575. https://digitalcommons.fiu.edu/etd/575