A **canonical statistic** (sometimes called a *natural statistic*) is the function of the data that multiplies the natural parameter in the exponent of an exponential family density. All exponential families of distributions over x have the general form (Creager, 2018):

**p(x | η) = h(x) g(η) exp{η^{T} u(x)}**

Where:

- u(x) is the canonical (natural) statistic, which is a function of x,
- η is the natural parameter,
- h(x) is the base measure, which is often constant,
- g(η) is the normalizer, which makes the distribution sum (or integrate) to 1.

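The canonical statistic is a sufficient statistic for η. It is also minimal sufficient when the representation is minimal, meaning that the components of u(x) (and of η) satisfy no affine dependence; this is the usual case.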
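The general form can be checked numerically. The following is a minimal sketch, not taken from the cited sources: it assumes a scalar η and uses the Poisson distribution as an illustrative family, with h(x) = 1/x!, η = log λ, u(x) = x and g(η) = exp(−e^{η}). The helper names `exp_family_pmf` and `poisson_pmf` are hypothetical.

```python
import math

def exp_family_pmf(x, eta, h, g, u):
    """Evaluate p(x | eta) = h(x) * g(eta) * exp(eta * u(x)) for scalar eta."""
    return h(x) * g(eta) * math.exp(eta * u(x))

# Poisson ingredients (an illustrative choice, assumed for this sketch)
def h(x):
    return 1.0 / math.factorial(x)      # base measure

def g(eta):
    return math.exp(-math.exp(eta))     # normalizer

def u(x):
    return x                            # canonical statistic

def poisson_pmf(x, lam):
    """Textbook Poisson pmf, used only as a reference value."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

lam = 2.5
eta = math.log(lam)                     # natural parameter

# Largest gap between the two ways of computing the same probability
max_abs_diff = max(
    abs(exp_family_pmf(x, eta, h, g, u) - poisson_pmf(x, lam))
    for x in range(20)
)
print(max_abs_diff)
```

The two computations should agree up to floating-point rounding, since the exponential family form is only a rearrangement of the usual Poisson formula.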

**Example:**

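A sequence of n independent Bernoulli trials with success probability π gives the following probability function for the outcome sequence y = (y_{1}, …, y_{n}) (Sundberg, 2019):

**p(y | π) = π^{Σ y_{i}} (1 − π)^{n − Σ y_{i}} = (1 − π)^{n} exp{log(π / (1 − π)) Σ y_{i}}**

The canonical statistic here is u(y) = Σ y_{i} (Σ is summation notation, which means to “add them up”), and the natural parameter is η = log(π / (1 − π)).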
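The Bernoulli example can be verified with a short script. This is a sketch under stated assumptions: the trials are independent with a common success probability `pi`, the natural parameter is taken to be the log-odds log(π / (1 − π)), and h(y) = 1. The function names are hypothetical. It compares the direct product of Bernoulli probabilities with the exponential family form, and shows that two sequences with the same sum have the same probability.

```python
import math
from itertools import product

def direct_prob(y, pi):
    """Probability of the 0/1 sequence y as a product of Bernoulli terms."""
    p = 1.0
    for yi in y:
        p *= pi if yi == 1 else (1.0 - pi)
    return p

def exp_family_prob(y, pi):
    """Same probability written as g(eta) * exp(eta * u(y)), with h(y) = 1."""
    n = len(y)
    eta = math.log(pi / (1.0 - pi))       # natural parameter (log-odds)
    u = sum(y)                            # canonical statistic: number of successes
    g = (1.0 + math.exp(eta)) ** (-n)     # normalizer, equal to (1 - pi)^n
    return g * math.exp(eta * u)

pi = 0.3
y_a = (1, 0, 0, 1, 0)
y_b = (0, 1, 1, 0, 0)   # different order, same sum as y_a

# The two forms agree on every possible sequence of length 5
same_form = all(
    math.isclose(direct_prob(y, pi), exp_family_prob(y, pi))
    for y in product((0, 1), repeat=5)
)

# The probability depends on y only through the sum of its entries
same_sum_same_prob = math.isclose(direct_prob(y_a, pi), direct_prob(y_b, pi))

print(same_form, same_sum_same_prob)
```

That the probability depends on the data only through Σ y_{i} is what makes the sum a sufficient statistic in this example.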

## Non-Uniqueness of a Canonical Statistic

In mathematics, the word “canonical” usually indicates a single standard choice picked out from several possible conventions. However, a canonical parameter and a canonical statistic are not unique (Geyer, 2020):

- Any one-to-one affine function of a canonical parameter is another canonical parameter, and any one-to-one affine function of a canonical statistic is another canonical statistic. However, transforming one of them also changes the other, as well as the cumulant function (the log-normalizer, −log g(η) in the notation above).
- Any scalar-valued affine function of the canonical parameter can be added to the cumulant function. This also changes the canonical statistic.
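The effect of an affine change of statistic can be illustrated numerically. The sketch below is an illustration under assumptions, not Geyer's own derivation: it uses n Bernoulli trials with cumulant function c(η) = n log(1 + e^{η}), so that a sequence with u successes has probability exp{η u − c(η)}. It then replaces the statistic u by a·u + b (the constants `a` and `b` are arbitrary), which forces the parameter to become η / a and the cumulant function to become c(a η′) + b η′. The probabilities themselves do not change.

```python
import math

n = 5  # number of Bernoulli trials (assumed for this sketch)

def cumulant(eta):
    """Cumulant function c(eta) = n * log(1 + e^eta)."""
    return n * math.log1p(math.exp(eta))

def prob_original(u, eta):
    """Probability of one sequence with u successes: exp(eta * u - c(eta))."""
    return math.exp(eta * u - cumulant(eta))

a, b = 2.0, 3.0  # arbitrary one-to-one affine map u -> a * u + b, with a != 0

def prob_transformed(u, eta):
    """Same probability using the transformed statistic and parameter."""
    u_new = a * u + b                              # new canonical statistic
    eta_new = eta / a                              # new canonical parameter
    c_new = cumulant(a * eta_new) + b * eta_new    # new cumulant function
    return math.exp(eta_new * u_new - c_new)

eta = math.log(0.3 / 0.7)

# Largest gap between the two parameterizations over all possible sums
max_gap = max(
    abs(prob_original(u, eta) - prob_transformed(u, eta))
    for u in range(n + 1)
)
print(max_gap)
```

Both parameterizations describe the same family of distributions, which is why neither statistic can claim to be the unique canonical one.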

Although many choices are possible, in practice one of them is fixed: “the” canonical statistic is simply the statistic that results from that choice.


## References

Creager, E. (2018). Introduction to Advanced Probability for Graphical Models. Retrieved January 15, 2021 from: http://www.cs.toronto.edu/~jessebett/CSC412/content/week1/tutorial1-probability-412–ec-edit.pdf

Geyer, C. (2020). Stat 5421 Lecture Notes. Exponential Families, Part I.

Sundberg, R. (2019). Statistical Modelling by Exponential Families. Cambridge University Press.