< List of probability distributions < Hard distribution & Easy distribution

## What is a hard distribution?

A “**hard distribution**” is so-called because it can be difficult to identify the distribution, or work with components or data points. The opposite is an “**easy distribution**.”

For example, computer models can be classified as “hard” or “harder” depending on their complexity and how well they model data. If model B is harder than C, that means the best algorithm for B does worse than the best algorithm for C.

## Hard distribution and hardness examples

Hardness is a measure of either sampling complexity or the complexity of computing a distribution function:

- Ting et al. define a hard distribution as one that is hard for a DBSCAN density-based clustering algorithm to identify all the clusters in the distribution [1].
- Rochetto et al. [2] describe hard distributions as those that are conjectured to be difficult to sample without the use of quantum computing.
- Markov Chain Monte Carlo (MCMC) can sample from a hard distribution that can’t be explicitly written out; Often we can’t compute simulations for very large distribution because a Markov Chain has too many states; it could take a computer eons to calculate some of these problems. The workaround is to simulate the Markov Chain for many iterations until a “good” state solution is found [3].

## References

- Ting et. al. August 2016. Overcoming Key Weaknesses of Distance-based Neighbourhood Methods using a Data Dependent Dissimilarity Measure. Conference: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16). DOI: 10.1145/2939672.2939779
- Rocchetto, A., Grant, E., Strelchuk, S. et al. Learning hard quantum distributions with variational autoencoders. npj Quantum Inf 4, 28 (2018). https://doi.org/10.1038/s41534-018-0077-z
- Tsum A. Probability & Statistics with Applications to Computing. Chapter 9: Applications to Computing. 9.6: Markov Chain Monte Carlo (MCMC).