< List of probability distributions < Hard distribution & Easy distribution
What is a hard distribution?
A “hard distribution” is so-called because it can be difficult to identify the distribution, or work with components or data points. The opposite is an “easy distribution.”
For example, computer models can be classified as “hard” or “harder” depending on their complexity and how well they model data. If model B is harder than C, that means the best algorithm for B does worse than the best algorithm for C.
Hard distribution and hardness examples
Hardness is a measure of either sampling complexity or the complexity of computing a distribution function:
- Ting et al. define a hard distribution as one that is hard for a DBSCAN density-based clustering algorithm to identify all the clusters in the distribution [1].
- Rochetto et al. [2] describe hard distributions as those that are conjectured to be difficult to sample without the use of quantum computing.
- Markov Chain Monte Carlo (MCMC) can sample from a hard distribution that can’t be explicitly written out; Often we can’t compute simulations for very large distribution because a Markov Chain has too many states; it could take a computer eons to calculate some of these problems. The workaround is to simulate the Markov Chain for many iterations until a “good” state solution is found [3].
References
- Ting et. al. August 2016. Overcoming Key Weaknesses of Distance-based Neighbourhood Methods using a Data Dependent Dissimilarity Measure. Conference: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16). DOI: 10.1145/2939672.2939779
- Rocchetto, A., Grant, E., Strelchuk, S. et al. Learning hard quantum distributions with variational autoencoders. npj Quantum Inf 4, 28 (2018). https://doi.org/10.1038/s41534-018-0077-z
- Tsum A. Probability & Statistics with Applications to Computing. Chapter 9: Applications to Computing. 9.6: Markov Chain Monte Carlo (MCMC).