Statistics How To

IID Statistics: Independent and Identically Distributed Definition and Examples

IID Statistics and Random Sampling

In statistics, we commonly deal with random samples. A random sample can be thought of as a set of objects that are chosen randomly. Or, more formally, it’s “a sequence of independent, identically distributed (IID) random variables.”

In other words, the terms random sample and IID are basically one and the same. In statistics, we usually say “random sample,” but in probability it’s more common to say “IID.”

  • Identically distributed means that there are no overall trends: the distribution doesn’t fluctuate and all items in the sample are taken from the same probability distribution.
  • Independent means that the sample items are all independent events. In other words, knowing the value of one item tells you nothing about the value of any other.
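
The two conditions above can be illustrated with a small simulation. This is only a sketch under assumed settings: the normal distribution, the drift `slope`, the sample size and all function names are illustrative choices, not part of the definition. One sample is IID, one has a trend (so it is not identically distributed), and one is a random walk (so its items are not independent).

```python
import random
import statistics


def iid_sample(n, rng):
    """n independent draws from the same Normal(0, 1) distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def trending_sample(n, rng, slope=0.01):
    """Independent draws whose mean drifts upward: not identically distributed."""
    return [rng.gauss(slope * i, 1.0) for i in range(n)]


def random_walk(n, rng):
    """Each value is built on the previous one: the items are not independent."""
    values, current = [], 0.0
    for _ in range(n):
        current += rng.gauss(0.0, 1.0)
        values.append(current)
    return values


def half_mean_shift(xs):
    """Mean of the second half minus mean of the first half (a crude trend check)."""
    mid = len(xs) // 2
    return statistics.fmean(xs[mid:]) - statistics.fmean(xs[:mid])


def lag1_correlation(xs):
    """Pearson correlation between consecutive items x[i] and x[i + 1]."""
    a, b = xs[:-1], xs[1:]
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000
iid = iid_sample(n, rng)
trend = trending_sample(n, rng)
walk = random_walk(n, rng)

print("IID sample:      mean shift", half_mean_shift(iid), "lag-1 corr", lag1_correlation(iid))
print("Trending sample: mean shift", half_mean_shift(trend))
print("Random walk:     lag-1 corr", lag1_correlation(walk))
```

For the IID sample both diagnostics should sit near zero. The trending sample shows a clear shift between halves, and the random walk shows consecutive values that are strongly correlated. These checks are informal and cannot prove that data is IID.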

What types of data meet these criteria? Most of the examples you’ll come across in elementary statistics are IID. John Mack’s explanation of IID statistics is clear and easy to grasp:

“A peculiarity of casino games is that they are structured to yield independent, identically-distributed (IID) outcomes. Each iteration of a game–spin of a roulette wheel, roll of dice or deal of shuffled cards–is independent of any other iteration. And the odds of any given result occurring are the same in any iteration. Classical statistics is based on equivalent IID data-generating processes: flipping coins, drawing colored balls from urns, etc.”
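
The idea in the quote can be checked informally with simulated dice. The sketch below assumes a fair six-sided die and uses hypothetical helper names. It compares the overall share of sixes with the share of sixes immediately after a six and immediately after a one. If rolls are independent and identically distributed, all three should be close to 1/6.

```python
import random
from collections import Counter


def roll_die(rng):
    """One roll of a fair six-sided die (an assumption of this sketch)."""
    return rng.randint(1, 6)


def conditional_six_rate(rolls, previous_face):
    """Share of sixes among rolls that immediately follow `previous_face`."""
    followers = [nxt for prev, nxt in zip(rolls, rolls[1:]) if prev == previous_face]
    return sum(1 for face in followers if face == 6) / len(followers)


rng = random.Random(42)  # fixed seed so the run is repeatable
rolls = [roll_die(rng) for _ in range(60_000)]

overall_six_rate = Counter(rolls)[6] / len(rolls)
six_after_six = conditional_six_rate(rolls, 6)
six_after_one = conditional_six_rate(rolls, 1)

print("Overall share of sixes:   ", overall_six_rate)
print("Share of sixes after a 6: ", six_after_six)
print("Share of sixes after a 1: ", six_after_one)
```

The previous roll gives no useful information about the next one, which is what independence means in practice.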

Technically Speaking

A more technical definition is that random variables $X_1, X_2, \ldots, X_n$ are IID if they all share the same probability distribution and are mutually independent. Sharing the same probability distribution means that if you plotted the values of all of the variables together, they would resemble a single distribution: a uniform distribution, a normal distribution or any one of the dozens of other distributions.

Each distribution has its own characteristics. Let’s say we are looking at a sample of n random variables, $X_1, X_2, \ldots, X_n$. Since they are IID, each variable $X_i$ has the same mean ($\mu$) and the same variance ($\sigma^2$). In equation form, that’s:
$E(X_i) = \mu$ ; $\mathrm{Var}(X_i) = \sigma^2$
for all i = 1, 2, …, n.
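
As a rough numerical check of the statement that every variable in an IID sample has the same mean and variance, the sketch below draws many samples of size n and estimates the mean and variance at each position i. The Uniform(0, 1) distribution (mean 1/2, variance 1/12), the sample size and the function names are assumptions made only for illustration.

```python
import random
import statistics

MU = 0.5           # mean of Uniform(0, 1)
SIGMA_SQ = 1 / 12  # variance of Uniform(0, 1)


def draw_sample(n, rng):
    """One IID sample of size n from Uniform(0, 1)."""
    return [rng.random() for _ in range(n)]


def per_position_moments(n, replications, rng):
    """Estimate (mean, variance) of the variable at each position i = 1..n."""
    samples = [draw_sample(n, rng) for _ in range(replications)]
    columns = zip(*samples)  # column i collects every observed value at position i
    return [(statistics.fmean(col), statistics.pvariance(col)) for col in columns]


rng = random.Random(1)  # fixed seed so the run is repeatable
moments = per_position_moments(n=5, replications=20_000, rng=rng)

for i, (mean_i, var_i) in enumerate(moments, start=1):
    print(f"position {i}: mean {mean_i:.4f} (theory {MU}), variance {var_i:.4f} (theory {SIGMA_SQ:.4f})")
```

Every position should give estimates close to the same two numbers, because each one is a draw from the same distribution.
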
Identically distributed does not mean that every outcome is equally likely. A single coin flip can be modeled by a Bernoulli distribution (a binomial distribution with one trial), and a fair coin has a 50% chance of heads (or tails). But let’s say the coin was weighted so that the probability of heads was 49.5% and tails was 50.5%. The flips are still IID, because every flip follows the same distribution and no flip affects another, even though the two outcomes do not have equal probabilities.
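
The weighted coin can be simulated as a quick sanity check. This is a minimal sketch: the 49.5% heads probability comes from the example above, while the number of flips, the seed and the function name are arbitrary choices. It compares the share of heads in the first and second half of the flips to show that the distribution does not change from flip to flip.

```python
import random

P_HEADS = 0.495  # weighted coin from the example: heads 49.5%, tails 50.5%


def flip(rng, p_heads=P_HEADS):
    """Return 1 for heads and 0 for tails."""
    return 1 if rng.random() < p_heads else 0


rng = random.Random(7)  # fixed seed so the run is repeatable
n_flips = 1_000_000
flips = [flip(rng) for _ in range(n_flips)]

half = n_flips // 2
heads_rate = sum(flips) / n_flips
first_half_rate = sum(flips[:half]) / half
second_half_rate = sum(flips[half:]) / half

print("Share of heads, all flips:  ", heads_rate)
print("Share of heads, first half: ", first_half_rate)
print("Share of heads, second half:", second_half_rate)
```

Both halves should land near 0.495 rather than 0.5. The outcomes are unequal in probability, but every flip is drawn from the same distribution.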

------------------------------------------------------------------------------

IID Statistics: Independent and Identically Distributed Definition and Examples was last modified: October 12th, 2017 by Stephanie Glen

4 thoughts on “IID Statistics: Independent and Identically Distributed Definition and Examples”

  1. Andale Post author

    Yes, it does. There was a colon missing (it should be $E(X_i) = \mu$ ; $\mathrm{Var}(X_i) = \sigma^2$). Thanks for pointing that out!

  2. Ron Haley

    I should have realized that is what you meant. However, I had spent hours viewing dozens of differing explanations for “Identically Distributed”. Would the ID condition then disallow applications (e.g., involving correlation matrices) that involve different measurement types on the same object (e.g., comparing electrical conductivity sample statistics to density sample statistics)?

    Thanks for your response.

  3. Andale Post author

    No, it wouldn’t disallow applications in general. I say “in general” because you could have a correlation matrix on variables that aren’t IID. Although correlation is going to have more meaning for IID variables, it’s very difficult to test that assumption (i.e. prove that the data is IID). I’d say that different measurement types do not matter, but if they come from vastly different distributions, describing the meaning of any correlation might be a problem.