Sampling with Replacement
Sampling with replacement means that you put each item back into the population before you choose the next one. In other words, you want to find the probability of some event where there’s a number of balls, cards or other objects, and you replace the item each time you choose one, so the same item can be chosen more than once.
Let’s say you had a population of 7 people, and you wanted to sample 2 of them by drawing names. With replacement, the possible samples include:
- John, John
- John, Jack
- John, Qui
- Jack, Qui
- Jack, Tina
- …and so on.
When you sample with replacement, your two items are independent. In other words, one does not affect the outcome of the other. You have a 1 out of 7 (1/7) chance of choosing any particular name first and a 1/7 chance of choosing any particular name second.
- P(John, John) = (1/7) * (1/7) = 1/49 ≈ .02.
- P(John, Jack) = (1/7) * (1/7) = 1/49 ≈ .02.
- P(John, Qui) = (1/7) * (1/7) = 1/49 ≈ .02.
- P(Jack, Qui) = (1/7) * (1/7) = 1/49 ≈ .02.
- P(Jack, Tina) = (1/7) * (1/7) = 1/49 ≈ .02.
Note that P(John, John) just means “the probability of choosing John’s name, and then John’s name again.” You can figure out these probabilities using the multiplication rule.
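The multiplication rule above can be checked with a short Python sketch. It is only an illustration: the names John, Jack, Qui and Tina come from the example, while `Person5`, `Person6` and `Person7` are placeholders for the names the example does not list.

```python
from fractions import Fraction
from itertools import product

# Four names come from the example; the last three are placeholders,
# because the example does not list all seven names.
names = ["John", "Jack", "Qui", "Tina", "Person5", "Person6", "Person7"]

# With replacement, every ordered pair is possible, including repeats
# such as (John, John).
pairs_with_replacement = list(product(names, repeat=2))

# The draws are independent, so multiply the single-draw probabilities.
p_single = Fraction(1, len(names))
p_pair_with = p_single * p_single

print(len(pairs_with_replacement))            # number of ordered pairs
print(p_pair_with, round(float(p_pair_with), 3))
```

Every ordered pair gets the same probability, and the probabilities over all the pairs add up to 1.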
But what happens if you don’t replace the first name before you choose the second? In other words, what happens if you sample without replacement?
Sampling Without Replacement
Sampling without replacement means that you don’t put the first item back before you choose a second, so the same item can’t be chosen twice. This changes the probability of each possible sample. Taking the above example, you would have the same list of names to choose two people from. Your list of results would be similar, except you couldn’t choose the same person twice:
- John, Jack
- John, Qui
- Jack, Qui
- Jack, Tina…
But now, your two items are dependent, or linked to each other. When you choose the first item, you have a 1/7 probability of picking any particular name. But then, assuming you don’t replace the name, you only have six names to pick from. That gives you a 1/6 chance of choosing any particular remaining name second. The probabilities become:
- P(John, Jack) = (1/7) * (1/6) = 1/42 ≈ .024.
- P(John, Qui) = (1/7) * (1/6) = 1/42 ≈ .024.
- P(Jack, Qui) = (1/7) * (1/6) = 1/42 ≈ .024.
- P(Jack, Tina) = (1/7) * (1/6) = 1/42 ≈ .024…
As you can probably figure out, I’ve only drawn two names here, so the probabilities only change a little (from about .02 to about .024). But larger samples taken from small populations can show a more dramatic difference.
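As a rough check on the without-replacement figures, the sketch below enumerates the ordered pairs in Python. As before, `Person5`, `Person6` and `Person7` are placeholder names that are not part of the original example.

```python
from fractions import Fraction
from itertools import permutations

# Four names come from the example; the last three are placeholders.
names = ["John", "Jack", "Qui", "Tina", "Person5", "Person6", "Person7"]

# Without replacement, a name cannot repeat, so (John, John) is excluded.
pairs_without_replacement = list(permutations(names, 2))

# First draw: 1 of 7 names. Second draw: 1 of the 6 names that remain.
p_first = Fraction(1, len(names))
p_second = Fraction(1, len(names) - 1)
p_pair_without = p_first * p_second

print(len(pairs_without_replacement))               # number of ordered pairs
print(p_pair_without, round(float(p_pair_without), 3))
```

There are fewer possible pairs than with replacement, so each remaining pair is slightly more likely.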
You can tell how dramatic these results are by calculating the covariance between the two draws. That’s a measure of how strongly the two draws are linked together; the further the covariance is from zero, the more dramatic the results (when you sample without replacement, the covariance between two draws is negative). A covariance of zero would mean there’s no difference between sampling with replacement and sampling without.
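To make the covariance idea concrete, here is a minimal Python sketch. It assumes a hypothetical numeric population (the values 1 to 7 stand in for some measurement on each of the seven people, since names alone have no covariance), and the helper `covariance_of_draws` is introduced purely for illustration.

```python
from fractions import Fraction
from itertools import permutations, product

# Hypothetical numeric population: one made-up value per person.
population = [1, 2, 3, 4, 5, 6, 7]

def covariance_of_draws(pairs):
    """Exact covariance between the first and second draw,
    treating every pair in `pairs` as equally likely."""
    pairs = list(pairs)
    n = len(pairs)
    mean_first = Fraction(sum(x for x, _ in pairs), n)
    mean_second = Fraction(sum(y for _, y in pairs), n)
    total = sum((x - mean_first) * (y - mean_second) for x, y in pairs)
    return total / n

# With replacement: all ordered pairs, repeats allowed.
cov_with = covariance_of_draws(product(population, repeat=2))

# Without replacement: ordered pairs of two different items.
cov_without = covariance_of_draws(permutations(population, 2))

print(cov_with, cov_without)
```

Under these assumptions the covariance is zero with replacement and negative without replacement: drawing a high value first leaves slightly lower values behind for the second draw.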
Stephanie Glen. "Sampling With Replacement / Sampling Without Replacement" From StatisticsHowTo.com: Elementary Statistics for the rest of us! https://www.statisticshowto.com/sampling-with-replacement-without/