Sampling > Sampling with replacement / Sampling without replacement
What is sampling with and without replacement?
Sampling without replacement is where items are chosen randomly, and once an observation is chosen it cannot be chosen again. On the other hand, when you sample with replacement, you also choose randomly but an item can be chosen more than once.
Contents (click to skip to that section):
Sampling with Replacement
Sampling with replacement is used to find probability with replacement. In other words, you want to find the probability of some event where there’s a number of balls, cards or other objects, and you replace the item each time you choose one. Let’s say you had a population of 7 people, and you wanted to sample 2. Their names are:
 John
 Jack
 Qiu
 Tina
 Hatty
 Jacques
 Des
You could put their names in a hat. If you sample with replacement, you would choose one person’s name, put that person’s name back in the hat, and then choose another name. The possibilities for your twoname sample are:
 John, John
 John, Jack
 John, Qui
 Jack, Qui
 Jack Tina
 …and so on.
When you sample with replacement, your two items are independent. In other words, one does not affect the outcome of the other. You have a 1 out of 7 (1/7) chance of choosing the first name and a 1/7 chance of choosing the second name.
 P(John, John) = (1/7) * (1/7) = .02.
 P(John, Jack) = (1/7) * (1/7) = .02.
 P(John, Qui) = (1/7) * (1/7) = .02.
 P(Jack, Qui) = (1/7) * (1/7) = .02.
 P(Jack Tina) = (1/7) * (1/7) = .02.
Note that P(John, John) just means “the probability of choosing John’s name, and then John’s name again.” You can figure out these probabilities using the multiplication rule. But what happens if you don’t replace the first name before you choose the second? In other words, what happens if you sample without replacement?
Sampling Without Replacement
Sampling without Replacement is a way to figure out probability without replacement. In other words, you don’t replace the first item you choose before you choose a second. This dramatically changes the odds of choosing sample items. Taking the above example, you would have the same list of names to choose two people from. And your list of results would similar, except you couldn’t choose the same person twice:
 John, Jack
 John, Qui
 Jack, Qui
 Jack Tina…
But now, your two items are dependent, or linked to each other. When you choose the first item, you have a 1/7 probability of picking a name. But then, assuming you don’t replace the name, you only have six names to pick from. That gives you a 1/6 chance of choosing a second name. The odds become:
 P(John, Jack) = (1/7) * (1/6) = .024.
 P(John, Qui) = (1/7) * (1/6) = .024.
 P(Jack, Qui) = (1/7) * (1/6) = .024.
 P(Jack Tina) = (1/7) * (1/6) = .024…
As you can probably figure out, I’ve only used a few items here, so the odds only change a little. But larger samples taken from small populations can have more dramatic results. You can tell how dramatic these results are by calculating the covariance. That’s a measure of how much probabilities of two items are linked together; the higher the covariance, the more dramatic the results. A covariance of zero would mean there’s no difference between sampling with replacement or sampling without.
When should you choose sampling with and without replacement?
Sampling with replacement is used because it allows us to use the same dataset multiple times to build models, instead of gathering new data, which often requires a lot of time and money. On the other hand, sampling without replacement is used when we want to randomly select a sample from a population with no chance of selecting the same item again. For instance, if we want to estimate the mean household income in Florida with a total of 100,000 different households, we might collect a random sample of 1,000 households. In this case, we don’t want any household to appear twice in the sample, so we would choose to sample without replacement.
Types of sampling with and without replacement
Common types of sampling with replacement include:

 Monte Carlo sampling involves using random numbers to chose items from the original dataset.
 Jackknife sampling, estimates the sampling distribution of a statistic using a subset of the original dataset. The jackknife sample is created by removing one data point from the original dataset. The statistic is estimated based on the remaining items. This process is repeated for each item in the original dataset.
 Bootstrap aggregating (bagging) combines multiple bootstrap samples to create a more accurate estimate of the sampling distribution of a statistic. After bootstrap samples are created, statistics are then estimated on each bootstrap sample. The final estimate of the statistic is the average of the estimates from the bootstrap samples.
Sampling without replacement methods include:

 Simple random sampling: Each item in the original data set has an equal chance of being included in the sample.
 Stratified sampling: The population is divided into strata, and then a sample is drawn from each stratum.
 Systematic sampling: Data points are selected at regular intervals. For example, if you want to sample every 10th item in a data set, you would select the first item randomly, and then select every 10th item after that.
 Cluster sampling: The population is divided into clusters, and then a sample of clusters is drawn. This is a useful method for sampling large populations, as it can be more efficient than sampling each data point individually.
 Multistage sampling: The population is divided into stages, and then a sample is drawn from each stage. This is a useful method for sampling populations that have a hierarchical structure [1].
References