
## About Samples

Samples are parts of a population. For example, you might have a list of information on 100 people out of a population of 10,000. You can use that list to make some assumptions about the entire population’s behavior. However, it’s not that simple. When you do stats, your sample size needs to be ideal: not too large and not too small. Then, once you’ve decided on a sample size, you must use a sound technique for actually drawing the sample from the population. There are two main categories:

- **Probability sampling** uses randomization to select sample members. The probability of each member being chosen for the sample is known, although the odds don’t have to be equal.
- **Non-probability sampling** uses non-random techniques (e.g. the judgment of the researcher). With this approach, you can’t calculate the odds of any particular item, person or thing being included in your sample.
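To make the “known, but not necessarily equal” idea concrete, here is a minimal Python sketch. The names and inclusion probabilities are made up for illustration, and the function name `sample_with_known_odds` is hypothetical. Each member is included independently with their own known probability.

```python
import random

def sample_with_known_odds(inclusion_probs, seed=None):
    """Include each member independently with its own known probability."""
    rng = random.Random(seed)
    return [member for member, p in inclusion_probs.items() if rng.random() < p]

# Hypothetical population: everyone's odds are known, but they are not equal.
inclusion_probs = {"Ana": 0.9, "Ben": 0.5, "Cy": 0.5, "Dee": 0.1}
sample = sample_with_known_odds(inclusion_probs, seed=1)
print(sample)
```

Because every probability is written down in advance, this still counts as probability sampling even though Ana is far more likely to be picked than Dee.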

## Types

### Common Types

The most common techniques you’ll likely meet in elementary statistics or AP statistics include taking a sample with and without replacement. Specific techniques include:

- **Bernoulli sampling** is where independent Bernoulli trials on population elements decide whether each element becomes part of the sample. All population elements have an equal chance of being included. The sample sizes in Bernoulli samples follow a binomial distribution.
- **Poisson sampling** is less common. Each population member is given an independent Bernoulli trial to decide whether it is included in the sample. Unlike Bernoulli sampling, the inclusion probability can differ from one member to the next.
- **Cluster sampling** divides the population into groups (clusters). A random sample of clusters is then selected. It’s used when researchers don’t know the individuals in a population but they do know which groups are in the population.
- In **systematic sampling**, elements are selected for a sample from an ordered sampling frame. A sampling frame is just a list of participants that you want to get a sample from. One type of systematic sampling is the equal-probability method, where an element is selected from a list and then every kth element is selected, using the equation k = N/n, where n is the sample size and N is the size of the population.
- **SRS** (simple random sample) is where a sample is chosen completely at random, so that each element has the same probability of being chosen as any other element, and each subset of k elements has the same probability of being chosen as any other subset of k elements.
- In **stratified sampling**, each subpopulation is sampled independently. The population is first divided into homogeneous subgroups before getting the sample. Each population member belongs to only one group. Simple random or systematic sampling is applied within each group to choose the sample.
- **Stratified randomization** is a sub-type of stratified sampling used in clinical trials. Patients are divided into strata and then randomized with permuted block randomization.
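Four of the techniques above can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not a reference implementation: the function names, the population of 100 numbered people, and the way they are split into strata and clusters are all assumptions made for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """SRS without replacement: every subset of size n is equally likely."""
    return rng.sample(population, n)

def systematic_sample(frame, n, rng):
    """Random start among the first k items, then every kth item, with k = N // n."""
    k = len(frame) // n
    start = rng.randrange(k)
    return frame[start::k][:n]

def stratified_sample(strata, n_per_stratum, rng):
    """Independent simple random sample from each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters, then keep every member of those clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical sampling frame, N = 100

# Hypothetical groupings of the same 100 people.
strata = {"under_50": population[:49], "50_plus": population[49:]}
clusters = {f"group_{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

print(simple_random_sample(population, 10, rng))
print(systematic_sample(population, 10, rng))
print(stratified_sample(strata, 5, rng))
print(cluster_sample(clusters, 2, rng))
```

Note that the systematic sample uses integer division for k, so it assumes the sample size is no larger than the frame.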

### Less Common Types

If you are taking elementary statistics or AP statistics you’ll rarely (if ever) come across these techniques:

- **Acceptance-Rejection Sampling**: A way to sample from an unknown distribution using a similar, more convenient distribution.
- **Accidental sampling** (also known as grab, convenience or opportunity sampling) is where a sample is drawn from a convenient, readily available population. It doesn’t give a representative sample for the population but can be useful for pilot testing.
- **Adaptive sampling** (also called response-adaptive designs) is where you adapt your selection criteria as the experiment progresses, based on preliminary results as they come in.
- **Bootstrap Sample**: A bootstrap sample is a smaller sample that is “bootstrapped” from a larger sample. Bootstrapping is a type of resampling where large numbers of smaller samples of the same size are repeatedly drawn, with replacement, from a single original sample.
- **Demon algorithm** (physics) is used to sample members of a microcanonical ensemble (used to represent the possible states of a mechanical system which has an exactly specified total energy) with a given energy. The “demon” is a degree of freedom in the system which stores and provides energy.
- **Critical Case Samples**: Critical cases are carefully chosen to maximize the information you can get from a handful of samples.
- **Discrepant case sampling** is where you choose cases that appear to contradict your findings.
- **Distance sampling** is a widely used technique that estimates the density or abundance of animal populations.
- The **experience sampling method** samples experiences (rather than individuals or members). In this method, study participants stop at certain times and make notes of their experiences as they experience them.
- **Haphazard Sampling**: where a researcher chooses items haphazardly, trying to simulate randomness. However, the result may not be random at all and is often tainted by selection bias.
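Two of these techniques, bootstrapping and acceptance-rejection sampling, are easy to try out in code. The Python sketch below is only an illustration under stated assumptions: the data values are made up, the function names are hypothetical, each bootstrap resample is assumed to be the same size as the original sample, and the acceptance-rejection example assumes a target density f(x) = 2x on the interval from 0 to 1 with a uniform distribution as the “convenient” one.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples, rng):
    """Draw resamples with replacement (same size as the original) and record each mean."""
    size = len(sample)
    return [statistics.mean(rng.choices(sample, k=size)) for _ in range(n_resamples)]

def rejection_sample(target_pdf, envelope_height, n, rng):
    """Acceptance-rejection on [0, 1) with a uniform proposal.

    Assumes target_pdf(x) <= envelope_height everywhere on [0, 1).
    """
    accepted = []
    while len(accepted) < n:
        x = rng.random()                    # candidate from the convenient distribution
        u = rng.random() * envelope_height  # random height under the envelope
        if u <= target_pdf(x):              # keep candidates that fall under the target
            accepted.append(x)
    return accepted

rng = random.Random(7)
original = [12, 15, 9, 20, 14, 17, 11, 13]  # made-up data
means = bootstrap_means(original, 1_000, rng)
print(min(means), max(means))  # spread of the bootstrapped means

draws = rejection_sample(lambda x: 2 * x, 2.0, 2_000, rng)
print(statistics.mean(draws))  # should land near 2/3 for the density f(x) = 2x
```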

### Additional Uncommon Types

- **Inverse Sampling** is based on negative binomial sampling. Samples are taken until a specified number of successes have happened.
- **Importance Sampling**: A method to model rare events.
- The **Kish grid** is a way to select members of a household for interviews. It uses random number tables for the selections.
- **Latin hypercube sampling** is used to construct computer experiments. It generates samples of plausible collections of values for parameters in a multidimensional distribution.
- In **line-intercept sampling**, an element is included in a sample from a particular region if a certain line segment intersects the element.
- **Maximum Variation Samples** are taken when you want to include extremes (like rich/poor or young/old). A related technique is **extreme case sampling**.
- **Multistage sampling**: one of a variety of cluster sampling techniques where random elements are chosen from a cluster (instead of every member in the cluster).
- **Quota sampling** is a way to select survey participants. It’s similar to stratified sampling, but members of a group are chosen based on judgment. For example, people closest to the researcher might be chosen for ease of access.
- **Respondent Driven Sampling**: A chain-referral sampling method where participants recommend other people they know.
- A **sequential sample** is one that doesn’t have a set size; items are taken one (or a few) at a time. It’s commonly used in ecology.
- A **snowball sample** is where existing study participants recruit future study participants from people they know.
- **Square root biased sample** is a way to decide who is chosen for additional screenings at airports. It is a combination of SRS and profiling.
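As a rough illustration of two of these ideas, the Python sketch below shows inverse sampling (keep drawing until you have a set number of successes) and a basic Latin hypercube on the unit cube (one point in each equal-width slice of every dimension). The function names, the 20% success rate, and the choice of 5 points in 2 dimensions are assumptions made for the example.

```python
import random

def inverse_sample(draw, successes_needed):
    """Keep drawing until the required number of successes has been seen."""
    draws = []
    successes = 0
    while successes < successes_needed:
        outcome = draw()
        draws.append(outcome)
        successes += bool(outcome)
    return draws

def latin_hypercube(n_samples, n_dims, rng):
    """One point per equal-width slice of [0, 1) in every dimension."""
    columns = []
    for _ in range(n_dims):
        # One random value inside each slice, then shuffle which point gets which slice.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)
        columns.append(column)
    return list(zip(*columns))

rng = random.Random(3)

# Hypothetical process where each draw succeeds 20% of the time.
draws = inverse_sample(lambda: rng.random() < 0.2, successes_needed=3)
print(len(draws), "draws were needed to see 3 successes")

points = latin_hypercube(5, 2, rng)
for point in points:
    print(point)
```

With inverse sampling, the sample size is the random quantity; with the Latin hypercube, the sample size is fixed and the sketch spreads the points so that no slice of any one parameter is left empty.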

## What is Sampling Error?

Sampling error occurs because you’re taking a sample from the population rather than using the *entire* population. In other words, it’s the difference between the statistic you measure and the parameter you would find if you took a census of the entire population.

If you were to survey the entire population (as in the US Census), there would be no sampling error. In practice, it’s nearly impossible to calculate the exact error, because the true population value is usually unknown. However, when samples are taken at random, the error can be estimated; that estimate is called the margin of error.

For example, suppose you wanted to figure out how many people out of a thousand were under 18, and you came up with the figure 19.357%. If the actual percentage is 19.300%, the difference (19.357 – 19.300) of 0.057 percentage points is the sampling error. If you continued to take samples of 1,000 people, you’d probably get slightly different statistics (19.1%, 18.9%, 19.5% etc.), but they would all be around the same figure. This is one of the reasons that you’ll often see sample sizes of 1,000 or 1,500 in surveys: they produce a very acceptable margin of error of about 3%.
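You can watch this happen with a quick simulation. The Python sketch below assumes a made-up population of 100,000 people in which exactly 19.3% are under 18, then repeatedly draws simple random samples of 1,000 and prints how far each sample percentage lands from the true value. The function name `sample_percentage` is hypothetical.

```python
import random

def sample_percentage(population, n, rng):
    """Percentage of 'under 18' flags (True) in a simple random sample of size n."""
    sample = rng.sample(population, n)
    return 100 * sum(sample) / n

rng = random.Random(0)
# Hypothetical population of 100,000 people in which exactly 19.3% are under 18.
population = [True] * 19_300 + [False] * 80_700
true_pct = 100 * sum(population) / len(population)

for _ in range(5):
    pct = sample_percentage(population, 1_000, rng)
    print(f"sample: {pct:.1f}%  difference from truth: {pct - true_pct:+.1f} points")
```

Each run of the loop gives a slightly different percentage, but the values cluster around the true figure.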

Formula: a rough formula for the margin of error (at about a 95% confidence level) is 1/√n, where n is the size of the sample. For example, a random sample of 1,000 has a margin of error of about 1/√1000 ≈ 3.2%.
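The 1/√n rule is easy to check in Python. The sketch below also includes, as an assumption not stated in this article, the fuller textbook formula for a proportion at 95% confidence, 1.96 × √(p(1 − p)/n), so you can see that 1/√n is a slightly conservative shortcut for it. The function names are hypothetical.

```python
import math

def rough_margin_of_error(n):
    """Rule-of-thumb margin of error, 1/sqrt(n), as a proportion."""
    return 1 / math.sqrt(n)

def margin_of_error_95(p, n):
    """Assumed fuller formula: 1.96 * sqrt(p(1-p)/n) for a proportion p at 95% confidence."""
    return 1.96 * math.sqrt(p * (1 - p) / n)

for n in (100, 1_000, 1_500, 10_000):
    rough = 100 * rough_margin_of_error(n)
    fuller = 100 * margin_of_error_95(0.5, n)
    print(f"n={n:>6}: rule of thumb {rough:.1f}%, fuller formula {fuller:.1f}%")
```

For n = 1,000 the rule of thumb gives about 3.2%, matching the figure above, and quadrupling the sample size only halves the margin of error.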

Sampling error can’t be eliminated; it can only be reduced. It is considered an acceptable tradeoff for not having to measure the entire population. In general, the larger the sample, the smaller the margin of error. There is a notable exception: if you use cluster sampling, this may increase the error because of the similarities between cluster members. A carefully designed experiment or survey can also reduce error.

## Another Type of Error

**Non-sampling error** is another reason why there can be a difference between the sample and the population. It is due to poor data collection methods (like faulty instruments or inaccurate data recording), selection bias, non-response bias (where individuals don’t want to or can’t respond to a survey), or other mistakes in collecting the data. Increasing the sample size will not reduce these errors. The key is to avoid making the errors in the first place with a well-planned design for the survey or experiment.

