Cluster Sampling
Cluster sampling is used in statistics when natural groups are present in a population. The whole population is subdivided into clusters, or groups, and a random sample of those clusters is then selected. Data are collected only from the selected clusters.
Cluster sampling is typically used in market research. It’s used when a researcher can’t get information about the population as a whole, but can get information about the clusters. For example, a researcher may be interested in data about city taxes in Florida. The researcher would collect data from selected cities and combine them to get a picture of the state as a whole. The individual cities would be the clusters in this case. Cluster sampling is often more economical or more practical than stratified sampling or simple random sampling.
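For cluster sampling to work well, the clusters should meet the following requirements:
- The elements within each cluster should be as heterogeneous as possible. In other words, each cluster should contain distinct subpopulations of different types.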
- Each cluster should be a small representation of the entire population.
- Clusters should be mutually exclusive. In other words, no element should belong to more than one cluster. In the city tax example, a tax record for Miami cannot also be a tax record for Jacksonville, so the cities meet the requirement for mutual exclusivity.
Two common types of cluster sampling are:
- Single-stage cluster sampling: all the elements in each selected cluster are used.
- Two-stage cluster sampling: a random sampling technique is applied to the elements within each selected cluster. For example, once you’ve selected your clusters, you could use simple random sampling to select elements from each one.
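The two designs can be illustrated with a short Python sketch. This is only a minimal illustration using the standard library, not a full survey-sampling implementation. The function names (`single_stage_cluster_sample`, `two_stage_cluster_sample`), the `cities` data, and the record labels are all hypothetical and invented for this example.

```python
import random

def single_stage_cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select n_clusters clusters and keep every element in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Randomly select clusters, then draw a simple random sample in each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical population: each city (cluster) holds made-up record labels.
cities = {
    "Miami": [f"Miami-{i}" for i in range(1, 9)],
    "Jacksonville": [f"Jacksonville-{i}" for i in range(1, 7)],
    "Tampa": [f"Tampa-{i}" for i in range(1, 6)],
    "Orlando": [f"Orlando-{i}" for i in range(1, 8)],
}

single = single_stage_cluster_sample(cities, n_clusters=2, seed=0)
two = two_stage_cluster_sample(cities, n_clusters=2, n_per_cluster=3, seed=0)

for name, records in single.items():
    print("single-stage", name, len(records), "records (whole cluster)")
for name, records in two.items():
    print("two-stage", name, records)
```

In the single-stage result, each selected city contributes all of its records. In the two-stage result, each selected city contributes only the three records drawn at random.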
Difference Between Cluster Sampling and Stratified Sampling
For a stratified random sample, a population is divided into strata, or sub-populations, before sampling. At first glance, the two techniques seem very similar. However, in cluster sampling the cluster itself is the sampling unit; in stratified sampling, elements are sampled and analyzed within each stratum. In cluster sampling, a researcher will only study selected clusters; with stratified sampling, a random sample is drawn from every stratum.
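The contrast can be made concrete with a small Python sketch that applies both techniques to the same grouped data. It is a rough illustration only: the helper names (`stratified_sample`, `cluster_sample`) and the `groups` data are hypothetical, and real studies would also choose sample sizes and weights with more care.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample from EVERY stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, min(n_per_stratum, len(members)))
        for name, members in strata.items()
    }

def cluster_sample(clusters, n_clusters, seed=None):
    """Select only SOME clusters at random and keep all of their elements."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical grouped population (labels are made up for illustration).
groups = {
    "A": ["a1", "a2", "a3", "a4"],
    "B": ["b1", "b2", "b3", "b4"],
    "C": ["c1", "c2", "c3", "c4"],
    "D": ["d1", "d2", "d3", "d4"],
}

strat = stratified_sample(groups, n_per_stratum=2, seed=1)
clust = cluster_sample(groups, n_clusters=2, seed=1)

print("stratified covers groups:", sorted(strat))  # all four groups appear
print("cluster covers groups:", sorted(clust))     # only two groups appear
```

The stratified sample contains a few elements from every group, while the cluster sample contains every element from only the selected groups.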