
## What is a Population in Statistics?

In stats, a population is the whole: every member of a group you want to study. A sample is a **part of a population**, a fraction or percentage of that group. Sometimes it’s possible to survey every member of a group. A classic example is the U.S. Census, where the law requires you to respond. Note: a survey that reaches everyone is called a census; the U.S. Census is just one example.

In most cases, it’s impractical to survey *everyone*.

Imagine how long it would take you to call every dog owner in the U.S. to find out what their preferred brand of dog food was. In addition, sometimes people either don’t want to respond or forget to respond, leading to incomplete censuses. Incomplete censuses become samples by definition.

### Sample vs. Population Example

If you go into a candy store, the owner might have **samples of their products** on display. It wouldn’t be possible for you to sample everything in the store: the owner wouldn’t want to give away a taste of everything for free, and you probably wouldn’t want to eat candy from a couple hundred jars, or you might get sick to your stomach. So you might base your opinion of the store’s entire candy line on the samples on offer. The same logic holds for most surveys in stats: you’re only going to take a sample of the whole population (the “population” in this example is the entire candy line). The result is a **statistic**, a number computed from the sample that you use to estimate something about that population.
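To make the candy store idea concrete, here is a minimal Python sketch. Everything in it is invented for illustration: the 200 jars, the 60 percent share of candies you would like, and the helper names `build_candy_line`, `taste_sample`, and `fraction_liked`. It only uses the standard library.

```python
import random

def build_candy_line(n_jars=200, share_liked=0.6, seed=0):
    """Hypothetical population: one flag per jar (True = you would like it)."""
    rng = random.Random(seed)
    n_liked = round(n_jars * share_liked)
    jars = [True] * n_liked + [False] * (n_jars - n_liked)
    rng.shuffle(jars)
    return jars

def taste_sample(jars, n_tastes=10, seed=1):
    """Simple random sample without replacement: the jars you actually taste."""
    rng = random.Random(seed)
    return rng.sample(jars, n_tastes)

def fraction_liked(flags):
    """Share of True flags in a list of jars."""
    return sum(flags) / len(flags)

population = build_candy_line()    # the entire candy line
sample = taste_sample(population)  # the few jars on display

print("whole candy line:", fraction_liked(population))
print("your sample:     ", fraction_liked(sample))
```

The two printed numbers will usually differ a little. Changing the `seed` passed to `taste_sample` gives a different set of jars and a different sample result, while the figure for the whole candy line stays the same.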

### Statistic vs. Parameter

A parameter is **a value that describes an entire population.** For example, if you want to find out which classes freshmen at a certain college are taking, you could ask everyone (perhaps via email), and it would be possible to get a parameter. A statistic is a value that describes a sample. For example, you might ask 20 percent of the freshman class what classes they are taking and use that data to draw inferences about what everyone is taking.

Obviously, if you base your results on only part of the population, they aren’t going to be perfect. That’s where **margins of error** and **confidence intervals** come in. In the candy store, you might get a good feel for the candy line by tasting a few samples, but how confident can you be that your sample wasn’t skewed? Perhaps the candy that day was extra fresh and tasted wonderful, or perhaps the flavors offered were ones that you didn’t care for. If you had the opportunity to taste everything, you could offer an excellent opinion about the parameters of the candy line, but with sampling, all you have is a statistic.
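As a rough illustration of a margin of error, the sketch below uses the common normal-approximation formula for a sample proportion, z · sqrt(p̂(1 − p̂)/n), with z = 1.96 for an approximate 95% confidence interval. The numbers are made up: suppose 200 freshmen were asked and 90 said they were taking a statistics class. The function name `proportion_margin_of_error` is hypothetical, and the sketch assumes a simple random sample and ignores the finite population correction.

```python
import math

def proportion_margin_of_error(successes, n, z=1.96):
    """Return (sample proportion, margin of error) using the normal approximation."""
    p_hat = successes / n                             # the statistic
    std_error = math.sqrt(p_hat * (1 - p_hat) / n)    # estimated standard error
    return p_hat, z * std_error

# Hypothetical survey: 90 of 200 sampled freshmen take a statistics class.
p_hat, moe = proportion_margin_of_error(90, 200)
low, high = p_hat - moe, p_hat + moe

print(f"statistic (sample proportion): {p_hat:.3f}")
print(f"margin of error:               {moe:.3f}")
print(f"approx. 95% interval:          ({low:.3f}, {high:.3f})")
```

The interval is a range of plausible values for the parameter (the share of all freshmen taking the class), not a guarantee. A larger sample shrinks the margin of error, but it does not correct for a skewed sample.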

