Statistics How To

Calculating Confidence Intervals


With random sampling of binomial values (in-favor vs. not-in-favor; heads vs. tails):

  1. Samples from populations with percent-in-favor close to 50% have wider sampling distributions than samples from populations with percentages closer to 0% or 100%.
  2. Larger sample sizes have narrower sampling distributions.

The various sampling distributions have different locations on the horizontal axis, and they have different widths. It would be useful to convert them all to one standard scale. We’ll need a common unit. And the rescaling to that unit must account for the effects of the population percent-in-favor value (number 1 above) and sample size (number 2 above).

The unit to be used is called Standard Error. It’s labeled “Standard” because it serves as a standard unit. And it’s labeled “Error” because we don’t expect our sample statistic values to be exactly equal to the population statistic value; there will be some amount of error. The Standard Error formula, which I’ll explain a piece at a time, is as follows:

$$\text{Standard Error} = \sqrt{\frac{p(1-p)}{n}}$$


The variable p is the proportion rather than percentage: .5 rather than 50% (and 0 rather than 0%; .01 rather than 1%; .1 rather than 10%; and 1 rather than 100%).
The p*(1-p) term in the numerator is called the proportion variance. As noted above, samples from populations with percent-in-favor close to 50% have wider sampling distributions than samples from populations with percentages closer to 0% or 100%.

The variance p*(1-p) reflects this dynamic:

  • 0 * (1 - 0) = 0
  • .01 * (1 - .01) = .0099 ≈ .01
  • .1 * (1 - .1) = .09
  • .3 * (1 - .3) = .21
  • .5 * (1 - .5) = .25
  • .7 * (1 - .7) = .21
  • .9 * (1 - .9) = .09
  • .99 * (1 - .99) = .0099 ≈ .01
  • 1 * (1 - 1) = 0
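As a quick check, the list above can be reproduced in a few lines of Python. This is only a sketch; the function name `proportion_variance` is my own label for illustration, not something from the source.

```python
# Proportion variance p * (1 - p) for the proportions listed above.
proportions = [0.0, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 0.99, 1.0]


def proportion_variance(p):
    """Return p * (1 - p), the term in the numerator of Standard Error."""
    return p * (1 - p)


# Map each proportion to its variance so the values can be inspected later.
variances = {p: proportion_variance(p) for p in proportions}

for p, v in variances.items():
    print(f"p = {p:<4}  p*(1-p) = {v:.4f}")
```

Printing four decimal places makes it visible that the values for .01 and .99 are .0099 before rounding.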

So, as p moves from .5 towards 0 or 1, variance decreases, and since variance is in the numerator, Standard Error decreases. Decreases in Standard Error correspond to narrowing of the sampling distribution. This reflects lower uncertainty. Lower variance, lower uncertainty.
Variance is itself a statistic and is very important in statistical analysis. We’ll be seeing it in formulas from now on. Now let’s consider sample size, which is represented in the denominator of the formula by n.

Larger sample sizes have narrower sampling distributions. Since n is in the denominator of the Standard Error formula, as n increases Standard Error decreases. Again, decreases in Standard Error correspond to narrowing of the sampling distribution. Again, this reflects lower uncertainty. Larger sample size, lower uncertainty.
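To see the effect of n, here is a minimal sketch that evaluates Standard Error at p = .5 for several sample sizes. The function name `standard_error` and the sample sizes 400 and 10000 are my own choices for illustration; only 100 and 1000 are used in this article.

```python
import math


def standard_error(p, n):
    """Standard Error of a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


# Illustrative sample sizes; 400 and 10000 are assumptions added for contrast.
sample_sizes = [100, 400, 1000, 10000]
errors = [standard_error(0.5, n) for n in sample_sizes]

for n, se in zip(sample_sizes, errors):
    print(f"n = {n:>5}  Standard Error = {se:.4f}")
```

Because n sits under a square root, quadrupling the sample size (100 to 400) only halves the Standard Error.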

Now we can use the Standard Error scale to determine 95% intervals. First, an important fact: The boundary lines of the 95% interval on the Standard Error scale are always -2 and +2 (they’re actually -1.95996… and +1.95996…, but I’m rounding to -2 and +2 for the present purposes). Let’s clarify all this by looking at several example calculations and illustrations.
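If you want to see where 1.95996… comes from, the sketch below looks it up with Python's standard library. It assumes the sampling distribution is well approximated by the Normal distribution, which this article discusses further on; the variable names are my own.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Boundary that leaves 2.5% in the upper tail (and, by symmetry, 2.5% in the lower tail).
z_95 = standard_normal.inv_cdf(0.975)

# Share of the distribution that falls between -2 and +2 when we round the boundary.
area_within_2 = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(f"95% boundary: {z_95:.5f}")
print(f"Area between -2 and +2: {area_within_2:.4f}")
```

Rounding the boundary to 2 makes the interval very slightly wider than a true 95% interval, which is fine for the present purposes.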

Let’s start with random sampling of 100 from a population that is 50% in favor of the new public health policy (Figure 1.2, below).
[Figure 1.2: sampling distribution for random samples of 100 from a population that is 50% in favor]

Plugging in the numbers gives

$$\text{Standard Error} = \sqrt{\frac{.5(1-.5)}{100}} = \sqrt{.0025} = .05$$


Standard Error is .05, so two Standard Errors is .1 in proportions, or 10% in percentages. Since we want to center the interval on the percentage p of 50%, we’ll add and subtract 10% from 50%. This yields a calculated 95% interval of 50% ± 10% (50% minus 10% to 50% plus 10%), or 40%-to-60%. That’s also what Figure 1.2 shows!

Putting everything we just computed into a formula for calculating 95% intervals, we get

$$p \pm 2\sqrt{\frac{p(1-p)}{n}}$$


Next let’s consider the 95% interval of random sampling of 100 from a population that is 30% in favor of the new public health policy (Figure 2.7, reproduced below).


Standard Error is about .046 (the square root of .3 × .7 / 100), so two Standard Errors is about .09 in proportions, or 9% in percentages. We want to center the interval on 30%, so we’ll add and subtract 9% from 30%. This yields a 95% interval of 30% ± 9% (30% minus 9% to 30% plus 9%), or 21%-to-39%. That’s also what Figure 2.7 shows!


Last let’s consider the 95% interval of random sampling of 1000 from a population that is 50% in favor of the new public health policy (Figure 2.3, below).


Standard Error is about .016 (the square root of .5 × .5 / 1000), so two Standard Errors is about .03 in proportions, or 3% in percentages. We want to center the interval on 50%, so we’ll add and subtract 3% from 50%. This yields a 95% interval of 50% ± 3%, or 47%-to-53%. That’s also what Figure 2.3 shows!
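The three worked examples can be reproduced with a short script. This is a sketch under the same rounding used in the text (a multiplier of 2 rather than 1.96); the function names `standard_error` and `interval_95` are hypothetical names chosen for this illustration.

```python
import math


def standard_error(p, n):
    """Standard Error of a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


def interval_95(p, n, multiplier=2):
    """Approximate 95% interval: p plus or minus `multiplier` Standard Errors."""
    margin = multiplier * standard_error(p, n)
    return p - margin, p + margin


# (population proportion, sample size) for the three examples in the text.
examples = [(0.5, 100), (0.3, 100), (0.5, 1000)]

results = {}
for p, n in examples:
    low, high = interval_95(p, n)
    results[(p, n)] = (low, high)
    print(f"p = {p}, n = {n}: SE = {standard_error(p, n):.4f}, "
          f"interval = {low:.2f} to {high:.2f}")
```

Rounded to two decimal places, the intervals come out as .40 to .60, .21 to .39, and .47 to .53, matching the figures cited above.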


The formula works! It works because the sampling distributions are “bell shaped”. More than that, they approximate the very special bell shape called the Normal distribution.

Let’s go one step further and standardize an entire sampling distribution to get what’s called the Standard Normal distribution. The Standard Normal distribution is a Normal distribution that uses Standard Error as its unit (rather than percentages or proportions). To illustrate, let’s standardize Figure 1.1 (below).
[Figure 1.1]

Figure 3.1 is a standardized version of Figure 1.1.


Notice that Standard Error is the unit used on the horizontal axis of Figure 3.1. This is done by rescaling the horizontal axis unit of Figure 1.1 to the Standard Error unit of Figure 3.1 using the formula below.

$$\frac{p - .5}{\text{Standard Error}}$$


This formula gives us how many Standard Errors a proportion, p, is from .5. First, we convert the percentages to proportions. Next, we recenter the axis: whereas Figure 1.1 is centered on the proportion value .5 (50%), Figure 3.1 is centered on zero Standard Errors; the numerator p-.5 centers the horizontal axis of Figure 3.1 onto zero. Finally, these differences are divided by the Standard Error to rescale the horizontal axis. Voila, Figure 1.1 has been standardized to the Standard Error scale of Figure 3.1.

Figure 3.2 shows the 95% interval of the standardized distribution, placed below Figure 1.2 for comparison.
[Figure 1.2 with Figure 3.2 below it]



Recall that the boundary lines of the 95% interval on the Standard Error scale are -2 and 2 (rounded). Plugging .4 (40%) and .6 (60%) from Figure 1.2 into the above formula gives us -2 and 2 Standard Errors as the 95% boundary lines in the Standard Error unit. As emphasized above: The boundary lines of the 95% interval on the Standard Error scale are always -2 and +2 (rounded). If we standardize Figures 2.3 and 2.7, we’ll again find the 95% interval boundary lines to be -2 and 2. (You can use the formula and do the arithmetic if you want to confirm this.)
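Here is one way to do that arithmetic. The sketch generalizes the formula slightly by letting the center be any population proportion rather than only .5; that generalization and the function names are my own assumptions, made so the same code can handle Figures 2.3 and 2.7.

```python
import math


def standard_error(p, n):
    """Standard Error of a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


def to_standard_errors(value, population_p, n):
    """How many Standard Errors `value` lies from the population proportion."""
    return (value - population_p) / standard_error(population_p, n)


# (label, population proportion, sample size, lower boundary, upper boundary)
figures = [
    ("Figure 1.2", 0.5, 100, 0.40, 0.60),
    ("Figure 2.7", 0.3, 100, 0.21, 0.39),
    ("Figure 2.3", 0.5, 1000, 0.47, 0.53),
]

boundaries = {}
for label, p, n, low, high in figures:
    z_low = to_standard_errors(low, p, n)
    z_high = to_standard_errors(high, p, n)
    boundaries[label] = (z_low, z_high)
    print(f"{label}: {z_low:.2f} to {z_high:.2f} Standard Errors")
```

The boundaries for Figures 2.7 and 2.3 land a little inside -2 and +2 because the percentages in the text were rounded to whole numbers, but all three round to -2 and +2.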

We can convert our units (e.g., percent-in-favor, percent-heads) into the Standard Error unit and vice versa by multiplying and dividing by Standard Error. That comes in very handy. All of the sampling distributions we’ve looked at so far can be standardized in this way. In practice, we don’t convert entire sampling distributions to the standardized distribution; we use Standard Error in formulas as multipliers and divisors to calculate individual values, like we do to calculate the boundary lines for 95% intervals and to convert proportions to the Standard Error scale.



We’ll further explore the Standard Normal distribution later on, but first let’s put some of what we’ve covered so far into action, while also expanding our horizons.

1,000 Surveyors’ Sample Statistics and Their 1,000 95% Confidence Intervals

In this section we’re going to look at things from a different perspective. Surveyors won’t be comparing their sample statistics of public opinion with what to expect when the population opinion statistic equals a particular value, like 50%. Instead, the surveyors want to determine, based on their sample statistic, what the value of the population statistic might be. For example, a surveyor who gets a sample statistic value of 34% will want to calculate a 95% interval surrounding 34% and explain what that interval might tell us about the overall population’s opinions.

We are going to explore the subtleties involved by sending out 1,000 surveyors to survey the same population and see what they come up with and how they should interpret what they come up with. But first we’ll need to set the stage by inventing a population that has certain characteristics that we know, but none of the surveyors know. Our invented community, Artesian Wells, has about 70,000 residents. There is a new public health policy being debated and, since we are all-knowing, we know that 40% of the residents agree with the new policy. Only we know this. We want to know what to expect when many, many surveyors randomly sample 1000 people from this population. The survey respondents will be asked whether they “agree” or “disagree” with something, a binomial response. We’ll use proportions rather than percentages, with the proportions rounded to two decimal places. Figure 3.3 shows us the sampling distribution of what to expect. (Don’t get confused: There are 1,000 random samples, and each sample has a sample size of 1000.)


Based on visual inspection, notice that the great majority of the sample proportions are in the interval 0.37 to 0.43. Approximately 950 of the 1,000 sample proportions are contained within the interval 0.37 to 0.43, indicating that 0.37 to 0.43 is the 95% interval surrounding the population proportion of 0.40. The formula will give us the same boundary lines. (Feel free to double check.)
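To double check, here is a small simulation sketch. It assumes each respondent can be treated as an independent draw with a 0.40 chance of answering "agree" (a simplification of sampling from 70,000 residents), and the seed, constant names, and helper function are arbitrary choices of mine. The exact count will vary with the seed.

```python
import math
import random

POPULATION_P = 0.40   # known only to us, the know-it-alls
SAMPLE_SIZE = 1000    # residents per sample
NUM_SAMPLES = 1000    # number of random samples

rng = random.Random(2024)  # arbitrary seed so the run is repeatable


def sample_proportion(p, n, rng):
    """Proportion of 'agree' answers in one simulated random sample of size n."""
    agrees = sum(1 for _ in range(n) if rng.random() < p)
    return agrees / n


sample_props = [sample_proportion(POPULATION_P, SAMPLE_SIZE, rng)
                for _ in range(NUM_SAMPLES)]

# 95% interval around the population proportion, from the formula.
se = math.sqrt(POPULATION_P * (1 - POPULATION_P) / SAMPLE_SIZE)
low = POPULATION_P - 1.96 * se
high = POPULATION_P + 1.96 * se

inside = sum(1 for prop in sample_props if low <= prop <= high)
print(f"Formula interval: {low:.2f} to {high:.2f}")
print(f"{inside} of {NUM_SAMPLES} sample proportions fall inside it")
```

The formula's boundaries round to 0.37 and 0.43, and the count inside should come out near 950, give or take the usual randomness.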

As always, we expect the 95% interval around the population proportion to contain 95% of all sample proportions obtained by random sampling.
Now, we hire 1,000 independent surveyors who converge on the town to do the “agree” or “disagree” survey. All 1,000 surveyors get their own random sample of 1000 residents and calculate their own sample proportion-agree statistic. How does each of the individual surveyors analyze their sample proportion?

First, let’s look at the formula for calculating 95% confidence intervals for sample proportions. It looks much like the formula in the previous section. The variable p with a hat on denotes the sample proportion (as opposed to the population proportion). The square root term calculates the Standard Error for the sample proportion. Sample size is again represented by n. As for the constant 1.96, recall that earlier I rounded ±1.95996… Standard Errors and used ±2 Standard Errors; now I’m being more precise by using ±1.96 Standard Errors, which is more common.

$$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$


Each of the 1,000 surveyors calculates their individual interval using their sample proportion value, and we expect that 95% of the surveyors’ 95% confidence intervals will contain the population proportion (0.4 in this example). You might want to reread that sentence a few times, keeping in mind that although we, as know-it-alls, know that the population proportion is 0.4, none of the surveyors have any idea what it is.

It’s in this context that the term “confidence” in “confidence interval” came about: we are confident that 95% of all 95% confidence intervals for sample statistic values obtained via random sampling will contain the population statistic value. (But no individual surveyor will know whether their confidence interval contains the population statistic value or not!)

In a nutshell:

  • The 1,000 surveyors calculate their individual 95% confidence intervals.
  • About 950 of them will have an interval containing the population proportion.
  • About 50 of them won’t.
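The nutshell above can be checked with a simulation sketch. As before, it assumes each respondent is an independent draw with a 0.40 chance of agreeing; the seed and all names are illustrative choices of mine, and the exact count depends on the seed.

```python
import math
import random

POPULATION_P = 0.40    # unknown to the surveyors
SAMPLE_SIZE = 1000     # residents per surveyor
NUM_SURVEYORS = 1000

rng = random.Random(7)  # arbitrary seed so the run is repeatable


def confidence_interval_95(p_hat, n):
    """95% confidence interval around a sample proportion p_hat."""
    margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


covered = 0
for surveyor in range(NUM_SURVEYORS):
    # Each surveyor draws a fresh random sample and computes a sample proportion.
    agrees = sum(1 for respondent in range(SAMPLE_SIZE)
                 if rng.random() < POPULATION_P)
    p_hat = agrees / SAMPLE_SIZE

    # The surveyor builds an interval from p_hat alone, without knowing 0.40.
    low, high = confidence_interval_95(p_hat, SAMPLE_SIZE)
    if low <= POPULATION_P <= high:
        covered += 1

print(f"{covered} of {NUM_SURVEYORS} intervals contain {POPULATION_P}")
```

Note that only we can do the final comparison against 0.40; no surveyor inside the loop has that information.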

Let’s look at the 95% confidence intervals constructed via the formula by two surveyors: The first got a sample proportion of 0.38, and the second got a sample proportion of 0.34.
Surveyor #1 Result.
Using the formula with a sample proportion of 0.38 and a sample size of 1000, the 95% confidence interval is 0.35 to 0.41 (rounded).
Only we, being know-it-alls, know that this 95% confidence interval contains the population proportion of 0.4.
Surveyor #2 Result.
Using the formula with a sample proportion of 0.34 and a sample size of 1000, the 95% confidence interval is 0.31 to 0.37.
Only we, being know-it-alls, know that this 95% confidence interval does not contain the population proportion of 0.4. Only we know that this surveyor is one of the unlucky 5% of surveyors who just happened to get a misleading random sample. If the surveyor used this interval to reject a population proportion of 0.4, that mistaken rejection would be called a Type I Error.
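The two surveyors' intervals can be reproduced as follows. This is a sketch; `confidence_interval_95` is a hypothetical helper name, and the comparison against 0.4 is something only we could run.

```python
import math


def confidence_interval_95(p_hat, n):
    """95% confidence interval: p_hat plus or minus 1.96 Standard Errors."""
    margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


POPULATION_P = 0.40  # known only to us

surveyor_1 = confidence_interval_95(0.38, 1000)
surveyor_2 = confidence_interval_95(0.34, 1000)

for name, (low, high) in [("Surveyor #1", surveyor_1),
                          ("Surveyor #2", surveyor_2)]:
    contains = low <= POPULATION_P <= high
    print(f"{name}: {low:.2f} to {high:.2f}, contains 0.40: {contains}")
```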

In summary,


  1. We expect the 95% confidence interval around the population proportion to contain 95% of all sample proportions obtained by random sampling. We’ve been seeing this all along.
  2. We expect 95% of all the 95% confidence intervals based on random sample proportions to contain the population proportion. We see this for the first time here; more detail is given next.

Table 3.1 is divided into three sections, left to right, and shows what the various surveyors will get. Overall, the Table shows the confidence intervals for surveyors with sample proportions of 0.30 through 0.50; sample proportions 0.30 through 0.36 are in the left section, 0.37 through 0.43 are in the middle section (shaded), and 0.44 through 0.50 are in the right section. Notice that the expected 950 surveyors in the middle section (shaded) with sample proportions within the interval of 0.37 to 0.43 also have 95% confidence intervals that contain the population proportion of 0.40. The expected 50 surveyors with sample proportions outside the interval of 0.37 to 0.43 (the left and right sections of the Table) do not have 95% confidence intervals that contain the population proportion of 0.40.
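A table along these lines can be generated with the sketch below. It is my own reconstruction of the calculation, not a copy of Table 3.1, and it assumes the interval boundaries are rounded to two decimal places before checking for 0.40. That assumption matters at the edges: for a sample proportion of 0.37 the unrounded upper boundary is just under 0.40.

```python
import math

POPULATION_P = 0.40
SAMPLE_SIZE = 1000


def confidence_interval_95(p_hat, n):
    """95% confidence interval: p_hat plus or minus 1.96 Standard Errors."""
    margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


rows = []
for hundredths in range(30, 51):          # sample proportions 0.30 through 0.50
    p_hat = hundredths / 100
    low, high = confidence_interval_95(p_hat, SAMPLE_SIZE)
    low, high = round(low, 2), round(high, 2)   # assumed two-decimal rounding
    contains = low <= POPULATION_P <= high
    rows.append((hundredths, low, high, contains))
    print(f"{p_hat:.2f}: {low:.2f} to {high:.2f}  contains 0.40: {contains}")

# Sample proportions (in hundredths) whose rounded interval contains 0.40.
containing = [h for h, _, _, c in rows if c]
```

Under that rounding assumption, the intervals that contain 0.40 are exactly those for sample proportions 0.37 through 0.43.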


Again, in summary, and for emphasis:

  1. We expect the 95% confidence interval around the population proportion to contain 95% of all sample proportions obtained by random sampling.
  2. We expect 95% of all the 95% confidence intervals based on random sample proportions to contain the population proportion.

Because of these two facts, we will reach the same conclusion whether we


  1. check if a sample proportion is outside the 95% interval surrounding a hypothesized population proportion, or
  2. check if the hypothesized population proportion is outside the 95% interval surrounding a sample proportion.

The analysis can be done either way.
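Here is a sketch of the two checks applied to the two surveyors from earlier. The function names are hypothetical. Note that the first check computes Standard Error from the hypothesized proportion and the second from the sample proportion, so this code only demonstrates that they agree for these two examples.

```python
import math


def standard_error(p, n):
    """Standard Error of a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


def outside_population_interval(p_hat, hypothesized_p, n):
    """Check 1: is the sample proportion outside the interval around hypothesized_p?"""
    return abs(p_hat - hypothesized_p) > 1.96 * standard_error(hypothesized_p, n)


def outside_sample_interval(p_hat, hypothesized_p, n):
    """Check 2: is hypothesized_p outside the interval around the sample proportion?"""
    return abs(hypothesized_p - p_hat) > 1.96 * standard_error(p_hat, n)


checks = {}
for p_hat in (0.38, 0.34):
    checks[p_hat] = (outside_population_interval(p_hat, 0.40, 1000),
                     outside_sample_interval(p_hat, 0.40, 1000))
    print(f"p_hat = {p_hat}: check 1 = {checks[p_hat][0]}, "
          f"check 2 = {checks[p_hat][1]}")
```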

Here’s a quick analogy: Suppose a stamping plant that makes coins was malfunctioning and produced unbalanced (i.e., unfair) coins. Unbeknownst to anyone, these unfair coins favored tails, and the chance of coming up heads is only 0.4. Now say 1,000 people flip these coins 1000 times each, while counting and then determining the proportion of heads. What would the 1,000 people’s results be like? What would an analysis of a single coin and its 1000 flips be like? Answer: Just like the survey example above. Just replace the words “agree” and “disagree” with “heads” and “tails”. We expect 95% of the coin-flippers will get 95% confidence intervals that contain 0.4, and 5% of the coin-flippers will get 95% confidence intervals that do not contain 0.4. In other words, we expect 95% of the results to be veridical and 5% of the results to be misleading. But no one knows whether their results are veridical or misleading. I use the word “veridical” because it’s the perfect word: “Coinciding with reality.” I’m using it to mean the opposite of misleading.

References

J.E. Kotteman. Statistical Analysis Illustrated – Foundations. Published via Copyleft. You are free to copy and distribute the content of this article.

CITE THIS AS:
Stephanie Glen. "Calculating Confidence Intervals" From StatisticsHowTo.com: Elementary Statistics for the rest of us! https://www.statisticshowto.com/calculating-confidence-intervals/