What is a Binomial Confidence Interval?
Let’s say you need a 100(1 – α)% confidence interval (where α is the significance level) on the parameter p of a binomial distribution. Exactly how you achieve this depends on the values of n (your sample size) and p:
- Large sample size (> 15) and large p (≥ 0.1): The normal approximation method works well (Herson, 2009) unless the proportion is close to 0 or 1 (Razdolsky, 2014). The general rule of thumb is that you can use the normal approximation when n * p and n * q (q is just 1 – p) are greater than 5. For more on this, see: Using the normal approximation to solve a binomial distribution problem.
- Large sample size (> 15) and small p (< 0.1): The Poisson approximation for the binomial is a better choice (Montgomery, 2001).
- Small samples (15 or under): a binomial table should be used to find the binomial confidence interval for p.
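The three cases above can be written as a small decision helper. This is only a sketch of the thresholds listed in this article: the function name `choose_ci_method` and the returned labels are hypothetical, and the final fallback (used when the n * p and n * q rule of thumb fails) is an assumption, not something the cited sources prescribe.

```python
def choose_ci_method(n: int, p: float) -> str:
    """Suggest a method using the thresholds from the list above.

    n is the sample size and p is the (estimated) proportion.
    The function name and return labels are illustrative only.
    """
    if n <= 15:
        # Small samples: look the interval up in a binomial table.
        return "binomial table"
    if p < 0.1:
        # Large sample, small p: Poisson approximation.
        return "poisson approximation"
    # Large sample, large p: check the rule of thumb n*p > 5 and n*q > 5.
    q = 1 - p
    if n * p > 5 and n * q > 5:
        return "normal approximation"
    # Assumption: when p is too close to 1 for the rule of thumb,
    # fall back to an exact (table-based) interval.
    return "binomial table"


print(choose_ci_method(100, 0.5))
print(choose_ci_method(100, 0.05))
print(choose_ci_method(10, 0.5))
```

In practice p is unknown, so the sample proportion would normally be passed in its place.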
All of the formulas associated with a binomial confidence interval work on the assumption of an underlying binomial distribution. In other words, your experiment has a fixed number of trials with two outcomes, a “success” or “failure.” Success and failure are generic terms for two opposing outcomes, which could be yes/no, black/white, voted/didn’t vote, or a myriad of other options.
1. Large N, Large P (Normal Approximation)
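The formula for the CI on parameter p is:
$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
The unbiased point estimator, $\hat{p}$, is the proportion of “successes” in n Bernoulli trials. As a formula, that’s:
$$\hat{p} = \frac{X}{n}$$
where X is the number of successes and n is the number of trials.
$z_{\alpha/2}$ is the z-score that leaves an area of α/2 in the upper tail of the standard normal distribution, which is the critical value for a two-tailed test at significance level α. See: What is Z Alpha/2?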
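As a rough illustration, the normal approximation interval can be computed with Python's standard library. The sketch assumes the usual form of the interval, the sample proportion plus or minus z alpha/2 times sqrt(p̂(1 – p̂)/n). The function name `normal_approx_ci` and the sample figures (40 successes in 100 trials) are made up for the example.

```python
import math
from statistics import NormalDist


def normal_approx_ci(successes: int, n: int, alpha: float = 0.05):
    """Normal-approximation CI for a binomial proportion.

    Assumes the interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n),
    where z is the two-tailed critical value for significance level alpha.
    """
    p_hat = successes / n                      # point estimate of p
    z = NormalDist().inv_cdf(1 - alpha / 2)    # z alpha/2, about 1.96 for alpha = 0.05
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical data: 40 successes in 100 trials, 95% confidence.
low, high = normal_approx_ci(40, 100)
print(f"95% CI: ({low:.3f}, {high:.3f})")  # prints 95% CI: (0.304, 0.496)
```

Here n * p̂ = 40 and n * (1 – p̂) = 60, so the rule of thumb for using the normal approximation is satisfied.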
2. Approximating using the Poisson Distribution
You have a couple of choices here. The first is to use one of the many calculators available online. A good one is this one, which gives you values for a 95% CI. The second option is to use a table, such as the one at the bottom of this article: Table for 95% exact confidence intervals for the Poisson Distribution.
Let’s use an example to show how the table works. You observe a small number of events (n = 6) in a total population of 10,000. For this low level of occurrence in the population, the Poisson gives a good approximation to the binomial. To find the mean (μ) and the associated confidence interval:
- Locate the 95% low and high values in the table for 95% exact confidence intervals for the Poisson Distribution. For n = 6, the low is 2.202 and the high is 13.06.
- Divide the numbers you found in the table by the number of population members. In this example, there are 10,000 members, so the confidence interval is:
- 2.202 / 10,000 = 0.0002202
- 13.06 / 10,000 = 0.001306
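If no table is at hand, the same limits can be reproduced numerically. The sketch below assumes the table holds the standard "exact" Poisson limits: the lower limit is the mean at which seeing at least the observed count has probability α/2, and the upper limit is the mean at which seeing at most the observed count has probability α/2. The function names are hypothetical, and the bisection search is just one simple way to solve for the limits using only the standard library.

```python
import math


def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k) for X ~ Poisson(mu)."""
    term = math.exp(-mu)   # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= mu / i     # P(X = i) from P(X = i - 1)
        total += term
    return total


def solve_mu(k: int, target: float, hi: float) -> float:
    """Find mu in [0, hi] with poisson_cdf(k, mu) == target by bisection.

    Relies on the CDF being decreasing in mu.
    """
    lo = 0.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if poisson_cdf(k, mid) > target:
            lo = mid       # CDF still too large, so mu is too small
        else:
            hi = mid
    return (lo + hi) / 2


def exact_poisson_limits(events: int, alpha: float = 0.05):
    """Exact two-sided limits for a Poisson mean given an observed count."""
    search_max = events + 10 * math.sqrt(events + 1) + 20  # generous upper bound
    low = 0.0 if events == 0 else solve_mu(events - 1, 1 - alpha / 2, search_max)
    high = solve_mu(events, alpha / 2, search_max)
    return low, high


low, high = exact_poisson_limits(6)
print(f"{low:.3f}, {high:.2f}")                    # prints 2.202, 13.06
print(f"{low / 10_000:.5f}, {high / 10_000:.6f}")  # prints 0.00022, 0.001306
```

For 6 observed events this agrees with the table values used in the example, and dividing by the 10,000 population members gives the same interval.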
The following table shows the first few values for an exact 95% Confidence limit for the Poisson Distribution (adapted from Appendix 1 of Ahlbom’s Biostatistics for Engineers):
| # Observed Events | 95% Low | 95% High |
Herson, J. (2009). Data and Safety Monitoring Committees in Clinical Trials. CRC Press.
Montgomery, D. (2001). Introduction to Statistical Quality Control. 4th Edition. Wiley & Sons.
Razdolsky, L. (2014). Probability-Based Structural Fire Load. Cambridge University Press.