What is a Prediction Interval?
A prediction interval is an interval, closely related to the confidence interval (CI), used with predictions in regression analysis; it is a range of values that is likely to contain the value of a new observation, based on your existing model.
Prediction and confidence intervals are often confused with each other. However, they are not quite the same thing.
- A confidence interval is a range of values associated with a population parameter, such as the mean of a population.
- A prediction interval is where you expect a future value to fall.
The Uncertainties with Intervals
As with most things in statistics, an interval does not let you predict with certainty where one single value will fall.
Confidence intervals are always associated with a confidence level, representing a degree of uncertainty (data is random, and so results from statistical analysis are never 100% certain).
For example, you might say that the mean life of a battery (at a 95% confidence level) is 100 to 110 hours. This tells you that 100 to 110 hours is a plausible range for the population mean: if you repeated the sampling many times, about 95% of the intervals built this way would contain the true mean. It does not tell you how long any individual battery will last.
In contrast, the prediction interval tells you where a single future value will fall a certain percentage of the time. A 95% prediction interval of 100 to 110 hours for the life of a battery tells you that a future battery will fall into that range about 95% of the time. There is a 5% chance that a battery will not fall into this interval.
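To make the difference concrete, here is a minimal Python sketch that computes both intervals from one sample. It assumes the lifetimes are roughly normally distributed, and the data and the hard-coded t critical value (taken from a t table for 9 degrees of freedom) are made up for illustration.

```python
import math
import statistics


def mean_confidence_interval(sample, t_crit):
    """Interval for the population mean: xbar +/- t * s / sqrt(n)."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


def prediction_interval(sample, t_crit):
    """Interval for ONE new observation: xbar +/- t * s * sqrt(1 + 1/n)."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)
    half_width = t_crit * s * math.sqrt(1 + 1 / n)
    return xbar - half_width, xbar + half_width


# Hypothetical battery lifetimes in hours (made-up data).
lifetimes = [98, 103, 107, 101, 110, 104, 99, 108, 105, 105]

# Assumed value: 0.975 quantile of the t distribution with n - 1 = 9
# degrees of freedom, read from a t table.
T_CRIT = 2.262

ci = mean_confidence_interval(lifetimes, T_CRIT)
pi = prediction_interval(lifetimes, T_CRIT)
print("95% CI for the mean:          ", ci)
print("95% PI for one new battery:   ", pi)
```

Both intervals are centered on the sample mean, but the prediction interval is always wider because it also has to cover the scatter of individual batteries around the mean.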
When to Use It
It’s very common to use the confidence interval in place of the prediction interval, especially in econometrics. However, you should use a prediction interval instead of a confidence interval if you want accurate predictions of individual values. Let’s say you calculate a confidence interval for the mean daily expenditure of your business and find it’s between $5,000 and $6,000. That tells you where the mean probably lies. If you treat that CI as if it were a prediction interval, your interval will be much too narrow. For example, the actual prediction interval might be $2,500 to $7,500 at the same confidence level. If you do use the confidence interval, individual values will fall outside it far more often than the stated confidence level suggests.
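A small simulation can show how badly a confidence interval performs when it is used to predict single values. The sketch below is only illustrative: it assumes daily expenditure is normally distributed with a made-up mean and standard deviation, and it hard-codes the t critical value for 29 degrees of freedom from a t table.

```python
import math
import random
import statistics

# Assumed value: 0.975 quantile of the t distribution with 29 degrees
# of freedom (sample size 30), read from a t table.
T_CRIT = 2.045


def simulate_coverage(trials=4000, n=30, mu=5500.0, sigma=1200.0, seed=0):
    """Return how often each interval contains one NEW observation.

    mu and sigma are hypothetical population values chosen for the demo.
    """
    rng = random.Random(seed)
    ci_hits = 0
    pi_hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.mean(sample)
        s = statistics.stdev(sample)
        ci_half = T_CRIT * s / math.sqrt(n)             # interval for the mean
        pi_half = T_CRIT * s * math.sqrt(1 + 1 / n)     # interval for a new value
        new_value = rng.gauss(mu, sigma)                # one future observation
        if abs(new_value - xbar) <= ci_half:
            ci_hits += 1
        if abs(new_value - xbar) <= pi_half:
            pi_hits += 1
    return ci_hits / trials, pi_hits / trials


ci_coverage, pi_coverage = simulate_coverage()
print(f"New value inside the 95% CI: {ci_coverage:.1%} of the time")
print(f"New value inside the 95% PI: {pi_coverage:.1%} of the time")
```

Under these assumptions the prediction interval should capture the new value close to 95% of the time, while the confidence interval captures it far less often, even though both are labeled "95%".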
How to Find a Prediction Interval
By hand, the formula is:
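For simple linear regression with a single predictor, a standard textbook form of the prediction interval for a new observation at $x_0$ is shown below. The notation is an assumption chosen for illustration: $\hat{y}_0$ is the fitted value at $x_0$, $n$ is the number of observations, $\bar{x}$ is the mean of the predictor, $s$ is the residual standard error, and $t_{\alpha/2,\,n-2}$ is the critical value of the t distribution with $n-2$ degrees of freedom.

$$
\hat{y}_0 \;\pm\; t_{\alpha/2,\,n-2}\; s\,\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}},
\qquad
s = \sqrt{\frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n-2}}
$$

Dropping the leading "1" under the square root gives the confidence interval for the mean response at $x_0$, which is why the prediction interval is always the wider of the two.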
You probably won’t want to use the formula though, as most statistical software will include the prediction interval in output for regression. Look for it next to the confidence interval in the output as 95% PI or similar wording.
- SPSS: Follow the instructions on page 3 of Linear Regression in SPSS, a PDF by Andy Chang of Youngstown State University (full reference below).
- Minitab: Click the “Options” tab on the Simple Regression dialog box, then check the PI option.
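If you have neither package at hand, the calculation for simple linear regression can be sketched in plain Python. This is an illustration under stated assumptions, not the method used by SPSS or Minitab: the function name, the data, and the hard-coded t critical value (from a t table for 8 degrees of freedom) are all hypothetical, and the usual regression assumptions (linear relationship, normal errors with constant variance) are taken for granted.

```python
import math


def regression_intervals(xs, ys, x0, t_crit):
    """Fit y = a + b*x by least squares and return, at x0:
    (fitted value, CI for the mean response, PI for a new observation)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual standard error with n - 2 degrees of freedom.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))

    y_hat = intercept + slope * x0
    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
    ci_half = t_crit * s * math.sqrt(leverage)        # mean response
    pi_half = t_crit * s * math.sqrt(1 + leverage)    # single new observation
    return (
        y_hat,
        (y_hat - ci_half, y_hat + ci_half),
        (y_hat - pi_half, y_hat + pi_half),
    )


# Hypothetical data (made up for the example).
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

# Assumed value: 0.975 quantile of the t distribution with n - 2 = 8
# degrees of freedom, read from a t table.
T_CRIT = 2.306

y_hat, ci, pi = regression_intervals(xs, ys, x0=5.5, t_crit=T_CRIT)
print("Predicted value at x0 = 5.5:", y_hat)
print("95% CI for the mean response:", ci)
print("95% PI for a new observation:", pi)
```

The prediction interval widens as x0 moves away from the mean of the observed x values, so predictions far outside the range of the data carry the most uncertainty.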
Guang-Hwa “Andy” Chang. Linear Regression in SPSS. Retrieved July 3, 2017 from: http://gchang.people.ysu.edu/SPSSE/SPSS_lab2Regression.pdf