## What is a Residual in Regression?

When you perform simple linear regression (or any other type of regression analysis), you get a line of best fit. The data points usually don’t fall *exactly* on this regression line; they are scattered around it. A residual is the vertical distance between a data point and the regression line. Each data point has one residual. A residual is positive if the point is above the regression line and negative if the point is below it. If the regression line actually passes through the point, the residual at that point is zero.

As residuals are the difference between a data point and the regression line, they are sometimes called “**errors**.” Error in this context doesn’t mean that there’s something wrong with the analysis; it just means that there is some unexplained difference. In other words, the residual is the part of the observed value that isn’t explained by the regression line.

The residual (*e*) can also be expressed with an **equation**. The *e* is the difference between the observed value (y) and the predicted value (ŷ). The scatter plot is the set of observed data points, while the regression line gives the predictions.

**Residual = Observed value – predicted value**

e = y – ŷ
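The formula can be tried out on a small data set. The following is a minimal sketch, assuming NumPy is available; the `x` and `y` values are made up for illustration and are not from any real study. It fits a least-squares line with `np.polyfit`, then computes one residual per data point as observed minus predicted.

```python
import numpy as np

# Hypothetical sample data, invented for this illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

# Fit a least-squares line: y_hat = slope * x + intercept
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept

# Residual = observed value - predicted value (e = y - y_hat)
residuals = y - y_hat

# Report whether each point sits above or below the fitted line
for xi, yi, ei in zip(x, y, residuals):
    position = "above" if ei > 0 else "below" if ei < 0 else "on"
    print(f"x={xi:.0f}  y={yi:.1f}  residual={ei:+.3f}  ({position} the line)")
```

Points with a positive residual lie above the fitted line and points with a negative residual lie below it, matching the sign convention described earlier.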

## The Sum and Mean of Residuals

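The sum of the residuals always equals zero, assuming that your line is actually the line of “best fit” (that is, a least-squares line that includes an intercept term). The reason involves a little algebra: least squares chooses the intercept so that the derivative of the sum of squared residuals with respect to the intercept is zero, and that condition works out to exactly “the residuals sum to zero.” The mean of the residuals is also equal to zero, because the mean is the sum of the residuals divided by the number of data points. The sum is zero, so 0/n will always equal zero.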
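A quick numerical check can make this concrete. The sketch below assumes NumPy and uses made-up data; the "shifted" line is a hypothetical comparison line (the fitted line moved up by 1), included only to show that the zero-sum property belongs to the best-fit line and not to any arbitrary line.

```python
import numpy as np

# Hypothetical sample data, invented for this illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

# Least-squares line with an intercept
slope, intercept = np.polyfit(x, y, deg=1)
res_fit = y - (slope * x + intercept)

# A line that is NOT the best fit: same slope, intercept raised by 1
res_shifted = y - (slope * x + intercept + 1.0)

sum_fit = res_fit.sum()
mean_fit = res_fit.mean()
sum_shifted = res_shifted.sum()

# Floating-point arithmetic gives values extremely close to zero rather than exactly zero
print(f"best-fit line: sum={sum_fit:.2e}, mean={mean_fit:.2e}")
print(f"shifted line:  sum={sum_shifted:.2f}")
```

For the best-fit line, the sum and mean come out as zero up to floating-point rounding. For the shifted line, every residual is 1 lower, so the sum is no longer zero.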
