
**Contents:**

MSE Definition

MSE Criterion

## Mean Squared Error Definition

The mean squared error (MSE) tells you how close a regression line is to a set of points. It does this by taking the vertical distances from the points to the regression line (these distances are the “errors”) and squaring them. Squaring removes any negative signs and also gives more weight to larger differences. It’s called the **mean** squared error because you’re finding the average of a set of squared errors.
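In symbols, one common way to write this definition is shown here. The notation is an assumption chosen to match the example that follows: n is the number of points, Y_i is an observed value, and Y'_i is the value the regression line predicts for the same X.

$$
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( Y_i - Y'_i \right)^2
$$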

## Mean Squared Error Example

General steps to calculate the mean squared error from a set of X and Y values:

- Find the regression line.
- Insert your X values into the linear regression equation to find the new Y values (Y’).
- Subtract the new Y value from the original to get the error.
- Square the errors.
- Add up the squared errors.
- Find the mean by dividing that sum by the number of points.

**Sample Problem:** Find the mean squared error for the following set of values: (43,41), (44,45), (45,49), (46,47), (47,44).

Step 1: Find the regression line. I used an online calculator and got the regression line y = 9.2 + 0.8x.
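If you'd rather check the line yourself than rely on a calculator, the standard least-squares formulas (slope = S_xy / S_xx, intercept = mean of Y minus slope times mean of X) can be coded in a few lines. This is a minimal sketch in plain Python; the function name `fit_line` is just an illustrative choice, not something from the original page.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Sample problem data
xs = [43, 44, 45, 46, 47]
ys = [41, 45, 49, 47, 44]

intercept, slope = fit_line(xs, ys)
print(f"y = {intercept:.1f} + {slope:.1f}x")
```

For the sample data this should agree with the calculator's result of y = 9.2 + 0.8x.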

Step 2: Find the new Y’ values:

9.2 + 0.8(43) = 43.6

9.2 + 0.8(44) = 44.4

9.2 + 0.8(45) = 45.2

9.2 + 0.8(46) = 46

9.2 + 0.8(47) = 46.8

Step 3: Find the error (Y – Y’):

41 – 43.6 = -2.6

45 – 44.4 = 0.6

49 – 45.2 = 3.8

47 – 46 = 1

44 – 46.8 = -2.8

Step 4: Square the errors:

(-2.6)^{2} = 6.76

(0.6)^{2} = 0.36

(3.8)^{2} = 14.44

(1)^{2} = 1

(-2.8)^{2} = 7.84

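This table shows the results so far:

| X | Y | Y’ | Error (Y – Y’) | Squared error |
|---|---|---|---|---|
| 43 | 41 | 43.6 | -2.6 | 6.76 |
| 44 | 45 | 44.4 | 0.6 | 0.36 |
| 45 | 49 | 45.2 | 3.8 | 14.44 |
| 46 | 47 | 46 | 1 | 1 |
| 47 | 44 | 46.8 | -2.8 | 7.84 |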

Step 5: Add all of the squared errors up: 6.76 + 0.36 + 14.44 + 1 + 7.84 = 30.4.

Step 6: Find the mean squared error:

30.4 / 5 = 6.08.
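The same arithmetic can be reproduced in a few lines of code. The sketch below simply mirrors Steps 2 through 6 in plain Python, taking the line y = 9.2 + 0.8x from Step 1 as given; the variable and function names are illustrative.

```python
xs = [43, 44, 45, 46, 47]
ys = [41, 45, 49, 47, 44]

def predict(x):
    # Regression line from Step 1
    return 9.2 + 0.8 * x

predicted = [predict(x) for x in xs]               # Step 2: new Y' values
errors = [y - p for y, p in zip(ys, predicted)]    # Step 3: Y - Y'
squared = [e ** 2 for e in errors]                 # Step 4: square the errors
total = sum(squared)                               # Step 5: add them up
mse = total / len(squared)                         # Step 6: find the mean

print(round(total, 2), round(mse, 2))  # prints: 30.4 6.08
```

Rounding is only there to hide floating-point noise in the last digits.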

## What does the Mean Squared Error Tell You?

The smaller the mean squared error, the closer you are to finding the line of best fit. Depending on your data, it may be impossible to get a very small value for the mean squared error. For example, the data above are scattered widely around the regression line, so 6.08 is as good as it gets: y = 9.2 + 0.8x is, in fact, the line of best fit. Note that I used an online calculator to get the regression line. Where the mean squared error really comes in handy is if you are finding an equation for the regression line by hand: you can try several equations, and the one that gives you the smallest mean squared error is the line of best fit.
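To illustrate the "try several equations" idea, the sketch below scores a handful of candidate lines against the sample data and picks the one with the smallest MSE. Only the first candidate comes from the worked example; the other three are hypothetical lines made up for comparison.

```python
xs = [43, 44, 45, 46, 47]
ys = [41, 45, 49, 47, 44]

def mse(intercept, slope):
    """Mean squared error of the line y = intercept + slope * x on the sample data."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# (intercept, slope) pairs; all but the first are hypothetical guesses
candidates = [(9.2, 0.8), (0.0, 1.0), (45.2, 0.0), (4.7, 0.9)]

for intercept, slope in candidates:
    print(f"y = {intercept} + {slope}x -> MSE = {mse(intercept, slope):.2f}")

# The candidate with the smallest MSE is the best fit among those tried
best = min(candidates, key=lambda c: mse(*c))
print("Best candidate:", best)
```

Among these four, y = 9.2 + 0.8x has the smallest MSE. Keep in mind that trial and error only finds the best of the lines you happened to try; the least-squares formulas find the best line overall.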

## MSE Criterion

Sometimes, a statistical model or estimator must be “tweaked” to get the best possible model or estimator. The MSE criterion is a tradeoff between (squared) bias and variance and is defined as:

“T is a minimum [MSE] estimator of θ if MSE(T, θ) ≤ MSE(T’, θ), where T’ is any alternative estimator of θ” (Panik).
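The tradeoff between squared bias and variance can be made explicit with the standard decomposition of an estimator's MSE. The notation below is an assumption, not taken from Panik: E denotes expected value, Var the variance, and Bias(T, θ) = E[T] − θ.

$$
\text{MSE}(T, \theta) = E\left[ (T - \theta)^2 \right] = \text{Var}(T) + \left[ \text{Bias}(T, \theta) \right]^2
$$

Read this way, an estimator with a little bias can still have a smaller MSE than an unbiased one if its variance is sufficiently lower.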

**References:**

Michael Panik. Endocrine Manifestations of Systemic Autoimmune Diseases.


I’m using the Time Series Decomposition Plot in Minitab to forecast stability program data. This plot gives me the MAPE, MAD, and MSD. Can I use these measures of accuracy to plot a range or intervals that account for the variability of the forecasted values?

Not really. Those statistics are only useful for comparing models, not plotting a range/intervals for individual models.