Previous article: Excel 2013 Regression Analysis How To
Watch the video or read the steps below:
Excel Regression Analysis Output Explained
In the previous article, I explained how to perform Excel regression analysis. After you’ve gone through the steps, Excel will spit out your results, which will look something like this:
Excel Regression Analysis Output Explained: Multiple Regression
Here’s a breakdown of what each piece of information in the output means:
EXCEL REGRESSION ANALYSIS OUTPUT PART ONE: REGRESSION STATISTICS
- Multiple R. This is the correlation coefficient. It tells you how strong the linear relationship is. For example, a value of 1 means a perfect positive relationship and a value of zero means no relationship at all. It is the square root of r squared (see #2).
- R squared. This is r2, the Coefficient of Determination. It tells you how many points fall on the regression line. for example, 80% means that 80% of the variation of y-values around the mean are explained by the x-values. In other words, 80% of the values fit the model.
- Adjusted R square. The adjusted R-square adjusts for the number of terms in a model. You’ll want to use this instead of #2 if you have more than one x variable.
- Standard Error of the regression: An estimate of the standard deviation of the error μ. This is not the same as the standard error in descriptive statistics! The standard error of the regression is the precision that the regression coefficient is measured; if the coefficient is large compared to the standard error, then the coefficient is probably different from 0.
- Observations. Number of observations in the sample.
EXCEL REGRESSION ANALYSIS OUTPUT EXPLAINED PART TWO: ANOVA
- SS = Sum of Squares.
- Regression MS = Regression SS / Regression degrees of freedom.
- Residual MS = mean squared error (Residual SS / Residual degrees of freedom).
- F: Overall F test for the null hypothesis.
- Significance F: The significance associated P-Value.
The second part of output you get in Excel is rarely used, compared to the regression output above. It splits the sum of squares into individual components (see: Residual sum of squares), so it can be harder to use the statistics in any meaningful way. If you’re just doing basic linear regression (and have no desire to delve into individual components) then you can skip this section of the output.
For example, to calculate R2 from this table, you would use the following formula:
R2 = 1 – residual sum of squares (SS Residual) / Total sum of squares (SS Total).
In the above table, residual sum of squares = 0.0366 and the total sum of squares is 0.75, so:
R2 = 1 – 0.0366/0.75=0.9817
EXCEL REGRESSION ANALYSIS PART THREE: INTERPRET REGRESSION COEFFICIENTS
This section of the table gives you very specific information about the components you chose to put into your data analysis. Therefore the first column (in this case, House / Square Feet) will say something different, according to what data you put into the worksheet. For example, it might say “height”, “income” or whatever variables you chose.
The columns are:
- Coefficient: Gives you the least squares estimate.
- Standard Error: the least squares estimate of the standard error.
- T Statistic: The T Statistic for the null hypothesis vs. the alternate hypothesis.
- P Value: Gives you the p-value for the hypothesis test.
- Lower 95%: The lower boundary for the confidence interval.
- Upper 95%: The upper boundary for the confidence interval.
The most useful part of this section is that it gives you the linear regression equation:
y = mx + b.
y = slope * x + intercept.
For the above table, the equation would be approximately:
y = 3.14 – 0.65X1 + 0.024X2.
If you prefer an online interactive environment to learn R and statistics, this free R Tutorial by Datacamp is a great way to get started. If you’re are somewhat comfortable with R and are interested in going deeper into Statistics, try this Statistics with R track.