Explained Variance / Variation

What is Explained Variance?

Explained variance (also called explained variation) measures the proportion of the total variance in the observed data that a model accounts for. In other words, it’s the part of the total variance that is attributable to the factors in the model rather than to error variance.

A higher percentage of explained variance indicates a stronger association. It also means that the model makes better predictions (Rosenthal & Rosenthal, 2011).

r² = R² = η²

Explained variance can be denoted with r². In ANOVA, it’s called eta squared (η²), and in regression analysis, it’s called the coefficient of determination (R²). The three terms are basically synonymous, except that R² assumes that changes in the dependent variable are due to a linear relationship with the independent variable; η² does not have this underlying assumption.

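In ANOVA, explained variance is calculated as the eta-squared (η²) ratio of the between-groups sum of squares (SS_between) to the total sum of squares (SS_total): η² = SS_between / SS_total. It’s the proportion of the total variance that is attributable to differences between groups.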
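
As a rough illustration of the η² ratio, the following Python sketch computes SS_between and SS_total for a small one-way layout. The function name eta_squared and the three groups of scores are made up for this example; they do not come from the cited sources.

```python
def eta_squared(groups):
    """Return SS_between / SS_total for a list of groups of numeric observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)

    # Total variation: squared distance of every observation from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    # Between-group variation: squared distance of each group mean from the
    # grand mean, weighted by the group size.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


# Hypothetical scores for three treatment groups (illustrative numbers only).
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
eta2 = eta_squared(groups)
print(round(eta2, 3))  # 0.9
```

Here SS_between is 54 and SS_total is 60, so 90% of the variation in these made-up scores is attributable to group membership.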

R² in regression has a similar interpretation: what proportion of variance in Y can be explained by X (Warner, 2013).
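
The regression interpretation can be sketched the same way. The example below fits a least-squares line to a handful of invented (x, y) pairs and computes R² as 1 − SS_residual / SS_total; for a single predictor this equals the squared Pearson correlation r². The function name and the data are assumptions chosen for illustration.

```python
def simple_r_squared(x, y):
    """Fit y = a + b*x by least squares; return (R^2, squared Pearson r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Unexplained variation: squared residuals around the fitted line.
    ss_residual = sum(
        (yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)
    )
    r2 = 1 - ss_residual / ss_total
    pearson_r_squared = sxy ** 2 / (sxx * ss_total)
    return r2, pearson_r_squared


# Invented data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r2, pearson_r_squared = simple_r_squared(x, y)
print(round(r2, 3), round(pearson_r_squared, 3))  # 0.6 0.6
```

In this toy data set, X explains 60% of the variance in Y.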

The Problems with Multiple Predictors

In general, the more predictor variables you add, the higher the explained variance (in least-squares regression, R² never decreases when a predictor is added). The amount of overlapping variance (the variance explained by more than one predictor) also increases. However, there comes a point of diminishing returns, where adding new predictors makes it impossible to tell which predictor is producing which result. Furthermore, if you add two highly correlated predictors to a model, you introduce the possibility of multicollinearity.
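
The following Python sketch shows why a rising explained variance is not, on its own, evidence of a better model. It simulates a response that depends on one predictor, then adds up to three pure-noise predictors and refits by ordinary least squares each time. The data are simulated, and the helper names (solve, r_squared) are hypothetical; the fit uses the normal equations only to keep the example dependency-free, not because that is the recommended numerical method.

```python
import random


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of a least-squares fit of y on the predictor columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_residual / ss_total


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50
x1 = [rng.gauss(0, 1) for _ in range(n)]
y = [2 * a + rng.gauss(0, 1) for a in x1]  # y truly depends on x1 only
noise = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(3)]  # unrelated to y

# Fit with x1 alone, then with 1, 2, and 3 extra noise predictors.
r2_values = [r_squared([x1] + noise[:k], y) for k in range(4)]
for k, r2 in enumerate(r2_values):
    print(f"{k} noise predictors: R^2 = {r2:.4f}")
```

Each added column can only hold R² steady or nudge it upward, even though none of the noise predictors has any real relationship with the response.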

On the other hand, including too few predictors can also pose a problem: omitting a predictor variable that could explain some of the variance results in bias. Therefore, a careful balance must be struck between too many predictors and too few.
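
The bias from omitting a predictor can be illustrated with a small simulation. In this sketch, the response depends on two correlated predictors, each with a true coefficient of 1, but the model is fitted with only the first. All numbers, variable names, and the helper function slope are assumptions made for the example.

```python
import random


def slope(x, y):
    """Least-squares slope of y regressed on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 2000
x1 = [rng.gauss(0, 1) for _ in range(n)]
# x2 is correlated with x1 (this correlation is what causes the bias).
x2 = [0.8 * a + 0.6 * rng.gauss(0, 1) for a in x1]
# True model: both predictors have a coefficient of 1.
y = [a + b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

# Fit y on x1 alone, omitting x2.
short_slope = slope(x1, y)
print(f"slope on x1 with x2 omitted: {short_slope:.2f} (true coefficient: 1)")
```

Because x2 is left out, the slope on x1 absorbs part of x2's effect and lands near 1.8 rather than the true value of 1.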

References

Rosenthal, G. & Rosenthal, J. (2011). Statistics and Data Interpretation for Social Work. Springer Publishing Company.
Warner, R. (2013). Applied Statistics: From Bivariate Through Multivariate Techniques. SAGE.
