

## Adjusted R2: Overview

Adjusted R^{2} is a special form of R^{2}, the coefficient of determination.

R^{2} shows how well terms (data points) fit a curve or line. Adjusted R^{2} also indicates how well terms fit a curve or line, but adjusts for the number of terms in a model. If you add more and more **useless** variables to a model, adjusted r-squared will decrease. If you add more **useful** variables, adjusted r-squared will increase.

Adjusted R^{2} will always be less than or equal to R^{2}. You only need adjusted R^{2} when working with **samples**. In other words, the adjustment isn’t necessary when you have data from an entire population.

The formula is:

Adjusted R^{2} = 1 − [(1 − R^{2})(N − 1) / (N − K − 1)]

where:

- N is the number of points in your data sample.
- K is the number of independent regressors, i.e. the number of variables in your model, excluding the constant.

If you already know R^{2}, it’s a fairly simple formula to work. However, if you do not already have R^{2}, you’ll probably not want to calculate this by hand! (If you must, see How to Calculate the Coefficient of Determination.) There are many statistical packages that can calculate adjusted r squared for you. Adjusted r squared is given as part of Excel regression output. See: Excel regression analysis output explained.
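If you do want to work the formula yourself, it translates directly into a few lines of code. Here is a minimal Python sketch (the function name and the sample figures are just for illustration):

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 from plain R^2, sample size n, and k regressors
    (k excludes the constant term)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Example: R^2 = 0.8 from a sample of 30 points with 3 regressors.
print(round(adjusted_r2(0.8, 30, 3), 4))  # 0.7769
```

Note that the result (0.7769) is a little lower than the plain R^{2} of 0.8, as expected.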

## Meaning of Adjusted R2

Both R^{2} and the adjusted R^{2} give you an idea of how many data points fall within the line of the regression equation. However, there is **one main difference** between R^{2} and the adjusted R^{2}: R^{2} assumes that every single variable explains the *variation in the dependent variable*. The adjusted R^{2} tells you the percentage of *variation explained by only the independent variables that actually affect the dependent variable*.

## How Adjusted R2 Penalizes You

The adjusted R^{2} will penalize you for adding independent variables (K in the equation) that do not fit the model. Why? In regression analysis, it can be tempting to add more variables to the data as you think of them. Some of those variables will be significant, but you can’t be sure that the significance isn’t just by chance. The adjusted R^{2} compensates for this by penalizing you for those extra variables.
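You can see the penalty in action with a small simulation. This sketch (assuming NumPy; the data and the `r2_and_adjusted` helper are invented for illustration) fits an ordinary least-squares model, then adds a pure-noise predictor. Plain R^{2} can only go up when a term is added, while adjusted R^{2} will usually go down:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)          # y truly depends only on x

def r2_and_adjusted(X, y):
    """R^2 and adjusted R^2 for an OLS fit that includes a constant."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    k = X1.shape[1] - 1                   # regressors, excluding the constant
    adj = 1 - (1 - r2) * (len(y) - 1) / (len(y) - k - 1)
    return r2, adj

useless = rng.normal(size=n)              # noise, unrelated to y
r2_1, adj_1 = r2_and_adjusted(x.reshape(-1, 1), y)
r2_2, adj_2 = r2_and_adjusted(np.column_stack([x, useless]), y)

print(r2_2 >= r2_1)   # True: plain R^2 never decreases when a term is added
```

Adjusted R^{2} stays at or below plain R^{2} in both fits, which is exactly the penalty described above.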

While **values are usually positive**, they can be **negative** as well. This can happen if your R^{2} is close to zero; after the adjustment, the value can dip below zero. This usually indicates that your model is a poor fit for your data. Other problems with your model can also cause sub-zero values, such as not putting a constant term in your model.
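For a concrete (made-up) example of how the value can go negative, plug an R^{2} of zero into the formula:

```python
# Hypothetical numbers: a 20-point sample, 3 regressors, and R^2 = 0.
n, k, r2 = 20, 3, 0.0
adjusted = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(adjusted)  # -0.1875
```

The penalty factor (N − 1)/(N − K − 1) is greater than 1, so when R^{2} is zero the whole expression drops below zero.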

### Problems with R2 that are corrected with an adjusted R2

- R^{2} increases with every predictor added to a model. As R^{2} always increases and never decreases, it can appear to be a better fit with the more terms you add to the model. This can be completely misleading.
- Similarly, if your model has too many terms and too many high-order polynomials, you can run into the problem of over-fitting the data. When you over-fit data, a misleadingly high R^{2} value can lead to misleading projections.
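The over-fitting point can be demonstrated with a quick sketch (assuming NumPy; the data are invented). A degree-9 polynomial through 10 noisy points pushes R^{2} to essentially 1 by chasing the noise, even though the true relationship is a straight line:

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0.0, 1.0, 10)
y = x + rng.normal(scale=0.1, size=10)    # truly linear, plus noise

def r2_of_polyfit(degree):
    """R^2 of a least-squares polynomial fit of the given degree."""
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    return 1 - (resid @ resid) / (((y - y.mean()) ** 2).sum())

print(r2_of_polyfit(1))   # a reasonable fit
print(r2_of_polyfit(9))   # essentially 1: the polynomial interpolates the noise
```

The near-perfect R^{2} of the degree-9 fit says nothing about how well the model would project onto new data — which is exactly the misleading behavior adjusted R^{2} is designed to flag.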

If you prefer an online interactive environment to learn R and statistics, this free R Tutorial by Datacamp is a great way to get started. If you’re somewhat comfortable with R and are interested in going deeper into statistics, try this Statistics with R track.

Thank you for the useful information about adjusted R squared.

Very well presented, simple and clear. And with a great accent! Statistics is more fun to learn when accents are involved. Thanks

Thank you very much for the useful explanation regarding R squared and adjusted R squared. I’m not quite sure how to interpret the adjusted R squared when it’s reported for just one estimated regression model. I know how to compare adjusted R squared across two different regression models, since the adjustment accounts for degrees of freedom, but I don’t understand how to comment on a small difference between R squared and adjusted R squared for a single estimated model, without changing any independent variables. You made the difference between R squared and adjusted R squared clear. Thank you very much!

Thank you for the useful information.

Using the formula for adjusted R squared, I have obtained a different answer from the one computed using the SPSS package. However, the difference is negligible. I don’t know why?

Without seeing your data, I can’t really say. I’d have a colleague check your inputs.

Thanks for your useful information, I am very grateful.

Hi! I have a question. I have a set of four variables and a lot of observations. When I remove a single extreme observation of one variable, my R squared falls for every variable, as does the adjusted R squared.

What could explain that?

I’d have to look at your data to say for sure. But at first glance, I’d say that observation makes for a better-fitting model. Just because the model fits your data really well doesn’t mean that it’s a good model :)

So this means (what I infer): if we add more and more variables to the data when fitting, then the linear regression fit will adjust the value of the coefficient of determination, and it becomes adjusted R^2.

Am I right?

Hello, Avaneesh,

That’s not correct. The adjustment doesn’t happen automatically. The formulas are different, so you either choose to use r-squared or you choose to use adjusted r-squared.

In a two-model situation, if the coefficient of determination of one model is greater than that of the other model, how should this be interpreted?

The model with the higher coefficient of determination has a better goodness of fit.