# Degrees of Freedom

Degrees of freedom are used in hypothesis testing.

**Contents**:

- What are Degrees of Freedom?
- DF: Two Samples
- Degrees of Freedom in ANOVA
- Why Do Critical Values Decrease While DF Increase?

## What are Degrees of Freedom?

The degrees of freedom of an estimate are **the number of independent pieces of information that went into calculating the estimate**. It’s not quite the same as the number of items in the sample. To get the df for this kind of estimate, you subtract 1 from the number of items. Let’s say you were finding the mean weight loss for a low-carb diet. You could use 4 people, giving 3 degrees of freedom (4 – 1 = 3), or you could use one hundred people with df = 99.

In math terms (where “n” is the number of items in your set):

Degrees of Freedom = n – 1

**Why do we subtract 1 from the number of items?** Another way to look at degrees of freedom is that they are **the number of values that are free to vary** in a data set. What does “free to vary” mean? Here’s an example using the mean (average):

**Q**. Pick a set of numbers that have a mean (average) of 10.

**A**. Some sets of numbers you might pick: 9, 10, 11 or 8, 10, 12 or 5, 10, 15.

Once you have chosen the first two numbers in the set, the third is fixed. In other words, **you can’t choose the third item in the set**. The only numbers that are free to vary are the first two. You can pick 9 + 10 or 5 + 15, but once you’ve made that decision you **must** choose a particular number that will give you the mean you are looking for. So degrees of freedom for a set of three numbers is TWO.
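The “free to vary” idea can be sketched in a few lines of code (a hypothetical example with made-up numbers): once the target mean is fixed and the first two values are chosen, the third is forced.

```python
# Target mean for a set of three numbers.
target_mean = 10

# The first two numbers are free to vary; pick any values.
a, b = 9, 5

# The third number is NOT free: it must bring the mean to exactly 10.
c = 3 * target_mean - (a + b)

print(c)                # the single value that works (16 here)
print((a + b + c) / 3)  # sanity check: always 10.0
```

Only two of the three values were chosen freely, which is exactly why df = 3 – 1 = 2 here.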

For example: if you wanted to find a confidence interval for a sample, degrees of freedom is n – 1. “N” can also be the number of classes or categories. See: Critical chi-square value for an example.


## Degrees of Freedom: Two Samples

If you have two samples and want to find a parameter, like the mean, you have two “n”s to consider (sample 1 and sample 2). Degrees of freedom in that case is:

Degrees of Freedom (Two Samples): (N_{1} + N_{2}) – 2.
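The two-sample formula is simple enough to compute directly; here is a quick sketch (the sample sizes are made up):

```python
def df_two_samples(n1, n2):
    """Degrees of freedom for a pooled two-sample test: (N1 + N2) - 2."""
    return n1 + n2 - 2

# Hypothetical samples of 10 and 12 items:
print(df_two_samples(10, 12))  # -> 20
```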

## Degrees of Freedom in ANOVA

Degrees of freedom becomes a little more complicated in ANOVA tests. Instead of a simple parameter (like finding a mean), ANOVA tests involve comparing known means in sets of data. For example, in a one-way ANOVA you are comparing two means in two cells. The grand mean (the average of the averages) would be:

(Mean _{1} + Mean _{2}) / 2 = grand mean.

What if you chose mean _{1} and you knew the grand mean? You wouldn’t have a choice about Mean_{2}, so your degrees of freedom for a two-group ANOVA is 1.

Two Group ANOVA df1 = k – 1 (where “k” is the number of groups; for two groups, df1 = 1)

For a three-group ANOVA, you can vary two means so degrees of freedom is 2.

It’s actually a *little* more complicated, because there are **two** degrees of freedom in ANOVA: df1 and df2. The explanation above is for df1. Df2 in ANOVA is the total number of observations in all cells minus the degrees of freedom lost because the cell means are set.

Two Group ANOVA df2 = n – k

The “k” in that formula is the number of cell means or groups/conditions.

For example, let’s say you had 200 observations and four cell means. Degrees of freedom in this case would be: Df2 = 200 – 4 = 196.
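Both ANOVA df formulas can be wrapped up in a short sketch; the numbers below are the 200-observation, 4-group example above.

```python
def anova_df(n, k):
    """ANOVA degrees of freedom.

    df1 (between groups) = k - 1
    df2 (within groups)  = n - k
    where n = total observations in all cells, k = number of groups.
    """
    return k - 1, n - k

df1, df2 = anova_df(n=200, k=4)
print(df1, df2)  # -> 3 196
```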


## Why Do Critical Values Decrease While DF Increase?

*Thanks to Mohammed Gezmu for this question.*

Let’s take a look at the t-score formula in a hypothesis test:

t = (x̄ – μ) / (s / √n)

When n increases, the t-score goes up. This is because of the square root in the denominator: as n gets larger, the fraction s/√n gets smaller, so the t-score (the result of another fraction) gets bigger. As the degrees of freedom are defined above as n – 1, you would think the t-critical value should get bigger too, but it doesn’t: it gets *smaller*. This seems counter-intuitive.

However, **think about what a t-test is actually for**. You’re using the t-test because you don’t know the standard deviation of your population and therefore you don’t know the shape of your graph. It could have short, fat tails. It could have long, skinny tails. You just have no idea. The degrees of freedom affect the shape of the graph in the t-distribution; as the df get larger, the area in the tails of the distribution gets smaller. As df approaches infinity, the t-distribution will look like a normal distribution. When this happens, you can be certain of your standard deviation (which is 1 on the standard normal distribution).

Let’s say you took repeated sample weights from four people, drawn from a population with an unknown standard deviation. You measure their weights, calculate the mean difference between the sample pairs and repeat the process over and over. The tiny sample size of 4 will result in a t-distribution with fat tails. The fat tails tell you that you’re more likely to have extreme values in your sample. You test your hypothesis at an alpha level of 5%, which **cuts off the last 5% of your distribution**. For this t-distribution, that cut-off gives a critical value of 2.6. (**Note**: I’m using a hypothetical t-distribution here as an example – the CV is not exact).

Now look at the normal distribution. We have less chance of extreme values with the normal distribution. Our 5% alpha level cuts off at a CV of 2.

Back to the original question, “Why do critical values decrease while df increase?” Here’s the short answer:

Degrees of freedom are related to sample size (n – 1). If the df increase, the sample size is also increasing; the graph of the t-distribution will have skinnier tails, pushing the critical value toward the mean.
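You can watch the critical value shrink toward the normal cutoff with a few lines of SciPy (assuming `scipy` is installed; `t.ppf` is the inverse CDF, i.e. the critical-value lookup):

```python
from scipy.stats import norm, t

# One-tailed critical values at alpha = 0.05 as the df grow.
for df in (3, 10, 30, 100):
    print(df, round(t.ppf(0.95, df), 3))

# The limiting case: the normal distribution's 5% cutoff (about 1.645).
print("normal", round(norm.ppf(0.95), 3))
```

Each critical value is smaller than the last, and they approach the normal distribution’s cutoff as df grows.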



The example really helps clarify the meaning of degrees of freedom. Could you also discuss *why* it is necessary to account for these in various statistical tests?

Hi, Sarah,

Thanks for your comment. I’m not sure I completely understand your question. I attempted to explain why in the first paragraph; if you have x numbers, then your test result will vary if you have, say x+1 numbers or x+10 numbers. Could you tell me what you think needs expanding?

Thanks :)

Thank you for your examples. Why do critical values decrease while df increase?

Hello, Mohammed,

Thanks for your question. I have added it to the article (please see the article for the full explanation). The short answer: Degrees of freedom are related to sample size (n-1). If the df increases, it also stands that the sample size is increasing; the graph of the t-distribution will have skinnier tails, pushing the critical value towards the mean.

Regards,

Stephanie

Dear Stephanie, Thank you very much for your consideration. I read all the ideas included and got some insight into my question.

Hi All,

I have a question with regards to degree of freedom.

I have two samples: one with a sample size of 1926 and a standard error of estimate equal to 18.9;

the other with a sample size of 41 and a standard error of estimate equal to 11.11.

I was wondering, since the sample sizes are different, is there any way that I can compare these two results?

They are the result of fitting regression models to observed and predicted data for two different sample sizes.

“Is there any way that I can compare these two results?”

What are you comparing? A mean?

No I am comparing based on my standard deviation.

I don’t see any reason why you couldn’t compare them, as long as your sample sizes are sufficient.

Thank you for your reply.

The part that I don’t know is how I can compare them, since the sample sizes are different.

For example, if I need to consider their df for this comparison, how should I do that?

I am pretty new to the F-test.

If the sample size you chose for both sets is sound (e.g. you used random sampling), then you can compare the results from both sets without having to adjust for sample size.

Thank you Andale for your response.

But would that be correct? I mean, if I have 100 samples, won’t it give me a lower SD compared to the one if I had 20 samples?

I am concerned about having n – 2 in my denominator, which will affect the result.

Standard deviation is the spread of scores. It’s not affected by sample size. Here’s an example: if you survey 1000 people, their average IQ score should be around 100 with a standard deviation of 15. If you survey 200, you’ll get the same standard deviation.

Why does the z-test not depend on degrees of freedom the way the t-test does?

There is only one z-distribution. The t-distribution has many shapes, depending on the sample size (degrees of freedom).

Why is it that in a chi-square goodness of fit test, e.g. for a 9:3:3:1 ratio, you have number of categories minus 1 df (so 3 df), but if you are doing a chi-square test for independence in a 2×2 table you have (number of rows – 1)(number of columns – 1) df, so 1 df? There are 4 categories in a 2×2 table, so why not 4 – 1 = 3 df? You can put 9:3:3:1 into a 2×2 table, so why not 1 df?

The 2×2 table doesn’t have four categories in that sense: each variable usually has two categories (one across the top row and one down the side).

Sorry – don’t get that. If you are testing whether men and women have different preferences for dogs and cats you would have:

|       | cat | dog |
|-------|-----|-----|
| women |     |     |
| men   |     |     |

So 2 rows × 2 columns, (2 – 1)(2 – 1) = 1 df in a chi-square test of independence. That looks like 4 categories to me: 1) women who prefer cats, 2) women who prefer dogs, 3) men who prefer cats and 4) men who prefer dogs. So why isn’t it 4 – 1 df = 3?

I cannot see how that is different from a genetics test for a 9:3:3:1 ratio, something like:

|        | round | wrinkled |
|--------|-------|----------|
| green  |       |          |
| yellow |       |          |

Test of goodness of fit – 4 categories – round green, wrinkled green, round yellow, wrinkled yellow

df is 4 – 1 = 3, but I can’t see why it isn’t (rows – 1) × (columns – 1), i.e. (2 – 1)(2 – 1) = 1.

“Women who prefer dogs” or “men who prefer cats” aren’t categories. They are joint frequencies.

For a more mathematical explanation of why a 2×2 table has one df, see this explanation (second paragraph).
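A quick way to see the two df rules side by side is SciPy (the counts below are made up): `chi2_contingency` reports (rows – 1)(columns – 1) df for the independence test, while a goodness-of-fit test treats the same four cells as one flat list with categories – 1 df.

```python
from scipy.stats import chi2_contingency, chisquare

# Made-up counts for the cat/dog preference table.
table = [[30, 20],   # women: cats, dogs
         [15, 35]]   # men:   cats, dogs

chi2, p, dof, expected = chi2_contingency(table)
print(dof)  # -> 1, i.e. (2 - 1) * (2 - 1): row AND column totals are fixed

# Goodness of fit treats the same four counts as a flat list of
# categories, so df = 4 - 1 = 3 (only the grand total is fixed).
observed = [30, 20, 15, 35]
stat, p_gof = chisquare(observed, f_exp=[25, 25, 25, 25])
```

The difference in df comes from how many totals are pinned down: the independence test fixes every row and column total, while the goodness-of-fit test fixes only the grand total.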

I am sorry but I still don’t get it. What is the difference between women (as opposed to men) who prefer cats to dogs, and round peas (as opposed to wrinkled) that are green not yellow? Why is one a joint frequency and the other not?

I’m in a first-year college stats class and currently on the chapter about comparing two means. The prior chapter gave n – 1 as the degrees of freedom formula. However, this next chapter says you can’t use that, but because the formula is so complicated you can use technology to calculate it. For the life of me I cannot figure out how to calculate the degrees of freedom on the TI-83. I’m not looking for a probability; I need the actual degrees of freedom number, since it is not given and I can’t use n – 1. And apparently (N1 + N2) – 2 can’t be used either, otherwise why would the text suggest using technology to calculate the df? Super confused.

When it says “the formula is so complicated you can use technology to…” I suspect they are referring to a specific formula that uses the df (as opposed to the df itself). Can you send a screenshot of the text? andalepublishing at gmail.

It’s not allowing me to paste it into this comment box. Is there an email I should send it to?

Thank you,

Julie

What could be the probable reason for the data below?

| Source | DF | Seq SS | Adj SS | Adj MS | F-value | P-value |
|--------|----|--------|--------|--------|---------|---------|
| Main Effects | 2 | 1.00000 | 1.00000 | 0.50000 | * | * |
| Talc Concentration | 1 | 1.00000 | 1.00000 | 1.00000 | * | * |
| Lubrication Time | 1 | 0.00000 | 0.00000 | 0.00000 | * | * |
| 2-Way Interactions | 1 | 1.00000 | 1.00000 | 1.00000 | * | * |
| Talc Concentration*Lubrication Time | 1 | 1.00000 | 1.00000 | 1.00000 | * | * |
| Curvature | 1 | 1.71429 | 1.71429 | 1.71429 | * | * |
| Residual Error | 2 | 0.00000 | 0.00000 | 0.00000 | | |
| Pure Error | 2 | 0.00000 | 0.00000 | 0.00000 | | |
| Total | 6 | 3.71429 | | | | |

Which part of the data?