Problems with Unequal Sample Sizes
Unequally sized groups are common in research and may be the result of simple randomization, planned differences in group size or study dropouts. Unequal sample sizes can lead to:
- Unequal variances between samples, which affects the assumption of equal variances in tests like ANOVA. Having both unequal sample sizes and variances dramatically affects statistical power and Type I error rates (Rusticus & Lovato, 2014).
- A general loss of power. Equal-sized groups maximize statistical power.
- Issues with confounding variables.
Where exactly this starts to matter isn’t clear. Keppel (1993) states that there doesn’t seem to be a rule of thumb for a magic cut-off point. That said, you don’t need equally sized groups to calculate accurate statistics, and most software will adjust for differences.
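To illustrate the loss of power mentioned in the list above, here is a minimal Python sketch. It assumes two groups with a common standard deviation and a fixed total of 100 participants (both hypothetical values), and compares the standard error of the difference in means across several ways of splitting that total. A larger standard error means less power to detect the same true difference.

```python
import math

def se_mean_difference(n1, n2, sigma=1.0):
    """Standard error of the difference between two group means,
    assuming both groups share the same standard deviation sigma."""
    return sigma * math.sqrt(1.0 / n1 + 1.0 / n2)

# Hypothetical splits of 100 participants into two groups.
splits = [(50, 50), (60, 40), (70, 30), (80, 20), (90, 10)]

for n1, n2 in splits:
    se = se_mean_difference(n1, n2)
    print(f"{n1:>2} / {n2:<2}  SE = {se:.3f}")
```

Under these assumptions the 50/50 split gives the smallest standard error (0.200), while the 90/10 split gives 0.333, even though the total sample size is the same.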
Some tests are set up specifically to deal with the problem of unequal sample sizes and unequal variances:
- Dunnett’s T3 or Dunnett’s C can be used for pairwise comparisons. Use T3 for small samples, and C for larger samples.
- Games-Howell Pairwise Comparison Test: an extension of the Tukey-Kramer test to handle unequal variances. Although it has more power (narrower confidence intervals) than Dunnett’s tests, alpha inflation can be a problem.
- Tamhane’s T2: combines Sidak’s multiplicative inequality test with Welch’s approximate solution.
- Welch’s Test for Unequal Variances is a modified Student’s t-test. The modified degrees of freedom tend to increase the test power for samples with unequal variances.
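As a rough illustration of the last item in the list above, the following Python sketch computes Welch’s t statistic and the Welch-Satterthwaite approximation to the degrees of freedom using only the standard library. The function name `welch_t` and the two samples are made up for this example; a p-value would additionally require the t distribution, which a statistics package would normally supply.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variances t-test.

    df uses the Welch-Satterthwaite approximation.
    """
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group
    # (sample variance divided by group size).
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical data: unequal group sizes and visibly different spreads.
group_a = [20.1, 22.3, 19.8, 21.5, 20.9, 22.0, 21.1, 20.4]
group_b = [18.2, 25.1, 21.7, 16.9]

t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

The approximate degrees of freedom always fall between the smaller group size minus one and the pooled value n1 + n2 - 2, and are usually not a whole number.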
For unequal sample sizes that have equal variance, the following parametric post hoc tests can be used. All are considered conservative (Shingala et al., 2015):
- Dunnett’s test,
- Fisher’s test,
- Gabriel’s test,
- Hochberg’s GT2,
- Sidak’s test,
- Scheffe’s test,
- Tukey-Kramer test.
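As one example of how these tests accommodate unequal group sizes, the Tukey-Kramer statistic for comparing groups i and j can be written as follows. The notation is an assumption made for this sketch: the group means are written with bars, MSE is the within-group mean square from the ANOVA, k is the number of groups and N is the total sample size.

```latex
q_{ij} = \frac{\bar{y}_i - \bar{y}_j}
              {\sqrt{\dfrac{\mathrm{MSE}}{2}\left(\dfrac{1}{n_i} + \dfrac{1}{n_j}\right)}},
\qquad \text{compared against } q_{\alpha;\,k,\,N-k}
```

When all groups have the same size n, the denominator reduces to the square root of MSE / n, which is the original Tukey form.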
Nonparametric options for unequal sample sizes are:
- Dunn pairwise,
- Dunn control,
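The idea behind Dunn’s pairwise comparisons can be sketched in a few lines of Python. This is a simplified illustration, not a full implementation: it assumes there are no tied values (so no tie correction is applied), the function name `dunn_z` and the data are hypothetical, and the resulting z statistics would still need p-values with a multiple-comparison adjustment such as Bonferroni.

```python
import math
from itertools import combinations

def dunn_z(groups):
    """Dunn's z statistic for every pair of groups.

    groups: dict mapping group name -> list of observations.
    Assumes no tied values, so no tie correction is applied.
    """
    pooled = sorted(v for values in groups.values() for v in values)
    # Rank 1 for the smallest pooled value; only valid without ties.
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    n_total = len(pooled)
    mean_rank = {
        name: sum(rank[v] for v in values) / len(values)
        for name, values in groups.items()
    }
    z = {}
    for a, b in combinations(groups, 2):
        # The standard error depends on each pair's group sizes,
        # which is how unequal sample sizes are handled.
        se = math.sqrt(
            n_total * (n_total + 1) / 12.0
            * (1.0 / len(groups[a]) + 1.0 / len(groups[b]))
        )
        z[(a, b)] = (mean_rank[a] - mean_rank[b]) / se
    return z

# Hypothetical data with unequal group sizes and no ties.
data = {
    "A": [1.2, 2.4, 3.1],
    "B": [4.8, 5.5, 6.3, 7.7, 8.9],
    "C": [3.6, 9.4, 10.2, 11.0],
}
for pair, value in dunn_z(data).items():
    print(pair, round(value, 3))
```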
References
Hochberg, Y. & Tamhane, A. (1987). Multiple Comparison Procedures. John Wiley & Sons.
Keppel, G. (1993). Design and Analysis: A Researcher’s Handbook. Pearson.
Parra-Frutos, I. (2013). Testing homogeneity of variances with unequal sample sizes. Computational Statistics, 28, 1269. doi:10.1007/s00180-012-0353-x.
Rusticus, S. & Lovato, C. (2014). Impact of Sample Size and Variability on the Power and Type I Error Rates of Equivalence Tests: A Simulation Study. Practical Assessment, Research & Evaluation, Vol. 19, No. 11, August.
Shingala, C. et al. (2015). International Journal of New Technologies in Science and Engineering, Vol. 2, Issue 5, Nov 2015. ISSN 2349-0780.