Test-Retest Reliability
Test-retest reliability (sometimes called retest reliability) measures the consistency of a test over time. In other words, you give the same test twice to the same people at different times to see whether the scores are the same. For example, test on a Monday, then again the following Monday. The two sets of scores are then correlated.
Bias is a known problem with this type of reliability test, due to:
- Feedback between tests,
- Participants gaining knowledge about the purpose of the test, so they are more prepared the second time around.
This type of reliability study can also take a long time to complete, because the correlation cannot be calculated until the second test has been given. Depending on the interval between the two tests, that could be months or even years.
Calculating Test-Retest Reliability Coefficients
The most common way to quantify agreement between the two tests is to calculate a correlation coefficient for the two sets of scores. Test-retest reliability coefficients (also called coefficients of stability) vary between 0 and 1, where:
- 1: perfect reliability,
- ≥ 0.9 and < 1: excellent reliability,
- ≥ 0.8 and < 0.9: good reliability,
- ≥ 0.7 and < 0.8: acceptable reliability,
- ≥ 0.6 and < 0.7: questionable reliability,
- ≥ 0.5 and < 0.6: poor reliability,
- > 0 and < 0.5: unacceptable reliability,
- 0: no reliability.
On this scale, a coefficient of 0.9 indicates a very high correlation (excellent reliability), while a coefficient of 0.1 indicates a very low one (unacceptable reliability).
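The scale above can be sketched as a small lookup function. This is a minimal illustration in Python; the function name interpret_reliability and the short labels it returns are assumptions made for this example, not part of any standard library.

```python
def interpret_reliability(coefficient: float) -> str:
    """Map a test-retest coefficient to the descriptive label from the scale above.

    The function name and the returned labels are hypothetical,
    chosen only for illustration.
    """
    if not 0.0 <= coefficient <= 1.0:
        raise ValueError("expected a coefficient between 0 and 1")
    if coefficient == 1.0:
        return "perfect"
    if coefficient == 0.0:
        return "none"
    # Thresholds are checked from highest to lowest, so the first match wins.
    thresholds = [
        (0.9, "excellent"),
        (0.8, "good"),
        (0.7, "acceptable"),
        (0.6, "questionable"),
        (0.5, "poor"),
    ]
    for lower_bound, label in thresholds:
        if coefficient >= lower_bound:
            return label
    return "unacceptable"


for value in (0.9, 0.75, 0.1):
    print(value, interpret_reliability(value))
```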
- To measure reliability between two tests, use the Pearson Correlation Coefficient. One disadvantage: it overestimates the true relationship for small samples (under 15).
- If you have more than two tests, use Intraclass Correlation. This can also be used for two tests, and has the advantage that it doesn’t overestimate relationships for small samples. However, it is more challenging to calculate than Pearson’s.
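Both coefficients can be computed by hand, as the rough Python sketch below shows. Several things here are assumptions rather than anything stated above: the function names pearson_r and icc_2_1, the made-up scores, and the choice of intraclass correlation form. Intraclass correlation comes in several variants; this sketch uses the two-way, absolute-agreement, single-measurement form, often written ICC(2,1), which is one common option for test-retest data.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(first)
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    cross_products = sum(
        (a - mean_first) * (b - mean_second) for a, b in zip(first, second)
    )
    spread_first = sum((a - mean_first) ** 2 for a in first)
    spread_second = sum((b - mean_second) ** 2 for b in second)
    return cross_products / sqrt(spread_first * spread_second)


def icc_2_1(scores):
    """ICC(2,1) for a table with one row per person and one column per test occasion.

    Assumed variant: two-way, absolute agreement, single measurement.
    """
    n = len(scores)      # number of people
    k = len(scores[0])   # number of test occasions
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: people (rows), occasions (columns), residual.
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores for six people tested one week apart.
monday = [12, 15, 11, 18, 14, 16]
next_monday = [13, 14, 12, 17, 15, 16]

print("Pearson r:", round(pearson_r(monday, next_monday), 3))
print("ICC(2,1): ", round(icc_2_1([list(p) for p in zip(monday, next_monday)]), 3))
```

One difference worth noting: if every score shifted by the same amount between the two occasions, Pearson's coefficient would still be 1, whereas this absolute-agreement form of the intraclass correlation would drop below 1.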