Statistics How To

Robust Statistics / Estimation (Robustness)


What are Robust Statistics?

Robust statistics are resistant to outliers. In other words, if your data set contains a few very high or very low values, robust statistics will still be good estimators for population parameters, while non-robust statistics will be poor estimators. For example, the mean is very susceptible to outliers (it’s non-robust), while the median is largely unaffected by outliers (it’s robust).
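To see the difference between the mean and the median in practice, here is a minimal sketch using Python's standard `statistics` module. The sample values are made up for illustration; the point is only to compare how far each statistic moves when a single extreme value is added.

```python
from statistics import mean, median

# Hypothetical sample of five typical values (made up for illustration)
clean = [10, 12, 11, 13, 12]

# The same sample with one extreme value appended
with_outlier = clean + [100]

# How far does each statistic move when the outlier is added?
mean_shift = mean(with_outlier) - mean(clean)
median_shift = median(with_outlier) - median(clean)

print(f"mean:   {mean(clean)} -> {mean(with_outlier)}")
print(f"median: {median(clean)} -> {median(with_outlier)}")
```

In this particular sample the mean more than doubles, while the median does not move at all. With other data the median may shift slightly, but far less than the mean.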

Robust Estimators:

Non-robust Estimators:

Robust statistics are different from robust tests, which are defined as tests that will still work well even if one or more assumptions are altered or violated. For example, Levene’s test for equality of variances still works well even if the assumption of normality is violated.

When You Shouldn’t Rely on Robustness


Robust statistics work on the assumption that your data follows an approximately normal distribution, so you shouldn’t use them for skewed or multimodal distributions. If you use these statistics on a differently shaped distribution, they will give misleading results. That said, they don’t work well for every roughly normal-looking distribution either; one example is a mixture of two normal distributions (called a contaminated distribution).

While robust statistics are resistant to outliers, that resistance is not always an advantage: it also means that the statistics you present give no indication that outliers exist. For example, the median house price where I live is about $250,000. That doesn’t sound too impressive, and you could be forgiven for thinking I must live in a pretty “average” town. However, I live by the river, and while most homes sell for about that price, about 1% of homes are on the river and sell for $2-3 million.
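The house price example can be sketched with made-up numbers. The figures below are assumptions for illustration only: 1,000 hypothetical sales, with 99% at exactly $250,000 and 1% at $2.5 million (the midpoint of the $2-3 million range).

```python
from statistics import mean, median

# Hypothetical town of 1,000 home sales (assumed numbers, not real data):
# 990 ordinary homes at $250,000 and 10 riverfront homes at $2.5 million
prices = [250_000] * 990 + [2_500_000] * 10

typical = median(prices)      # robust: ignores the riverfront homes
average = mean(prices)        # non-robust: pulled upward by them
most_expensive = max(prices)  # reveals that extreme values exist

print(f"median: ${typical:,.0f}")
print(f"mean:   ${average:,.0f}")
print(f"max:    ${most_expensive:,.0f}")
```

Under these assumptions the median stays at $250,000 and says nothing about the riverfront homes, while the mean rises to $272,500. Reporting the median together with the maximum (or another measure of spread) gives a fuller picture than the median alone.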

Robust Statistics / Estimation (Robustness) was last modified: October 17th, 2017 by Andale