When most people think of regression analysis, they think of Ordinary Least Squares (OLS) regression. OLS regression is a great tool and has its place, but it’s not the be-all and end-all of regression analyses. In fact, for some situations, you’re better off using quantile regression.
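Quantile regression is a type of regression analysis that estimates conditional quantiles of the response variable, rather than the conditional mean that OLS estimates. The best-known special case is median regression, which estimates the conditional median (the 0.5 quantile). One advantage of median regression over OLS regression is that its estimates are more robust against outliers in the response measurements.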
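To make the idea of a quantile estimate concrete, here is a minimal sketch of the loss function that quantile regression minimizes, often called the check or pinball loss. The function names and the sample data are made up for illustration. It fits only a single constant rather than a full regression line, and it searches only over the sample values, which is enough because a minimizer always lies at one of them.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss for one residual at quantile level tau."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def best_constant(values, tau):
    """Return the sample value that minimizes total pinball loss as a constant fit."""
    def total_loss(c):
        return sum(pinball_loss(v - c, tau) for v in values)
    return min(values, key=total_loss)


sample = list(range(1, 10))  # hypothetical data: 1, 2, ..., 9

print(best_constant(sample, 0.5))   # 5, the median
print(best_constant(sample, 0.25))  # 3, the lower quartile
print(best_constant(sample, 0.75))  # 7, the upper quartile
```

Setting tau to 0.5 weights positive and negative errors equally, which is why that case recovers the median. Other values of tau tilt the weights and recover other quantiles.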
So, when might you want to use quantile regression instead of OLS? Let’s take a look.
Comparing Means and Medians
To understand when you might want to use quantile regression, it helps to understand the difference between means and medians. The mean is what we typically think of when we think of an average – you add up all the values and then divide by the number of values. The median is the value in the middle when you order all the values from smallest to largest.
For example, let’s say you want to find the average age of your five friends: Sarah (22), Mike (24), Emily (26), Jake (28), and David (30). The mean age would be 26 ((22 + 24 + 26 + 28 + 30)/5). The median age would be 26 as well, because 26 is the middle value once the five ages are put in order. In this case, the mean and median are the same value.
But what if one friend was much older or younger than all the others? Let’s say David’s age was changed to 62. Now the mean age is 32.4 ((22 + 24 + 26 + 28 + 62)/5), but the median is still 26, because 26 is still the middle value. As you can see, adding an outlier can have a big impact on the mean but not necessarily on the median.
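The arithmetic is easy to check with Python’s standard statistics module. This is just a quick sketch using the five ages from the example, with the variable names chosen here for illustration.

```python
from statistics import mean, median

ages = [22, 24, 26, 28, 30]
ages_with_outlier = [22, 24, 26, 28, 62]  # one age replaced by 62

# Without the outlier, the mean and the median agree.
print(mean(ages), median(ages))

# With the outlier, the mean moves but the median stays at the middle value.
print(mean(ages_with_outlier), median(ages_with_outlier))
```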
Applications of Median Regression
So why does this matter? Well, let’s say you’re analyzing data on car accidents. You have a dataset with 100 observations on different drivers who were involved in accidents. You’re interested in understanding how different factors – like age, gender, driving experience, etc. – affect accident rates.
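If you were to run an OLS regression on this data, an outlier in the response (like a driver with a far higher accident rate than every other driver in the dataset) could have a big impact on your results, because OLS minimizes squared errors and a single large error can dominate the fit. If you ran a median regression instead, that outlier would have less impact, because median regression minimizes absolute errors and so models the conditional median rather than the conditional mean. Keep in mind that this robustness applies to unusual response values. An unusual predictor value, such as a driver who is much older or much younger than all the others, can still pull both kinds of fit. With that caveat, median regressions can often be more robust against outliers than OLS regressions.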
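Here is a small sketch of that difference on made-up data, using only the standard library. The data, the function names, and the brute-force approach are assumptions chosen for illustration, not how a statistics package would fit these models. The median fit relies on the fact that, with one predictor, some best least-absolute-deviations line passes through two of the data points, so trying every pair is enough for a tiny dataset.

```python
from itertools import combinations


def ols_fit(xs, ys):
    """Closed-form least-squares slope and intercept for one predictor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


def median_fit(xs, ys):
    """Least-absolute-deviations line, found by checking every pair of points."""
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue  # a vertical line cannot be a regression line
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        loss = sum(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, slope, intercept)
    return best[1], best[2]


# Hypothetical data: y = 2x + 1 exactly, then one response replaced by an outlier.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
ys[9] = 100  # the value on the line would be 19

ols_slope, ols_intercept = ols_fit(xs, ys)
med_slope, med_intercept = median_fit(xs, ys)

print("OLS slope:", round(ols_slope, 2))     # pulled well above 2 by the outlier
print("Median slope:", round(med_slope, 2))  # stays at 2
```

A single bad response value more than triples the OLS slope here, while the median fit still recovers the line that the other nine points lie on.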
Quantile Regression: Conclusion
Quantile regressions are a type of regression analysis that can be very useful in certain situations. They estimate conditional quantiles of a response variable, including the conditional median, and median regression in particular is more robust against outliers in the response than OLS regression. If you’re working with data that may contain such outliers, quantile regression is worth considering for your analysis.