You may want to read this article first: What is the Logit Model?
What is Ordered Logistic Regression?
Ordered logistic regression (also called the ordered logit model; it is a cumulative link model with a logit link) is a sub-type of logistic regression where the dependent (Y) variable is an ordered category. It is used when your dependent variable has:
- A meaningful order, and
- More than two categories (or levels).
Examples of suitable variables include:
- Opinion polls (agree/neutral/disagree),
- Socioeconomic status (low/medium/high),
- Scores on a test (excellent/average/poor),
- Product sizes (large/medium/small).
Logistic regression and ordered logistic regression differ in how they calculate probabilities. Where binary logistic regression assigns a probability that the outcome will take on a specific value, ordered logit assigns cumulative probabilities: the probability that the outcome will fall at or below each threshold between adjacent categories.
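The threshold idea can be sketched in a few lines of Python. This is a minimal illustration, not output from any fitted model: the function names, the linear predictor value `eta`, and the two thresholds are all made up, and the parameterization P(Y ≤ k) = logistic(threshold_k − eta) is one common convention that is assumed here.

```python
import math

def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def ordered_logit_probs(eta, thresholds):
    """Return P(Y = k) for each ordered category.

    eta        -- linear predictor for one observation (assumed value)
    thresholds -- increasing cut points between adjacent categories (assumed values)
    """
    # Cumulative probabilities P(Y <= k); the last category always reaches 1.
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        # Each category's probability is the gap between neighboring cumulative values.
        probs.append(c - previous)
        previous = c
    return probs

# Three categories (for example poor/average/excellent) need two thresholds.
probs = ordered_logit_probs(eta=0.5, thresholds=[-1.0, 1.0])
print([round(p, 3) for p in probs])  # roughly [0.182, 0.44, 0.378]
```

The model works with the cumulative values first; the probability of any single category is recovered by subtraction.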
Cautions with the Ordered Logit Model
Using ordered logistic regression is a judgment call, and it may not be the best fit for your data (Menard, 1997). The model, and its results, can be difficult for laypersons to understand. That said, it is usually the best method for analyzing truly ordered data.
When you use this model, you are essentially treating your dependent variable as if its underlying structure is an interval scale or ratio scale. You should consider all options before deciding on an ordered logit model:
- If you have more than 5 categories, consider treating your dependent variable as continuous. You can then use Ordinary Least Squares regression.
- If you are unsure whether your variable is truly ordered (for example, you have manager/assistant manager/head supervisor), you have a couple of options:
  - Ignore the ordinality and use multinomial logistic regression instead. Be aware, though, that if you use a multinomial model for data that is truly ordered, you will estimate more parameters than you need, which increases the risk of missing a statistically significant result.
  - Treat the variable as ordered and use the stereotype logistic model (slogit in Stata) instead.
The fuzzy boundaries between all of these regression analyses are the main reason why deciding to use the ordered logit model is such a subjective, and sometimes challenging, choice.
A variety of ordered logistic models exist. The one you’re most likely to come across, and the one used by most statistical software packages, is the Proportional Odds Model. This model assumes that the coefficients for the predictors are the same at every cut point between categories; only the intercepts (thresholds) differ. If the coefficients are not the same, the data can be handled by a generalized ordered logistic model or partial proportional odds model.
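A short Python sketch shows what that assumption implies. The slope `beta` and the cut points are hypothetical numbers chosen for illustration, and the form logit P(Y ≤ j) = threshold_j − beta · x is an assumed (though common) way of writing the model. Because one slope is shared by every cut point, a one-unit change in x multiplies the cumulative odds by the same factor at each cut point.

```python
import math

def cumulative_odds(x, threshold, beta):
    """Odds of Y <= j versus Y > j when logit P(Y <= j) = threshold - beta * x."""
    return math.exp(threshold - beta * x)

beta = 0.8                      # hypothetical slope shared by every cut point
thresholds = [-1.0, 0.5, 2.0]   # hypothetical cut points for four categories

# Odds ratio for a one-unit increase in x, computed separately at each cut point.
ratios = [
    cumulative_odds(1.0, t, beta) / cumulative_odds(0.0, t, beta)
    for t in thresholds
]
print(ratios)  # every entry is approximately exp(-0.8)
```

If the ratios estimated from real data differed noticeably from one cut point to the next, that would point toward one of the more general models mentioned above.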
Other, less-used, models which can be used for ordered responses include:
- Adjacent category model
- Continuation-ratio model
- Heterogeneous choice model
- Location scale logistic model
- McFadden’s choice model
- Parallel line model
- Rank ordered regression
- Stereotype logistic model
The ordered logit model isn’t usually calculated by hand. Most statistical packages have commands to run the procedure, including:
- Stata: use ologit. Numeric values represent the categories. These can be any numbers, but the higher the number, the higher the category ranks. For example: Poor (1), Acceptable (2), Excellent (3).
- SAS: use PROC LOGISTIC.
- R: use the ordinal package (for example, its clm function).
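The numeric coding described for Stata applies in most tools: before fitting, each category label is mapped to a number that preserves the order. A minimal Python sketch of that step, using made-up ratings and the Poor/Acceptable/Excellent codes from the example above:

```python
# Codes follow the example above: a higher number means a higher category.
CODES = {"Poor": 1, "Acceptable": 2, "Excellent": 3}

# Hypothetical responses, purely for illustration.
ratings = ["Acceptable", "Poor", "Excellent", "Acceptable"]

# Replace each label with its ordered numeric code.
coded = [CODES[r] for r in ratings]
print(coded)  # [2, 1, 3, 2]
```

Only the order of the codes matters to the model; the spacing between the numbers carries no meaning.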
Menard, S. (1997). Applied Logistic Regression Analysis. SAGE.