What is Multinomial Logistic Regression?
Multinomial logistic regression is used when you have a categorical dependent variable with two or more unordered levels (i.e. two or more discrete outcomes). It is practically identical to logistic regression, except that you can have more than two possible outcomes instead of just two.
For example, children’s food choices are influenced by their parents’ choices and the children’s pastimes (e.g. sports enthusiast vs. gamer). You could study the relationship between a child’s food choices and both their parents’ choices and the child’s pastimes. The levels of the dependent variable would be the different food choices (fast food, healthy choices, protein packed, vegan, etc.). Or you might study how workers’ education levels and time on the job affect promotions. The independent variables would be education level and time on the job, and the levels of the dependent variable might be promotion to team-leader roles, sales positions, or management positions.
One level of the dependent variable is chosen as the reference category. This is typically the most frequent category. In the first example above, this might be “fast food”. The probability of being in each of the other categories is compared to the probability of being in the reference category. The model expresses each of these comparisons as log odds: the logarithm of the ratio of the two probabilities.
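To make the link between log odds and probabilities concrete, here is a minimal Python sketch. The function name `category_probabilities` and the log-odds values are made up for illustration; they are not output from any fitted model.

```python
import math

def category_probabilities(log_odds, reference="fast food"):
    """Convert log odds versus the reference category into probabilities.

    log_odds maps each non-reference category to
    ln(P(category) / P(reference)). The reference category itself
    has log odds 0, so its odds are exp(0) = 1.
    """
    odds = {name: math.exp(value) for name, value in log_odds.items()}
    total = 1.0 + sum(odds.values())  # the 1.0 is the reference category
    probs = {name: o / total for name, o in odds.items()}
    probs[reference] = 1.0 / total
    return probs

# Hypothetical log odds for one child, each relative to "fast food".
example = {"healthy": 0.4, "protein packed": -0.3, "vegan": -1.2}
probs = category_probabilities(example)
for name, p in probs.items():
    print(f"{name}: {p:.3f}")
```

A positive log odds means the category is more likely than the reference category for that child, and a negative value means it is less likely.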
Running Multinomial Logistic Regression
This type of regression is usually performed with software. Conceptually, the model behaves like a series of individual binomial logistic regressions, one for each of M – 1 categories, where M is the number of levels of the dependent variable (one calculation for each category, minus the reference category). When M = 2, multinomial logistic regression, ordered logistic regression, and logistic regression are equivalent.
Before the advent of computer software, you would have run these individual regressions and then compared the results. The software takes away that chore and estimates all of the parameters simultaneously, which is more efficient.
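As a rough illustration of simultaneous estimation, the sketch below fits all M – 1 sets of coefficients together by plain gradient ascent on the log-likelihood. It is a teaching sketch under several assumptions, not how any particular statistics package works: the data are invented, there is a single predictor, category 0 is taken as the reference category, and the names `predict_probs`, `log_likelihood` and `fit` are hypothetical.

```python
import math

def predict_probs(coefs, x):
    """Probabilities for every category; category 0 is the reference."""
    # The reference category's linear predictor is fixed at 0.
    scores = [0.0] + [b0 + b1 * x for b0, b1 in coefs]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def log_likelihood(coefs, xs, ys):
    return sum(math.log(predict_probs(coefs, x)[y]) for x, y in zip(xs, ys))

def fit(xs, ys, n_categories, steps=2000, rate=0.1):
    """Estimate all M - 1 (intercept, slope) pairs together."""
    coefs = [[0.0, 0.0] for _ in range(n_categories - 1)]
    n = len(xs)
    for _ in range(steps):
        grads = [[0.0, 0.0] for _ in coefs]
        for x, y in zip(xs, ys):
            probs = predict_probs(coefs, x)
            for k in range(1, n_categories):
                # Observed indicator minus predicted probability.
                residual = (1.0 if y == k else 0.0) - probs[k]
                grads[k - 1][0] += residual
                grads[k - 1][1] += residual * x
        for k, (g0, g1) in enumerate(grads):
            coefs[k][0] += rate * g0 / n
            coefs[k][1] += rate * g1 / n
    return coefs

# Invented data: one standardized predictor and three outcome categories.
xs = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, -1.2, -0.2, 0.8, 0.3, -0.7]
ys = [0, 0, 1, 0, 1, 2, 2, 0, 1, 2, 2, 1]

start = [[0.0, 0.0], [0.0, 0.0]]
coefs = fit(xs, ys, n_categories=3)
print("log-likelihood before:", round(log_likelihood(start, xs, ys), 3))
print("log-likelihood after: ", round(log_likelihood(coefs, xs, ys), 3))
```

With M = 3 categories the fit returns two coefficient pairs, one for each non-reference category. Every update uses the predicted probabilities of all categories at once, which is what distinguishes this from running separate binomial regressions.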
Multinomial logistic regression makes the following assumptions:
- The model is specified correctly, with no extraneous variables.
- Cases are independent.
- There is no multicollinearity between the independent variables.
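One simple way to screen for multicollinearity between two independent variables is to look at their correlation and the corresponding variance inflation factor (VIF). The sketch below assumes exactly two predictors, in which case VIF = 1 / (1 – r²). The data and the function names `pearson_r` and `vif_two_predictors` are hypothetical.

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def vif_two_predictors(a, b):
    """Variance inflation factor when the model has only two predictors."""
    r_squared = pearson_r(a, b) ** 2
    if r_squared >= 1.0:
        return math.inf  # perfectly collinear predictors
    return 1.0 / (1.0 - r_squared)

# Invented data: years of education and years on the job for eight workers.
education = [12, 16, 14, 18, 12, 16, 20, 14]
time_on_job = [10, 4, 7, 3, 12, 6, 2, 8]

print("correlation:", round(pearson_r(education, time_on_job), 3))
print("VIF:", round(vif_two_predictors(education, time_on_job), 3))
```

A VIF close to 1 suggests little overlap between the two predictors. Larger values mean the predictors carry much of the same information, which makes the individual coefficients harder to estimate and interpret.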
Multinomial logistic regression works the same way as other types of regression: you’re looking for a relationship between the independent and dependent variables. The output will give you a set of coefficients for each non-reference category of the dependent variable, with one coefficient per independent variable. The layout of the output varies from one software package to another. UCLA has several excellent resources on interpreting results, for example annotated SPSS output and annotated Stata output.
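A common way to read a coefficient is to exponentiate it, which turns a change in log odds into a multiplicative change in the odds of that category relative to the reference category. Here is a minimal sketch; the coefficient value and the function name `relative_odds` are invented for illustration and do not come from any real output.

```python
import math

def relative_odds(coefficient, change=1.0):
    """Multiplicative change in the odds of a category versus the
    reference category when a predictor increases by `change` units."""
    return math.exp(coefficient * change)

# Hypothetical coefficient for years of education in the
# "management" versus reference-category equation.
education_coef = 0.5
ratio = relative_odds(education_coef)
print(f"One more year of education multiplies the odds by {ratio:.2f}")
```

A ratio above 1 means the category becomes more likely relative to the reference category as the predictor increases, a ratio below 1 means it becomes less likely, and a coefficient of 0 gives a ratio of exactly 1 (no effect).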
Multinomial logistic regression is known by a variety of other names:
- Conditional maximum entropy model,
- Maximum entropy classifier,
- Multiclass logistic regression,
- Multinomial logit,
- Polytomous logistic regression,
- Softmax regression.
Related techniques include:
- Multinomial probit regression: has independent normal error terms.
- Ordinal logistic regression: a better choice if the levels of your dependent variable are ordered (the model will be more parsimonious).
- Discriminant function analysis (multiple groups): for multinomial outcome variables.