What is the Generalized Linear Model?
The generalized linear model (GLZ) is a way to make predictions from sets of data. It takes the idea of a general linear model (for example, a linear regression equation) a step further. A general linear model (GLM) is the type of model you probably came across in elementary statistics. Ordinary least squares regression is one example of a GLM; general linear models also underlie ANOVA and t-tests. The generalized linear model, on the other hand, is more flexible: it lets the response variable follow one of an array of different distributions, not just the normal distribution, to find the “best fit” model. The model is usually fit by maximum likelihood estimation, although Bayesian methods can also be used.
Why is the Generalized Linear Model Needed?
Regular linear regression predicts that a constant change in one variable (x) will lead to a constant change in another variable (y). When the response variable is roughly normally distributed, this type of model works well. Unfortunately, many different types of data do not fit this simple model very well at all. Here’s an example:
Your model predicts that in a certain city, for every one-degree rise in temperature, 100 more people buy ice cream. At 80 degrees, 1,000 people in the city buy ice cream.
This sounds reasonable. If 1,000 people buy ice cream when it’s 80 degrees out, you can certainly see 2,000 people buying ice cream when it’s 90 degrees. But going the other way on the model: when it’s 60 degrees out, -1,000 people buy ice cream. That doesn’t make any sense at all. A more logical model would show an increase in sales above a certain temperature and a decrease below a certain point, without ever predicting negative sales.
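The arithmetic in this example can be checked with a short Python sketch. The function names `linear_sales` and `log_link_sales`, and the growth rate of 0.07 per degree, are assumptions made up for illustration; they are not part of the original example. The only point is that a straight line eventually predicts negative sales, while a curve built on a log link cannot.

```python
import math

def linear_sales(temp_f):
    """Straight-line model from the example: 1,000 buyers at 80 degrees,
    plus 100 buyers for every additional degree."""
    return 1000 + 100 * (temp_f - 80)

def log_link_sales(temp_f, rate=0.07):
    """Hypothetical alternative: log(sales) is linear in temperature,
    so sales = exp(linear predictor) and can never drop below zero.
    The rate of 0.07 per degree is an illustrative assumption."""
    return 1000 * math.exp(rate * (temp_f - 80))

# Compare the two models at a cool, a reference, and a hot temperature.
for temp in (60, 80, 90):
    print(temp, linear_sales(temp), round(log_link_sales(temp)))
```

Both functions agree at 80 degrees, but only the straight-line version produces the impossible value of -1,000 buyers at 60 degrees.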
The Generalized Linear Model and Probability Distributions.
Elements of the Generalized Linear Model.
Three elements make up the generalized linear model:
- A probability distribution from the exponential family (as outlined above), such as the normal, binomial, or Poisson distribution.
- A linear predictor η = Xβ. The linear predictor combines the model’s independent variables (the columns of X) with their coefficients (β).
- A link function, which relates the linear predictor to the expected value of the response. Common examples are the identity link for linear regression, the logit link for logistic regression, and the log link for Poisson regression.
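To make the three elements more concrete, here is a minimal sketch in plain Python. The design matrix `X`, the coefficients `beta`, and the helper names are made-up values for illustration, not fitted results. The sketch computes the linear predictor η = Xβ and then applies the inverse of three common link functions (identity, logit, and log) to turn η into an expected value.

```python
import math

def linear_predictor(X, beta):
    """Compute eta = X * beta, one value per row of X."""
    return [sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) for row in X]

# Inverse link functions map eta back to the expected value of the response.
INVERSE_LINKS = {
    "identity": lambda eta: eta,                    # linear regression
    "logit": lambda eta: 1 / (1 + math.exp(-eta)),  # logistic regression
    "log": lambda eta: math.exp(eta),               # Poisson regression
}

# Hypothetical data: an intercept column plus one predictor.
X = [[1, -2.0], [1, 0.0], [1, 2.0]]
beta = [0.5, 1.0]  # illustrative coefficients, not estimated from data

eta = linear_predictor(X, beta)
for name, inverse_link in INVERSE_LINKS.items():
    print(name, [round(inverse_link(e), 3) for e in eta])
```

The same linear predictor feeds every model here; only the link changes. The logit link keeps expected values between 0 and 1, and the log link keeps them positive, which is why these links suit probabilities and counts respectively.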