What is a General Linear Model?
The General Linear Model (GLM) is a framework for describing how one or more predictor variables affect one or more continuous dependent variables. In its simplest form, the GLM is described as:
Data = Model + Error (Rutherford, 2001, p.3)
GLM is the foundation for several statistical tests, including ANOVA, ANCOVA and regression analysis. Despite their differences, each fits the definition of Data = Model + Error:
- In ANOVA, the “data” is the dependent variable scores, the “model” is the experimental conditions, and the “error” is the part of the data not explained by the model.
- In regression analysis, the independent predictors make up the “model” and the residuals are the “error” component.
- ANCOVA is a blend of ANOVA and regression and so can also be described as Data = Model + Error.
ANCOVA is the “typical” GLM and uses at least one numerical predictor and one qualitative predictor; some people use the terms “GLM” and “ANCOVA” interchangeably.
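The Data = Model + Error split described above can be illustrated for the ANOVA case. The sketch below is only an illustration: the scores and condition names are made up, and it assumes the simplest one-way layout, in which the model's prediction for each score is the mean of its condition.

```python
from statistics import mean

# Hypothetical scores for three experimental conditions (made-up data).
scores = {
    "control": [4.0, 5.0, 6.0],
    "drug_a": [7.0, 8.0, 9.0],
    "drug_b": [5.0, 7.0, 9.0],
}

# Model: each score is predicted by the mean of its condition.
group_means = {cond: mean(vals) for cond, vals in scores.items()}

# Error: whatever is left of each score after subtracting the prediction.
rows = []
for cond, vals in scores.items():
    for y in vals:
        model = group_means[cond]
        error = y - model
        rows.append((cond, y, model, error))

# Every row satisfies Data = Model + Error.
for cond, y, model, error in rows:
    print(f"{cond:8s} data {y:4.1f} = model {model:4.1f} + error {error:+4.1f}")
```

Within each condition the errors sum to zero, which is what it means for the condition mean to be the least-squares prediction.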
Identical Procedures
Formulas
The formula for the general linear model is:
$$Y = \beta_0 + \beta_1 X + \varepsilon$$
Where:
- Y = the dependent variable (also called the predicted, outcome, or response variable).
- β0 = the intercept, which is always a constant (i.e. the value never changes within the model).
- β1 = a weight or slope (also called a coefficient). It determines how much the variable X contributes to the model: with everything else in the equation held constant, β1 gives the predicted change in Y for a unit change in X.
- X = a predictor variable.
- ε = the error, i.e. the part of Y that the model does not explain.
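As a minimal sketch of how the intercept β0 and the slope β1 might be estimated for a single predictor, the example below uses ordinary least squares on made-up data. The variable names (`beta0`, `beta1`, `predict`) are illustrative choices, and least squares is assumed as the fitting method since the text does not name one.

```python
from statistics import mean

# Made-up data in which Y is roughly 1 + 2 * X.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]

x_bar, y_bar = mean(x), mean(y)

# Least-squares slope: co-variation of X and Y divided by variation of X.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
beta1 = sxy / sxx

# The intercept makes the fitted line pass through the point of means.
beta0 = y_bar - beta1 * x_bar


def predict(x_new):
    """Model part of Data = Model + Error for one value of X."""
    return beta0 + beta1 * x_new


# Error part: observed Y minus the model's prediction.
residuals = [yi - predict(xi) for xi, yi in zip(x, y)]

print(f"beta0 = {beta0:.2f}, beta1 = {beta1:.2f}")
```

Raising X by one unit changes the prediction by exactly `beta1`, which is the slope interpretation given in the list above.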
If this looks similar to the regression equation, that’s because they are one and the same. However, the key word in general linear model is general; the procedure can handle a wide variety of variables, including non-numerical (categorical) ones. During the procedure, the GLM converts each non-numerical variable to numbers (for example, 0/1 indicator codes) before any calculations are made.
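One common way to turn a non-numerical variable into a number is dummy (indicator) coding. The sketch below assumes that scheme, which the text does not specify, and uses made-up data with a hypothetical two-level `groups` variable. With this coding, the fitted intercept is the mean of the reference group and the slope is the difference between the two group means.

```python
from statistics import mean

# Made-up outcome scores and a two-level categorical predictor.
groups = ["control", "control", "control",
          "treatment", "treatment", "treatment"]
y = [10.0, 12.0, 11.0, 15.0, 14.0, 16.0]

# Dummy coding: the reference level ("control") becomes 0, the other level 1.
x = [1.0 if g == "treatment" else 0.0 for g in groups]

# Ordinary least-squares fit of Y = beta0 + beta1 * X on the coded variable.
x_bar, y_bar = mean(x), mean(y)
beta1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
beta0 = y_bar - beta1 * x_bar

control_mean = mean(yi for yi, g in zip(y, groups) if g == "control")
treatment_mean = mean(yi for yi, g in zip(y, groups) if g == "treatment")

print(f"beta0 = {beta0:.1f} (control mean = {control_mean:.1f})")
print(f"beta1 = {beta1:.1f} (difference in means = "
      f"{treatment_mean - control_mean:.1f})")
```

This is why a comparison of two group means and a regression on a 0/1 predictor give the same answer: both are the same linear model.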
When the variables are standardized to a mean of zero and a standard deviation of 1 (i.e. converted to z-scores) before fitting, the resulting βs (pronounced “betas”) are called beta weights. Otherwise, the coefficients are usually called Bs (as in the letter B in the English alphabet). With standardized variables the intercept is zero, so the GLM equation with standardized βs is:
$$z_Y = \beta_1 z_X + \varepsilon$$
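The relationship between unstandardized Bs and beta weights can be checked numerically. The following sketch uses made-up data and hypothetical helper names (`fit_line`, `z_scores`). It fits the same single-predictor model twice, once on the raw values and once on z-scores, and assumes ordinary least squares as the fitting method.

```python
from statistics import mean, stdev

# Made-up raw data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line for ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
             / sum((a - x_bar) ** 2 for a in xs))
    return y_bar - slope * x_bar, slope


def z_scores(values):
    """Rescale values to mean 0 and (sample) standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Unstandardized coefficients (the "Bs").
b0, b1 = fit_line(x, y)

# Standardized coefficients (beta weights): fit on z-scores instead.
beta0_std, beta1_std = fit_line(z_scores(x), z_scores(y))

print(f"B1 = {b1:.3f}, beta weight = {beta1_std:.3f}")
```

In this single-predictor case the standardized intercept is zero, and the beta weight equals the unstandardized slope multiplied by the ratio of the standard deviations of X and Y.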
Emergence of the GLMM
Although many software packages still refer to certain procedures as “GLM”, the concept of a general linear model is seen by some as somewhat dated. It’s well recognized that the models can have non-linear components. There’s even some debate about the “general” part:
Calling it “general” seems quaint. It is certainly misleading ~ Stroup (2016).
Stroup prefers the term generalized linear mixed model (GLMM), of which the GLM is a special case. GLMMs combine generalized linear models with mixed models, which allow random effects (the GLM allows only fixed effects). However, the GLMM is a relatively new approach:
GLMMs are still part of the statistical frontier, and not all of the answers about how to use them are known (even by experts) ~ Bolker.
Next: The Generalized Linear Model (GLZ)
Note: Don’t confuse the general linear model with the Generalized Linear Model (GLZ). GLZ is an extension of GLM that uses a link function to handle dependent variables whose errors are not normally distributed, such as binary outcomes or counts.
References
Bolker, B. (2017). Draft PDF posted on website: http://ms.mcmaster.ca/~bolker/classes/s4c03/notes/GLMM_Bolker_draft5.pdf
Rutherford, A. (2001). Introducing ANOVA and ANCOVA: A GLM Approach. SAGE.
Stroup, W. (2016). Generalized Linear Mixed Models: Modern Concepts, Methods and Applications. CRC Press.