Linear Discriminant Analysis: Simple Definition

What is Linear Discriminant Analysis?

In statistics, pattern recognition and machine learning, linear discriminant analysis (LDA), also called canonical variate analysis (CVA), is a way to study differences between groups of objects. This classification method uses a linear combination of features to characterize classes. More specifically, the scores that separate the objects of one class from those of another are expressed as “linear combinations of the explanatory variables that optimally separate the a priori* defined groups (classes)” (Šmilauer & Lepš, 2014).
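To make the idea of a "linear combination of features" concrete, here is a minimal sketch of a two-class, two-feature discriminant in plain Python. It assumes the classic Fisher formulation, in which the weights solve S_w w = m1 - m0 (S_w is the pooled within-class scatter matrix, m0 and m1 are the class means). The function names and the toy data are made up for illustration and do not come from the sources cited here.

```python
def mean_vector(rows):
    """Column-wise mean of a list of (x1, x2) pairs."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)


def scatter(rows, m):
    """Entries (a, b, c) of the symmetric 2x2 scatter matrix [[a, b], [b, c]]."""
    a = sum((r[0] - m[0]) ** 2 for r in rows)
    b = sum((r[0] - m[0]) * (r[1] - m[1]) for r in rows)
    c = sum((r[1] - m[1]) ** 2 for r in rows)
    return a, b, c


def fisher_weights(class0, class1):
    """Weights w solving S_w w = m1 - m0 for two classes and two features."""
    m0, m1 = mean_vector(class0), mean_vector(class1)
    a0, b0, c0 = scatter(class0, m0)
    a1, b1, c1 = scatter(class1, m1)
    a, b, c = a0 + a1, b0 + b1, c0 + c1   # pooled within-class scatter S_w
    det = a * c - b * b                   # must be nonzero for a solution
    d0, d1 = m1[0] - m0[0], m1[1] - m0[1]
    # Apply the explicit 2x2 inverse of S_w to the mean difference (d0, d1).
    return ((c * d0 - b * d1) / det, (a * d1 - b * d0) / det)


def score(w, x):
    """Discriminant score: the linear combination w1*x1 + w2*x2."""
    return w[0] * x[0] + w[1] * x[1]


# Toy data (illustrative only): two predefined groups, two features each.
group_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0)]
group_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 7.0)]

w = fisher_weights(group_a, group_b)
scores_a = [score(w, x) for x in group_a]
scores_b = [score(w, x) for x in group_b]
```

On this toy data every score in group_b is larger than every score in group_a, so the single number produced by the linear combination is enough to tell the two groups apart.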

LDA was developed in 1936 by R. A. Fisher. It is both simple and robust, and the models it produces are often as good as those produced by more complicated algorithms.

LDA is similar to logistic regression and probit regression and, to some degree, to analysis of variance (ANOVA). Although it has the term “linear” in its name, it can be extended to the analysis of nonlinear systems by using nonlinear spline basis functions (Decker & Lenz, 2007).

Objectives of Linear Discriminant Analysis

LDA has two broad objectives (Elston et al., 2002):

Prediction: find a rule that allows objects to be sorted into predefined classes.
Analysis: build a model that helps the user discover patterns and order in the data.
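The prediction objective can be sketched as a simple rule: project a new object onto the discriminant weights and assign it to the class whose projected mean is closer. The sketch below assumes two classes with equal prior probabilities. The weights and class means are hypothetical values chosen for illustration, not estimates from real data, and `make_rule` is an illustrative name.

```python
def make_rule(w, mean0, mean1):
    """Build a two-class prediction rule from weights and class means."""
    proj0 = sum(wi * mi for wi, mi in zip(w, mean0))
    proj1 = sum(wi * mi for wi, mi in zip(w, mean1))
    threshold = (proj0 + proj1) / 2.0   # midpoint of the projected class means

    def predict(x):
        s = sum(wi * xi for wi, xi in zip(w, x))
        # Assign the object to the class whose projected mean is nearer.
        return 1 if abs(s - proj1) < abs(s - proj0) else 0

    return predict, threshold


# Hypothetical weights and class means (assumed, for illustration only).
weights = (1.2, 0.1)
predict, threshold = make_rule(weights, mean0=(2.0, 2.7), mean1=(7.0, 6.7))

label_low = predict((3.0, 3.0))    # score 3.9, nearer to the class 0 mean
label_high = predict((6.0, 6.0))   # score 7.8, nearer to the class 1 mean
```

For the analysis objective, the weights themselves are the object of interest: when the features are measured on comparable scales, a larger weight suggests that the feature contributes more to separating the classes.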

Time Series Analysis

When LDA is used to analyze business cycles, it’s important to note that the technique ignores the underlying chronological order of the time series (Decker & Lenz, 2007).
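One way to see why chronological order plays no role: the quantities a discriminant rule is built from, such as class means, are sums over observations, so reordering the observations leaves the rule unchanged. The sketch below uses a simplified one-feature stand-in for LDA (the midpoint between two class means, assuming equal priors). The indicator values and phase labels are invented for illustration.

```python
import math
import random


def midpoint_threshold(values, labels):
    """One-feature rule: the midpoint between the two class means."""
    class0 = [v for v, lab in zip(values, labels) if lab == 0]
    class1 = [v for v, lab in zip(values, labels) if lab == 1]
    return (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2.0


# Hypothetical quarterly indicator in time order; 1 = upswing, 0 = downswing.
indicator = [2.1, 2.4, 1.9, -0.5, -1.2, -0.8, 1.5, 2.2]
phase = [1, 1, 1, 0, 0, 0, 1, 1]

in_order = midpoint_threshold(indicator, phase)

# Shuffle the (value, label) pairs together, destroying the time order.
pairs = list(zip(indicator, phase))
random.Random(0).shuffle(pairs)
shuffled = midpoint_threshold([v for v, _ in pairs], [lab for _, lab in pairs])

# The two thresholds agree up to floating-point rounding.
same_rule = math.isclose(in_order, shuffled, rel_tol=1e-12)
```

Any information carried by the sequence itself, such as how long a phase has lasted, is therefore invisible to the method unless it is encoded explicitly as a feature.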

Note

*A priori is “relating to what can be known through an understanding of how certain things work [i.e. a hypothesis] rather than by observation” ~ Merriam-Webster.

References

Decker, R. & Lenz, H. (2007). Advances in Data Analysis: Proceedings of the 30th Annual Conference of the Gesellschaft für Klassifikation e.V., Freie Universität Berlin, March 8-10, 2006. Springer Science & Business Media.
Elston et al. (Eds.) (2002). Biostatistical Genetics and Genetic Epidemiology. John Wiley & Sons.
Šmilauer, P. & Lepš, J. (2014). Multivariate Analysis of Ecological Data using CANOCO 5. Cambridge University Press.