
Garson, G. D. (2012). Discriminant Function Analysis. Asheboro, NC: Statistical Associates Publishers.

Instant availability without passwords in Kindle format on Amazon.
Delayed availability with passwords in free PDF format.
ASIN number (e-book counterpart to ISBN): ASIN: B0095JEBIO
© 2012 by G. David Garson and Statistical Associates Publishers. Worldwide rights reserved in all languages and on all media. Permission is not granted to copy, distribute, or post e-books or passwords.



Discriminant function analysis, also known as discriminant analysis or simply DA, is used to classify cases into the values of a categorical dependent, usually a dichotomy. If discriminant function analysis is effective for a set of data, the classification table of correct and incorrect estimates will yield a high percentage correct. Discriminant function analysis is found in SPSS under Analyze>Classify>Discriminant. If the specified grouping variable has two categories, the procedure is considered "discriminant analysis" (DA). If there are more than two categories the procedure is considered "multiple discriminant analysis" (MDA).
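
As a rough illustration of the classification table described above, the following Python sketch fits a two-group linear discriminant function and tabulates correct and incorrect classifications. It is not the SPSS procedure: the data are made up, equal prior probabilities are assumed, only two independent variables are handled, and the function names (`fit_two_group_da`, `classify`) are hypothetical.

```python
# Minimal two-group linear discriminant analysis in pure Python.
# All data are made up for illustration; equal priors are assumed.

def mean_vector(rows):
    """Column means of a list of (x1, x2) cases."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def sscp(rows, center):
    """2x2 sums of squares and cross-products of deviations from center."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - center[0], r[1] - center[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_two_group_da(group0, group1):
    """Return (weights, cutoff) for a linear discriminant function."""
    m0, m1 = mean_vector(group0), mean_vector(group1)
    s0, s1 = sscp(group0, m0), sscp(group1, m1)
    df = len(group0) + len(group1) - 2
    # Pooled within-group covariance matrix and its 2x2 inverse.
    pooled = [[(s0[i][j] + s1[i][j]) / df for j in range(2)] for i in range(2)]
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    inv = [[pooled[1][1] / det, -pooled[0][1] / det],
           [-pooled[1][0] / det, pooled[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    weights = [inv[i][0] * diff[0] + inv[i][1] * diff[1] for i in range(2)]
    # Cutoff: discriminant score at the midpoint between the group centroids.
    cutoff = sum(weights[j] * (m0[j] + m1[j]) / 2 for j in range(2))
    return weights, cutoff


def classify(case, weights, cutoff):
    """Assign a case to group 1 if its score exceeds the cutoff, else group 0."""
    score = weights[0] * case[0] + weights[1] * case[1]
    return 1 if score > cutoff else 0


group0 = [(1, 2), (2, 1), (2, 3), (3, 2), (6, 6)]
group1 = [(6, 7), (7, 6), (7, 8), (8, 7), (3, 3)]

weights, cutoff = fit_two_group_da(group0, group1)

# Classification table: rows are actual groups, columns are predicted groups.
table = [[0, 0], [0, 0]]
for actual, cases in enumerate((group0, group1)):
    for case in cases:
        table[actual][classify(case, weights, cutoff)] += 1

total = len(group0) + len(group1)
percent_correct = 100.0 * (table[0][0] + table[1][1]) / total
print(table, percent_correct)  # [[4, 1], [1, 4]] 80.0
```

Because the same cases are used both to estimate the function and to evaluate it, the percentage correct in a sketch like this will tend to be optimistic.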

Multiple discriminant analysis (MDA) is a cousin of multivariate analysis of variance (MANOVA), sharing many of the same assumptions and tests. MDA is sometimes also called discriminant factor analysis or canonical discriminant analysis.

While binary and multinomial logistic regression, treated in a separate Statistical Associates "Blue Book" volume, are often used in place of DA and MDA respectively, discriminant analysis has greater power than logistic regression when its assumptions are met: there is less chance of Type II error (failing to reject a false null hypothesis). If the data violate the assumptions of discriminant analysis, outlined below, then logistic regression may be preferred because it usually involves fewer violations of assumptions (independent variables needn't be normally distributed, linearly related, or have equal within-group variances), is robust, handles categorical as well as continuous variables, and has coefficients which many find easier to interpret. In particular, logistic regression is preferred when data are not normal in distribution or group sizes are very unequal.

There are several purposes for DA and/or MDA:

    To classify cases into groups using a discriminant prediction equation.
    To test theory by observing whether cases are classified as predicted.
    To investigate differences between or among groups.
    To determine the most parsimonious way to distinguish among groups.
    To determine the percent of variance in the dependent variable explained by the independents.
    To determine the percent of variance in the dependent variable explained by the independents over and above the variance accounted for by control variables, using sequential discriminant analysis.
    To assess the relative importance of the independent variables in classifying the dependent variable.
    To discard variables which are little related to group distinctions.
    To infer the meaning of MDA dimensions which distinguish groups, based on discriminant loadings. 

Discriminant analysis has two basic steps: (1) an F test (Wilks' lambda) is used to test whether the discriminant model as a whole is significant, and (2) if the F test shows significance, the individual independent variables are assessed to see which differ significantly in mean by group, and these are used to classify the dependent variable.
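
The first step can be sketched in Python as follows. The example computes Wilks' lambda as the ratio of the determinant of the within-groups sums-of-squares-and-cross-products (SSCP) matrix to the determinant of the total SSCP matrix, then converts it to an F statistic using the standard two-group formula. The data are made up, the helper names (`sscp`, `det2`) are hypothetical, and the code handles only two groups and two independent variables.

```python
# Wilks' lambda and its F statistic for two groups and two variables.
# Pure Python; all data are made up for illustration.

def mean_vector(rows):
    """Column means of a list of (x1, x2) cases."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def sscp(rows, center):
    """2x2 sums of squares and cross-products of deviations from center."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - center[0], r[1] - center[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


group0 = [(1, 2), (2, 1), (2, 3), (3, 2), (6, 6)]
group1 = [(6, 7), (7, 6), (7, 8), (8, 7), (3, 3)]
all_cases = group0 + group1
n_total = len(all_cases)
n_vars = 2

# Within-groups SSCP: deviations of each case from its own group mean.
w0 = sscp(group0, mean_vector(group0))
w1 = sscp(group1, mean_vector(group1))
within = [[w0[i][j] + w1[i][j] for j in range(2)] for i in range(2)]

# Total SSCP: deviations of every case from the grand mean.
total = sscp(all_cases, mean_vector(all_cases))

# Lambda near 0 means the groups are well separated; near 1 means they are not.
wilks_lambda = det2(within) / det2(total)

# For two groups the conversion to F is exact,
# with (n_vars, n_total - n_vars - 1) degrees of freedom.
f_stat = ((1 - wilks_lambda) / wilks_lambda) * (n_total - n_vars - 1) / n_vars
df = (n_vars, n_total - n_vars - 1)

print(round(wilks_lambda, 4), round(f_stat, 4), df)
```

The F statistic would then be compared with the F distribution on the stated degrees of freedom. With more than two groups the conversion from lambda to F is generally an approximation and uses a different formula.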

Discriminant analysis shares all the usual assumptions of correlation, requiring linear and homoscedastic relationships and untruncated interval or near-interval data. Like multiple regression and most statistical procedures, DA also assumes proper model specification (inclusion of all important independents and exclusion of causally extraneous but correlated variables). DA also assumes the dependent variable is a true dichotomy, since data which are forced into dichotomous coding are truncated, attenuating correlation.
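
The attenuation caused by forced dichotomization can be illustrated with a small simulation. This is only a sketch on simulated data: the sample size, the seed, the coefficients 0.6 and 0.8, and the cut point at zero are all arbitrary assumptions chosen for illustration.

```python
# Simulated illustration: forcing a continuous variable into a dichotomy
# lowers its correlation with a related variable.
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(42)  # fixed seed so the run is repeatable

# x and y are simulated so that their population correlation is 0.6.
x = [rng.gauss(0, 1) for _ in range(5000)]
y = [0.6 * xi + 0.8 * rng.gauss(0, 1) for xi in x]

# Force y into a dichotomy by cutting at zero.
y_forced = [1 if yi > 0 else 0 for yi in y]

r_continuous = pearson_r(x, y)
r_dichotomized = pearson_r(x, y_forced)
print(r_continuous, r_dichotomized)
```

For bivariate normal data cut at the median, the correlation is expected to shrink to roughly 0.80 of its original value, so the second printed correlation should be noticeably smaller than the first.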

The full content is now available from Statistical Associates Publishers.

Below is the unformatted table of contents.


Table of Contents

Overview
Key Terms and Concepts
Variables
Discriminant functions
Pairwise group comparisons
Output statistics
Examples
SPSS user interface
The "Statistics" button
The "Classify" button
The "Save" button
The "Bootstrap" button
The "Method" button
SPSS Statistical output for two-group DA
The "Analysis Case Processing Summary" table
The "Group Statistics" table
The "Tests of Equality of Group Means" table
The "Pooled Within-Group Matrices" and "Covariance Matrices" tables
The "Box's Test of Equality of Covariance Matrices" tables
The "Eigenvalues" table
The "Wilks' Lambda" table
The "Standardized Canonical Discriminant Function Coefficients" table
The "Structure Matrix" table
The "Canonical Discriminant Functions Coefficients" table
The "Functions at Group Centroids" table
The "Classification Processing Summary" table
The "Prior Probabilities for Groups" table
The "Classification Function Coefficients" table
The "Casewise Statistics" table
Separate-groups graphs of canonical discriminant functions
The "Classification Results" table
SPSS Statistical output for three-group MDA
Overview and example
MDA and DA similarities
The "Eigenvalues" table
The "Wilks' Lambda" table
The "Structure Matrix" table
The "Territorial Map"
Combined-groups plot
Separate-groups plots
SPSS Statistical output for stepwise discriminant analysis
Overview
Example
Stepwise discriminant analysis in SPSS
Assumptions
Proper specification
True categorical dependent variables
Independence
No lopsided splits
Adequate sample size
Interval data
Variance
Random error
Homogeneity of variances (homoscedasticity)
Homogeneity of covariances/correlations
Absence of perfect multicollinearity
Low multicollinearity of the independents
Linearity
Additivity
Multivariate normality
Frequently Asked Questions
Isn't discriminant analysis the same as cluster analysis?
When does the discriminant function have no constant term?
How important is it that the assumptions of homogeneity of variances and of multivariate normal distribution be met?
In DA, how can you assess the relative importance of the discriminating variables?
Dummy variables
In DA, how can you assess the importance of a set of discriminating variables over and above a set of control variables? (What is sequential discriminant analysis?)
What is the maximum likelihood estimation method in discriminant analysis (logistic discriminant function analysis)?
What are Fisher's linear discriminant functions?
I have heard DA is related to MANCOVA. How so?
How does MDA work?
How can I tell if MDA worked?
For any given MDA example, how many discriminant functions will there be, and how can I tell if each is significant?
What are Mahalanobis distances?
How are the multiple discriminant scores on a single case interpreted in MDA?
Likewise in MDA, there are multiple standardized discriminant coefficients - one set for each discriminant function. In dichotomous DA, the ratio of the standardized discriminant coefficients is the ratio of the importance of the independent variables. But how are the multiple sets of standardized coefficients interpreted in MDA?
Are the multiple discriminant functions the same as factors in principal-components factor analysis?
What is the syntax for discriminant analysis in SPSS?
Bibliography
Page count: 52