Statistical Associates Publishers

## Canonical Correlation: 10 Worst Pitfalls and Mistakes

1. Inappropriate observed variables.
The dependent and covariate sets of measured variables should each contain variables that intercorrelate. Correlating arbitrarily composed sets will yield arbitrary results.

2. Interpreting only the first canonical correlation.
The dependent and covariate sets of variables may be related along more than one significant dimension. The canonical correlation for the first dimension is always the largest and might be the only significant one, but it is quite possible that more than one dimension will be significant.
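To make the point about multiple dimensions concrete, here is a minimal sketch that computes every canonical correlation, not just the first. It assumes NumPy, simulated data with two shared latent factors, and full-column-rank sets; the function name `canonical_correlations` and the QR-plus-SVD approach are illustrative choices, not a method prescribed by the text.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Return all canonical correlations between two variable sets, largest first.

    Assumes X and Y have full column rank (illustrative helper, not a library API).
    """
    Xc = X - X.mean(axis=0)            # center each set
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)           # orthonormal basis for each set
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)        # guard against rounding slightly above 1

# Simulated data (assumption): two latent factors shared by both sets.
rng = np.random.default_rng(0)
n = 500
f1, f2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([f1 + 0.5 * rng.normal(size=n),
                     f2 + 0.5 * rng.normal(size=n),
                     rng.normal(size=n)])
Y = np.column_stack([f1 + 0.5 * rng.normal(size=n),
                     f2 + 0.5 * rng.normal(size=n)])

r = canonical_correlations(X, Y)
print(r)  # one value per dimension; here both dimensions are substantial
```

The number of canonical correlations equals the size of the smaller set, so each of them deserves inspection before the analysis is reduced to the first dimension.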

3. Having only one significant variable in a set.
Canonical correlation is intended for many-to-many relationships. If a set has only one significant measured variable, canonical correlation is not appropriate.

4. Violating linearity.
Canonical correlation is a member of the general linear model family and assumes linear relationships. However, nonlinear canonical correlation is available.

5. Treating ordinal data as interval in level.
This is the same violation as is common in multiple linear regression. Nominal and ordinal variables are often best treated using nonlinear canonical correlation.

6. Failing to undertake redundancy analysis.
The redundancy coefficient, which measures the percent of variance in one set of measured variables that may be predicted by the canonical variable of the other set, should be reported along with the canonical correlation. This is recommended because the canonical variates can correlate highly even when neither variate extracts a substantial proportion of variance from its own set of original variables. Redundancy analysis assesses the magnitude of relationships.
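The following sketch illustrates how a high canonical correlation can coexist with low redundancy. It assumes NumPy and simulated data in which only one of six variables in the second set is related to the first set; the helper names (`cca_scores`, `structure_coefficients`, `redundancy`) are hypothetical, and redundancy is computed here as the mean squared structure coefficient of a set on its own variate multiplied by the squared canonical correlation.

```python
import numpy as np

def cca_scores(X, Y):
    """Canonical variate scores for both sets plus the canonical correlations."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    return Qx @ U, Qy @ Vt.T, s

def structure_coefficients(Z, scores):
    """Correlations between each original variable and each canonical variate."""
    Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    Ss = (scores - scores.mean(axis=0)) / scores.std(axis=0)
    return Zs.T @ Ss / Z.shape[0]

def redundancy(Z, own_scores, r):
    """Share of variance in set Z predictable from the other set's variates."""
    loadings = structure_coefficients(Z, own_scores)
    return (loadings ** 2).mean(axis=0) * r ** 2

# Simulated data (assumption): one shared factor, five unrelated Y variables.
rng = np.random.default_rng(0)
n = 1000
f = rng.normal(size=n)
X = np.column_stack([f + 0.1 * rng.normal(size=n), rng.normal(size=n)])
Y = np.column_stack([f + 0.1 * rng.normal(size=n)]
                    + [rng.normal(size=n) for _ in range(5)])

x_scores, y_scores, r = cca_scores(X, Y)
red_y = redundancy(Y, y_scores, r)
print("canonical correlations:", r)
print("redundancy of Y given X:", red_y)
```

In this setup the first canonical correlation is close to 1, yet the first dimension accounts for only a small fraction of the variance in the six Y variables, which is the situation the redundancy coefficient is meant to expose.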

7. Using canonical weights to interpret and label canonical dimensions.
Canonical structure coefficients should be used along with canonical weights. Of the two, the structure coefficients are the primary basis for imputing labels to dimensions.
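As a rough illustration of the difference, the sketch below computes both standardized canonical weights and structure coefficients for the first canonical variate of one set. It assumes NumPy and simulated data in which two variables in the set measure the same latent factor; all variable names are illustrative.

```python
import numpy as np

# Simulated data (assumption): x1 and x2 both measure factor f, x3 is noise.
rng = np.random.default_rng(1)
n = 1000
f = rng.normal(size=n)
X = np.column_stack([f + 0.3 * rng.normal(size=n),
                     f + 0.3 * rng.normal(size=n),
                     rng.normal(size=n)])
Y = np.column_stack([f + 0.3 * rng.normal(size=n), rng.normal(size=n)])

Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)
Qx, Rx = np.linalg.qr(Xc)
Qy, _ = np.linalg.qr(Yc)
U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)

# First canonical variate of the X set, scaled to unit variance.
x_variate = np.sqrt(n) * (Qx @ U[:, 0])

# Canonical weights: coefficients that rebuild the variate from centered X.
raw_weights = np.sqrt(n) * np.linalg.solve(Rx, U[:, 0])
std_weights = raw_weights * X.std(axis=0)   # weights for standardized variables

# Structure coefficients: plain correlations of each variable with the variate.
structure = np.array([np.corrcoef(X[:, j], x_variate)[0, 1]
                      for j in range(X.shape[1])])

print("standardized weights:  ", std_weights)
print("structure coefficients:", structure)
```

Because x1 and x2 overlap, the weights divide their shared contribution between them, while the structure coefficients show that each is strongly related to the dimension. The sign of a canonical variate is arbitrary, so only magnitudes and relative signs should be interpreted.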

8. Not assessing the model using the canonical variate adequacy coefficient.
The canonical variate adequacy coefficient is the average of all the squared structure coefficients for one set of variables with respect to a given canonical variable. It is a measure of how well a given canonical variable represents the original variance in that set of original variables.
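Written as a formula, with notation introduced here for illustration only ($p$ variables in the set, $s_{ij}$ the structure coefficient of variable $i$ on canonical variable $j$), the definition above reads:

$$
\mathrm{Ad}_j = \frac{1}{p} \sum_{i=1}^{p} s_{ij}^{2}
$$

For a hypothetical set of three variables with structure coefficients 0.9, 0.8, and 0.1 on a given canonical variable, the adequacy coefficient would be

$$
\mathrm{Ad}_j = \frac{0.81 + 0.64 + 0.01}{3} = \frac{1.46}{3} \approx 0.49,
$$

meaning the canonical variable reproduces roughly half of the variance in its own set.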