Factor analysis is used to uncover the latent structure (dimensions) of a set of variables. It reduces attribute space from a larger number of variables to a smaller number of factors and as such is a "non-dependent" procedure (that is, it does not assume a dependent variable is specified). Factor analysis can be used for any of the following purposes:
• To reduce a large number of variables to a smaller number of factors for modeling purposes, where the large number of variables precludes modeling all the measures individually. As such, factor analysis is integrated in structural equation modeling (SEM), helping confirm the latent variables modeled by SEM. However, factor analysis can be and is often used on a stand-alone basis for similar purposes.
• To establish that multiple tests measure the same factor, thereby giving justification for administering fewer tests. Factor analysis originated a century ago with Charles Spearman's attempts to show that a wide variety of mental tests could be explained by a single underlying intelligence factor (a notion now rejected, by the way).
• To validate a scale or index by demonstrating that its constituent items load on the same factor, and to drop proposed scale items which cross-load on more than one factor.
• To select a subset of variables from a larger set, based on which original variables have the highest correlations with the principal component factors.
• To create a set of factors to be treated as uncorrelated variables as one approach to handling multicollinearity in such procedures as multiple regression.
• To identify clusters of cases and/or outliers.
• To determine network groups by determining which sets of people cluster together (using Q-mode factor analysis, discussed below).
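The multicollinearity purpose in the list above can be illustrated with a minimal sketch. It assumes two hypothetical predictors, `income` and `spending`, which are simulated here and not drawn from any real data set. Rather than running a full factor extraction, it uses the closed-form principal components of two standardized variables (their scaled sum and scaled difference). The only point is that the component scores are uncorrelated even though the original predictors are highly correlated.

```python
import random


def standardize(values):
    """Return z-scores (mean 0, population standard deviation 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]


def pearson(x, y):
    """Pearson correlation computed as the mean product of z-scores."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


# Hypothetical, simulated predictors that are strongly collinear.
rng = random.Random(42)
income = [rng.gauss(50, 10) for _ in range(500)]
spending = [0.9 * v + rng.gauss(0, 3) for v in income]

z1, z2 = standardize(income), standardize(spending)

# For two standardized variables, the principal components are the
# scaled sum and the scaled difference of the z-scores.
pc1 = [(a + b) / 2 ** 0.5 for a, b in zip(z1, z2)]
pc2 = [(a - b) / 2 ** 0.5 for a, b in zip(z1, z2)]

print("r(income, spending):", round(pearson(income, spending), 3))
print("r(pc1, pc2):", round(pearson(pc1, pc2), 3))
```

In a regression, `pc1` and `pc2` could stand in for the two collinear predictors, at the cost of coefficients that are harder to interpret in terms of the original variables.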
A non-technical analogy: A mother sees various bumps and shapes under a blanket at the bottom of a bed. When one shape moves toward the top of the bed, all the other bumps and shapes move toward the top also, so the mother concludes that what is under the blanket is a single thing - her child. Similarly, factor analysis takes as input a number of measures and tests, analogous to the bumps and shapes. Those that move together are considered a single thing, which it labels a factor. That is, in factor analysis the researcher is assuming that there is a "child" out there in the form of an underlying factor, and he or she takes simultaneous movement (correlation) as evidence of its existence. If correlation is spurious for some reason, this inference will be mistaken, of course, so it is important when conducting factor analysis that possible variables which might introduce spuriousness, such as anteceding causes, be taken into account.
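The analogy can be made concrete with a small simulation. This is a sketch under stated assumptions, not a full factor analysis: a single simulated latent factor (the "child") drives three hypothetical test scores named `verbal`, `numeric`, and `spatial`, while a fourth measure, `unrelated`, is pure noise. The loading of 0.8 and noise level of 0.6 are arbitrary illustrative values. The first principal component of the correlation matrix is extracted by power iteration, and its loadings show which measures "move together."

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n=2000, seed=0):
    """Three measures driven by one latent factor, plus one pure-noise measure."""
    rng = random.Random(seed)
    factor = [rng.gauss(0, 1) for _ in range(n)]  # the hidden "child"
    measures = {
        name: [0.8 * f + rng.gauss(0, 0.6) for f in factor]
        for name in ("verbal", "numeric", "spatial")  # hypothetical test names
    }
    measures["unrelated"] = [rng.gauss(0, 1) for _ in range(n)]
    return measures


def first_component(corr, iterations=200):
    """Largest eigenvalue and component loadings via power iteration."""
    k = len(corr)
    v = [1.0] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit-length vector gives the eigenvalue.
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)
    )
    # Loadings are the eigenvector scaled by the square root of the eigenvalue.
    loadings = [eigenvalue ** 0.5 * x for x in v]
    return eigenvalue, loadings


measures = simulate()
names = list(measures)
corr = [[pearson(measures[a], measures[b]) for b in names] for a in names]
eigenvalue, loadings = first_component(corr)

print("first eigenvalue:", round(eigenvalue, 2))
for name, loading in zip(names, loadings):
    print(f"{name:10s} loading = {loading:.2f}")
```

The three measures that share the latent factor load heavily on the first component, while the noise measure does not. Note that the simulation builds the factor in by construction; with real data, the same pattern of correlations could also arise from a spurious common cause, which is the caution raised in the analogy.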
New in the 2013 edition: Greatly restructured for better readability.
The full content is now available from Statistical Associates Publishers.
Below is the table of contents.

FACTOR ANALYSIS
Table of Contents
Overview
Data
Key Concepts and Terms
Exploratory factor analysis (EFA)
Exploratory vs. confirmatory factor analysis (CFA)
Factor Analytic Data Modes
R-mode factor analysis
Q-mode factor analysis
Other rarer modes of factor analysis
Types of factor extraction
Principal components analysis (PCA)
Principal factor analysis (PFA)
PCA and PFA compared
Other Extraction Methods
Types of factor rotation
Rotation methods
No rotation
Varimax rotation
Quartimax rotation
Equamax rotation
Direct oblimin (oblique) rotation
Promax rotation
Other rotation methods
Summary
Factor analysis in SPSS
Data setup
The “Factor” dialog
Descriptives and Options
Extraction
Rotation
Factor Scores
Statistical output in SPSS
Factor loadings
Plot of factor loadings (factor space plot)
Factor, component, pattern, and structure matrices
Communality
Uniqueness
Eigenvalues
Extraction sums of squared loadings
Trace
Factor scores
Bartlett scores
Saving factor scores
Criteria for number of factors to model
Parallel analysis
Other Criteria
Using reproduced correlation residuals to validate the choice of number of factors
Summary
Factor analysis in SAS
SAS interface
SAS syntax
Rotation methods in SAS
Statistical output in SAS
Factor loadings in SAS output
SAS output for communalities
SAS output for eigenvalues
SAS scree plot output
SAS factor loadings plots
Factor analysis in Stata
Stata interface
Importing data into Stata
Stata syntax
Statistical output in Stata
Stata output for eigenvalues
Factor loadings in Stata output
Stata output for communalities
Stata scree plot output
Stata loading plots
Categorical principal components analysis (CATPCA)
Overview
SPSS categorical principal components analysis
Data considerations
CATPCA user interface in SPSS
The “Optimal Scaling” dialog
The main CATPCA dialog
The “Discretize” button dialog
The “Missing” button dialog
The “Options” button dialog
The “Output” button dialog
The “Save” button dialog
The “Object” button dialog
The “Category” button dialog
The “Loading” button dialog
SPSS CATPCA statistical output
The “Model Summary” table
The “Component Loadings” table
The “Component Loadings” plot
The “Variance Accounted For” table
The “Object Points Labeled by Casenumbers” plot
The “Object Scores” table
The “Biplot Component Loadings and Objects” plot
The “Quantifications” table
The “Category Points” plot
The “Projected Centroids” table and plot
SAS categorical principal components analysis
Overview
SAS syntax
The PROC PRINQUAL procedure
SAS PROC PRINQUAL output
Principal components analysis of transformed data
Stata categorical principal components analysis
Overview
Example
The polychoric correlation matrix
The “Principal component analysis” table
The “Scoring Coefficients” table
Saved object scores
A second example using the factormat command
The structural equation modeling approach to factor analysis
Testing error in the measurement model
Redundancy test of one-factor vs. multi-factor models
Measurement invariance test comparing a model across groups
Orthogonality tests
Assumptions
Valid imputation of factor labels
Proper specification/no selection bias
No outliers
Continuous data
Linearity
Multivariate normality
Homoscedasticity
Orthogonality
Existence of underlying dimensions
Moderate to moderate-high intercorrelations without multicollinearity
Absence of high multicollinearity
No perfect multicollinearity
Sphericity
Adequate sample size
Frequently Asked Questions
How does factor analysis compare with cluster analysis and multidimensional scaling?
How many cases do I need to do factor analysis?
How do I input my data as a correlation matrix rather than raw data?
How many variables do I need in factor analysis? The more, the better?
What is KMO? What is it used for?
Why is normality not required for factor analysis when it is an assumption of correlation, on which factor analysis rests?
Is it necessary to standardize one's variables before applying factor analysis?
Can you pool data from two samples together in factor analysis?
How does factor comparison of the factor structure of two samples work?
Why is rotation of axes necessary?
Why are the factor scores I get the same when I request rotation and when I do not?
Why is oblique rotation less common in social science?
When should oblique rotation be used?
What is hierarchical factor analysis and how does it relate to oblique rotation?
How high does a factor loading have to be to consider that variable as a defining part of that factor?
What is simple factor structure, and is the simpler, the better?
How is factor analysis related to validity?
What is the matrix of standardized component scores, and for what might it be used in research?
What are the pros and cons of common factor analysis compared to PCA?
Why are my PCA results different in SAS compared to SPSS?
How do I do Q-mode factor analysis of cases rather than variables?
How else may I use factor analysis to identify clusters of cases and/or outliers?
Can factor analysis handle hierarchical/multilevel data?
What do I do if I want to factor categorical variables?
Bibliography

Page count: 131