Abstract
This chapter introduces exploratory and confirmatory factor analysis. It starts with a section on correlation coefficients since factor analytic techniques are based on covariance/correlation matrices. Special emphasis is on tetrachoric/polychoric correlations for ordinal input data. This is followed by elaborations on exploratory factor analysis including practical aspects such as determining the number of factors and rotation techniques to facilitate factor interpretation. A recent development is Bayesian exploratory factor analysis which, in addition to the loadings, also estimates the number of factors and allows them to be correlated. This approach is explored in a separate section. The second part of this chapter consists of a detailed treatment of confirmatory factor analysis which lays the groundwork for structural equation models presented in the next chapter. In confirmatory factor analysis, the number of factors and the assignment of indicators to factors are determined by substantive considerations. Several extensions in terms of multigroup, longitudinal, and multilevel settings are presented. The chapter concludes with a Bayesian approach to confirmatory factor analysis.
Notes
1. Eigenvalues will be introduced in Sect. 6.1.1.
2. This call gives a warning that the matrix is not positive definite.
3. In EFA, residuals are defined by \((\mathbf R - \hat {\mathbf P})\), where \(\mathbf R\) is the sample correlation matrix and \(\hat {\mathbf P}\) the estimated model correlation matrix.
4. An overview of rotation techniques and corresponding comparisons can be found in Browne (2001).
5. Thanks to Rémi Piatek and Sylvia Frühwirth-Schnatter for their support with this application.
6. In their original paper, Conti et al. (2014) use the more restrictive assumption of at least three manifest variables per active factor, to rule out potential identification problems due to extreme cases with zero correlation between some factors. With correlated factors, however, the weaker assumption of two manifest variables per factor is sufficient for identification.
7. We switch the notation for the input data (Y instead of X) in order to be consistent with the standard SEM model formulation presented in the next chapter.
8.
9. At the time this book was written, lavaan allowed for two-level structures only. Also, thanks to Yves Rosseel for sharing the code.
References
Bartholomew, D. J., & Knott, M. (1999). Latent variable models and factor analysis (2nd ed.). London: Hodder Arnold.
Bartholomew, D. J., Steele, F., Moustaki, I., & Galbraith, J. I. (2008). Analysis of multivariate social science data (2nd ed.). Boca Raton: CRC Press.
Bergh, R., Akrami, N., Sidanius, J., & Sibley, C. (2016). Is group membership necessary for understanding prejudice? A re-evaluation of generalized prejudice and its personality correlates. Journal of Personality and Social Psychology, 111, 367–395.
Browne, M. W. (2001). An overview of analytic rotation in exploratory factor analysis. Multivariate Behavioral Research, 36, 111–150.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In: K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136–162). Beverly Hills: Sage.
Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245–276.
Conti, G., Frühwirth-Schnatter, S., Heckman, J. J., & Piatek, R. (2014). Bayesian exploratory factor analysis. Journal of Econometrics, 183, 31–57.
Denwood, M. J. (2016). runjags: An R package providing interface utilities, model templates, parallel computing methods and additional distributions for MCMC models in JAGS. Journal of Statistical Software, 71(9), 1–25. https://www.jstatsoft.org/article/view/v071i09
Drasgow, F. (1986). Polychoric and polyserial correlations. In: S. Kotz & N. L. Johnson (Eds.), Encyclopedia of statistical sciences (Vol. 7, pp. 68–74). New York: Wiley.
Epskamp, S. (2015). semPlot: Unified visualizations of structural equation models. Structural Equation Modeling: A Multidisciplinary Journal, 22, 474–483.
Finney, S. J., & DiStefano, C. (2013). Nonnormal and categorical data in structural equation modeling. In: G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (2nd ed., pp. 439–492). Charlotte: Information Age Publishing.
Hendrickson, A. E., & White, P. O. (1964). PROMAX: A quick method for rotation to oblique simple structure. British Journal of Mathematical and Statistical Psychology, 17, 65–70.
Ho, A. K., Sidanius, J., Kteily, N., Sheehy-Skeffington, J., Pratto, F., Henkel, K. E., Foels, R., & Stewart, A. L. (2015). The nature of social dominance orientation: Theorizing and measuring preferences for intergroup inequality using the new SDO7 scale. Journal of Personality and Social Psychology, 109, 1003–1028.
Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30, 179–185.
Hox, J. J. (2010). Multilevel analysis: Techniques and applications (2nd ed.). New York: Routledge.
Hsu, H. Y., Kwok, O. M., Lin, J. H., & Acosta, S. (2015). Detecting misspecified multilevel structural equation models with common fit indices: A Monte Carlo study. Multivariate Behavioral Research, 50, 197–215.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55.
Jöreskog, K. G., & Goldberger, A. S. (1975). Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70, 631–639.
Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23, 187–200.
Kaiser, H. F. (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurement, 20, 141–151.
Kaplan, D. (2014). Bayesian statistics for the social sciences. New York: Guilford.
Kenny, D. A. (1979). Correlation and causality. New York: Wiley.
Kirk, D. B. (1973). On the numerical approximation of the bivariate normal (tetrachoric) correlation coefficient. Psychometrika, 38, 259–268.
Kline, R. B. (2016). Principles and practice of structural equation modeling (4th ed.). New York: Guilford Press.
Kruschke, J. (2014). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan (2nd ed.). Cambridge: Academic.
Little, T. D. (2013). Longitudinal structural equation modeling. New York: Guilford.
MacCallum, R. C. (2009). Factor analysis. In: R. E. Millsap & A. Maydeu-Olivares (Eds.), The Sage handbook of quantitative methods in psychology (pp. 123–177). London: Sage.
Mair, P., Hofmann, E., Gruber, K., Zeileis, A., & Hornik, K. (2015). Motivation, values, and work design as drivers of participation in the R open source project for statistical computing. Proceedings of the National Academy of Sciences of the United States of America, 112, 14788–14792.
McDonald, R., & Mulaik, S. A. (1979). Determinacy of common factors: A nontechnical review. Psychological Bulletin, 86, 430–445.
Merkle, E. C., & Rosseel, Y. (2018). blavaan: Bayesian structural equation models via parameter expansion. Journal of Statistical Software, 85(4), 1–30.
Muthén, B. O., & Hofacker, C. (1988). Testing the assumptions underlying tetrachoric correlations. Psychometrika, 53, 563–578.
Piatek, R. (2017). BayesFM: Bayesian inference for factor modeling. R package version 0.1.2. https://CRAN.R-project.org/package=BayesFM
Raiche, G., & Magis, D. (2011). nFactors: An R package for parallel analysis and non graphical solutions to the Cattell scree test. R package version 2.3.3. http://CRAN.R-project.org/package=nFactors
Revelle, W. (2015). An introduction to psychometric theory with applications in R. Freely available online. http://www.personality-project.org/r/book/
Revelle, W. (2017). psych: Procedures for psychological, psychometric, and personality research. R package version 1.7.8. http://CRAN.R-project.org/package=psych
Revelle, W., & Rocklin, T. (1979). Very simple structure: An alternative procedure for estimating the optimal number of interpretable factors. Multivariate Behavioral Research, 14, 403–414.
Rhemtulla, M., Brosseau-Liard, P. E., & Savalei, V. (2012). When can categorical variables be treated as continuous? A comparison of robust continuous and categorical SEM estimation methods under suboptimal conditions. Psychological Methods, 17, 354–373.
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36. http://www.jstatsoft.org/v48/i02/
Savalei, V. (2011). What to do about zero frequency cells when estimating polychoric correlations. Structural Equation Modeling: A Multidisciplinary Journal, 18, 253–273.
semTools Contributors. (2016). semTools: Useful tools for structural equation modeling. R package version 0.4-14. https://CRAN.R-project.org/package=semTools
Sidanius, J., & Pratto, F. (2001). Social dominance: An intergroup theory of social hierarchy and oppression. Cambridge: Cambridge University Press.
Sidanius, J., Levin, S., van Laar, C., & Sears, D. O. (2010). The diversity challenge: Social identity and intergroup relations on the college campus. New York: The Russell Sage Foundation.
Tisak, J., & Meredith, W. (1990). Longitudinal factor analysis. In: A. von Eye (Ed.), Statistical methods in longitudinal research (Vol. 1, pp. 125–149). San Diego: Academic.
Treiblmaier, H. (2006). Datenqualität und individualisierte Kommunikation [Data Quality and Individualized Communication]. Wiesbaden: DUV Gabler Edition Wissenschaft.
Treiblmaier, H., Bentler, P. M., & Mair, P. (2011). Formative constructs implemented via common factors. Structural Equation Modeling: A Multidisciplinary Journal, 18, 1–17.
Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38, 1–10.
Vaughn-Coaxum, R., Mair, P., & Weisz, J. R. (2016). Racial/ethnic differences in youth depression indicators: An item response theory analysis of symptoms reported by White, Black, Asian, and Latino youths. Clinical Psychological Science, 4, 239–253.
Velicer, W. F. (1976). Determining the number of components from the matrix of partial correlations. Psychometrika, 41, 321–327.
Wei, T., & Simko, V. (2016). corrplot: Visualization of a correlation matrix. R package version 0.77. https://CRAN.R-project.org/package=corrplot
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Mair, P. (2018). Factor Analysis. In: Modern Psychometrics with R. Use R!. Springer, Cham. https://doi.org/10.1007/978-3-319-93177-7_2
DOI: https://doi.org/10.1007/978-3-319-93177-7_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93175-3
Online ISBN: 978-3-319-93177-7
eBook Packages: Mathematics and Statistics (R0)