Factor Analysis

Part of the book series: Use R! (USE R)

Abstract

This chapter introduces exploratory and confirmatory factor analysis. It starts with a section on correlation coefficients, since factor analytic techniques are based on covariance/correlation matrices. Special emphasis is placed on tetrachoric/polychoric correlations for ordinal input data. This is followed by a treatment of exploratory factor analysis, including practical aspects such as determining the number of factors and rotation techniques that facilitate factor interpretation. A recent development is Bayesian exploratory factor analysis, which, in addition to the loadings, also estimates the number of factors and allows them to be correlated. This approach is explored in a separate section. The second part of the chapter consists of a detailed treatment of confirmatory factor analysis, which lays the groundwork for the structural equation models presented in the next chapter. In confirmatory factor analysis, the number of factors and the assignment of indicators to factors are determined by substantive considerations. Several extensions to multigroup, longitudinal, and multilevel settings are presented. The chapter concludes with a Bayesian approach to confirmatory factor analysis.

Notes

  1. Eigenvalues will be introduced in Sect. 6.1.1.

  2. This call gives a warning that the matrix is not positive definite.

  3. In EFA, residuals are defined by \((\mathbf R - \hat {\mathbf P})\), where \(\mathbf R\) is the sample correlation matrix and \(\hat {\mathbf P}\) the estimated model correlation matrix.

  4. An overview of rotation techniques and corresponding comparisons can be found in Browne (2001).

  5. Thanks to Rémi Piatek and Sylvia Frühwirth-Schnatter for their support with this application.

  6. In their original paper, Conti et al. (2014) use the more restrictive assumption of at least three manifest variables per active factor, to rule out potential identification problems due to extreme cases with zero correlation between some factors. With correlated factors, however, the weaker assumption of two manifest variables per factor is sufficient for identification.

  7. We switch the notation for the input data (Y instead of X) in order to be consistent with the standard SEM model formulation presented in the next chapter.

  8. Note that compared to Eq. (2.4) we slightly change the notation (i.e., Ψ instead of Φ, and Θ instead of Ψ) in order to be consistent with the names of the output objects in the lavaan package (Rosseel, 2012), which is used throughout this chapter.

  9. At the time this book was written, lavaan allowed for two-level structures only. Thanks also to Yves Rosseel for sharing the code.
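
The residual definition in note 3 can be illustrated with a minimal sketch in plain Python. Everything numeric here is an assumption made for illustration: the sample correlation matrix `R`, the loadings, and the restriction to an orthogonal one-factor model (so that the model correlation matrix has products of loadings off the diagonal and ones on the diagonal). None of these values come from the chapter, and the helper names `model_correlation` and `residual_matrix` are hypothetical.

```python
# Hypothetical sample correlation matrix R for three indicators
# (illustrative values only, not data from the chapter).
R = [
    [1.00, 0.58, 0.40],
    [0.58, 1.00, 0.33],
    [0.40, 0.33, 1.00],
]

# Hypothetical estimated loadings of a one-factor model (assumed values).
loadings = [0.8, 0.7, 0.5]


def model_correlation(loadings):
    """Model-implied correlation matrix of an orthogonal one-factor model.

    Off-diagonal entries are products of loadings; the diagonal equals 1
    because each uniqueness is taken to be 1 - loading**2.
    """
    p = len(loadings)
    return [
        [1.0 if i == j else loadings[i] * loadings[j] for j in range(p)]
        for i in range(p)
    ]


def residual_matrix(R, P_hat):
    """Element-wise difference R - P_hat."""
    return [
        [r - p for r, p in zip(r_row, p_row)]
        for r_row, p_row in zip(R, P_hat)
    ]


P_hat = model_correlation(loadings)
residuals = residual_matrix(R, P_hat)

# Print the residual matrix rounded to two decimals.
for row in residuals:
    print([round(x, 2) for x in row])
```

Off-diagonal residuals close to zero indicate that the assumed loadings reproduce the observed correlations well; larger absolute values point to indicator pairs the model fits poorly.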

References

  • Bartholomew, D. J., & Knott, M. (1999). Latent variable models and factor analysis (2nd ed.). London: Hodder Arnold.

  • Bartholomew, D. J., Steele, F., Moustaki, I., & Galbraith, J. I. (2008). Analysis of multivariate social science data (2nd ed.). Boca Raton: CRC Press.

  • Bergh, R., Akrami, N., Sidanius, J., & Sibley, C. (2016). Is group membership necessary for understanding prejudice? A re-evaluation of generalized prejudice and its personality correlates. Journal of Personality and Social Psychology, 111, 367–395.

  • Browne, M. W. (2001). An overview of analytic rotation in exploratory factor analysis. Multivariate Behavioral Research, 36, 111–150.

  • Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In: K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136–162). Beverly Hills: Sage.

  • Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245–276.

  • Conti, G., Frühwirth-Schnatter, S., Heckman, J. J., & Piatek, R. (2014). Bayesian exploratory factor analysis. Journal of Econometrics, 183, 31–57.

  • Denwood, M. J. (2016). runjags: An R package providing interface utilities, model templates, parallel computing methods and additional distributions for MCMC models in JAGS. Journal of Statistical Software, 71(9), 1–25. https://www.jstatsoft.org/article/view/v071i09

  • Drasgow, F. (1986). Polychoric and polyserial correlations. In: S. Kotz & N. L. Johnson (Eds.), Encyclopedia of statistical sciences (Vol. 7, pp. 68–74). New York: Wiley.

  • Epskamp, S. (2015). semPlot: Unified visualizations of structural equation models. Structural Equation Modeling: A Multidisciplinary Journal, 22, 474–483.

  • Finney, S. J., & DiStefano, C. (2013). Nonnormal and categorical data in structural equation modeling. In: G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (2nd ed., pp. 439–492). Charlotte: Information Age Publishing.

  • Hendrickson, A. E., & White, P. O. (1964). PROMAX: A quick method for rotation to oblique simple structure. British Journal of Mathematical and Statistical Psychology, 17, 65–70.

  • Ho, A. K., Sidanius, J., Kteily, N., Sheehy-Skeffington, J., Pratto, F., Henkel, K. E., Foels, R., & Stewart, A. L. (2015). The nature of social dominance orientation: Theorizing and measuring preferences for intergroup inequality using the new SDO7 scale. Journal of Personality and Social Psychology, 109, 1003–1028.

  • Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30, 179–185.

  • Hox, J. J. (2010). Multilevel analysis: Techniques and applications (2nd ed.). New York: Routledge.

  • Hsu, H. Y., Kwok, O. M., Lin, J. H., & Acosta, S. (2015). Detecting misspecified multilevel structural equation models with common fit indices: A Monte Carlo study. Multivariate Behavioral Research, 50, 197–215.

  • Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55.

  • Jöreskog, K. G., & Goldberger, A. S. (1975). Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70, 631–639.

  • Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23, 187–200.

  • Kaiser, H. F. (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurement, 20, 141–151.

  • Kaplan, D. (2014). Bayesian statistics for the social sciences. New York: Guilford.

  • Kenny, D. A. (1979). Correlation and causality. New York: Wiley.

  • Kirk, D. B. (1973). On the numerical approximation of the bivariate normal (tetrachoric) correlation coefficient. Psychometrika, 38, 259–268.

  • Kline, R. B. (2016). Principles and practice of structural equation modeling (4th ed.). New York: Guilford Press.

  • Kruschke, J. (2014). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan (2nd ed.). Cambridge: Academic.

  • Little, T. D. (2013). Longitudinal structural equation modeling. New York: Guilford.

  • MacCallum, R. C. (2009). Factor analysis. In: R. E. Millsap & A. Maydeu-Olivares (Eds.), The Sage handbook of quantitative methods in psychology (pp. 123–177). London: Sage.

  • Mair, P., Hofmann, E., Gruber, K., Zeileis, A., & Hornik, K. (2015). Motivation, values, and work design as drivers of participation in the R open source project for statistical computing. Proceedings of the National Academy of Sciences of the United States of America, 112, 14788–14792.

  • McDonald, R., & Mulaik, S. A. (1979). Determinacy of common factors: A nontechnical review. Psychological Bulletin, 86, 430–445.

  • Merkle, E. C., & Rosseel, Y. (2018). blavaan: Bayesian structural equation models via parameter expansion. Journal of Statistical Software, 85(4), 1–30.

  • Muthén, B. O., & Hofacker, C. (1988). Testing the assumptions underlying tetrachoric correlations. Psychometrika, 53, 563–578.

  • Piatek, R. (2017). BayesFM: Bayesian inference for factor modeling. R package version 0.1.2. https://CRAN.R-project.org/package=BayesFM

  • Raiche, G., & Magis, D. (2011). nFactors: An R package for parallel analysis and non graphical solutions to the Cattell scree test. R package version 2.3.3. http://CRAN.R-project.org/package=nFactors

  • Revelle, W. (2015). An introduction to psychometric theory with applications in R. Freely available online. http://www.personality-project.org/r/book/

  • Revelle, W. (2017). psych: Procedures for psychological, psychometric, and personality research. R package version 1.7.8. http://CRAN.R-project.org/package=psych

  • Revelle, W., & Rocklin, T. (1979). Very simple structure: An alternative procedure for estimating the optimal number of interpretable factors. Multivariate Behavioral Research, 14, 403–414.

  • Rhemtulla, M., Brosseau-Liard, P. E., & Savalei, V. (2012). When can categorical variables be treated as continuous? A comparison of robust continuous and categorical SEM estimation methods under suboptimal conditions. Psychological Methods, 17, 354–373.

  • Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36. http://www.jstatsoft.org/v48/i02/

  • Savalei, V. (2011). What to do about zero frequency cells when estimating polychoric correlations. Structural Equation Modeling: A Multidisciplinary Journal, 18, 253–273.

  • semTools Contributors. (2016). semTools: Useful tools for structural equation modeling. R package version 0.4-14. https://CRAN.R-project.org/package=semTools

  • Sidanius, J., & Pratto, F. (2001). Social dominance: An intergroup theory of social hierarchy and oppression. Cambridge: Cambridge University Press.

  • Sidanius, J., Levin, S., van Laar, C., & Sears, D. O. (2010). The diversity challenge: Social identity and intergroup relations on the college campus. New York: The Russell Sage Foundation.

  • Tisak, J., & Meredith, W. (1990). Longitudinal factor analysis. In: A. von Eye (Ed.), Statistical methods in longitudinal research (Vol. 1, pp. 125–149). San Diego: Academic.

  • Treiblmaier, H. (2006). Datenqualität und individualisierte Kommunikation [Data Quality and Individualized Communication]. Wiesbaden: DUV Gabler Edition Wissenschaft.

  • Treiblmaier, H., Bentler, P. M., & Mair, P. (2011). Formative constructs implemented via common factors. Structural Equation Modeling: A Multidisciplinary Journal, 18, 1–17.

  • Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38, 1–10.

  • Vaughn-Coaxum, R., Mair, P., & Weisz, J. R. (2016). Racial/ethnic differences in youth depression indicators: An item response theory analysis of symptoms reported by White, Black, Asian, and Latino youths. Clinical Psychological Science, 4, 239–253.

  • Velicer, W. F. (1976). Determining the number of components from the matrix of partial correlations. Psychometrika, 41, 321–327.

  • Wei, T., & Simko, V. (2016). corrplot: Visualization of a correlation matrix. R package version 0.77. https://CRAN.R-project.org/package=corrplot

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Mair, P. (2018). Factor Analysis. In: Modern Psychometrics with R. Use R!. Springer, Cham. https://doi.org/10.1007/978-3-319-93177-7_2
