A Comparison of Confirmatory Factor Analysis of Binary Data on the Basis of Tetrachoric Correlations and of Probability-Based Covariances: A Simulation Study

  • Karl Schweizer
  • Xuezhu Ren
  • Tengfei Wang
Part of the Springer Proceedings in Mathematics & Statistics book series (PROMS, volume 89)


Although tetrachoric correlations provide a theoretically well-founded basis for investigating binary data by means of confirmatory factor analysis according to the congeneric model, the outcome does not always meet expectations. An analysis of the procedure for computing tetrachoric correlations suggests that the data must be of high quality to achieve good results. A simulation study demonstrated that such quality could be established by a very large sample size. Robust maximum likelihood estimation improved model-data fit but not the appropriateness of the factor loadings. In contrast, probability-based covariances and probability-based correlations as input to confirmatory factor analysis yielded a good model-data fit at all sample sizes. In addition, probability-based covariances in combination with the weighted congeneric model performed best with respect to the absence of dependency on item marginals in the factor loadings, whereas probability-based correlations did not. The results demonstrate that it is possible to find a link function that enables the use of probability-based covariances for the investigation of binary data.
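As a rough illustration of the input statistics named above, the following Python sketch computes a probability-based covariance and the corresponding correlation for two binary items. It assumes the usual definition, in which the covariance is the joint probability of two correct responses minus the product of the two marginal probabilities, and the correlation is that covariance standardized by the binomial standard deviations. The function names and the toy data are hypothetical and are not taken from the chapter.

```python
from math import sqrt


def probability_based_covariance(data, i, j):
    """Covariance of binary items i and j from response probabilities.

    Assumed definition: P(X_i = 1 and X_j = 1) - P(X_i = 1) * P(X_j = 1).
    `data` is a list of rows, each row a list of 0/1 responses.
    """
    n = len(data)
    p_i = sum(row[i] for row in data) / n          # marginal of item i
    p_j = sum(row[j] for row in data) / n          # marginal of item j
    p_ij = sum(row[i] * row[j] for row in data) / n  # joint probability
    return p_ij - p_i * p_j


def probability_based_correlation(data, i, j):
    """Covariance standardized by the binomial standard deviations.

    Undefined (division by zero) if an item has no variance.
    """
    n = len(data)
    p_i = sum(row[i] for row in data) / n
    p_j = sum(row[j] for row in data) / n
    cov = probability_based_covariance(data, i, j)
    return cov / sqrt(p_i * (1 - p_i) * p_j * (1 - p_j))


# Hypothetical toy data: four respondents, two items.
independent = [[1, 1], [1, 0], [0, 1], [0, 0]]
identical = [[1, 1], [1, 1], [0, 0], [0, 0]]

cov_independent = probability_based_covariance(independent, 0, 1)
cov_identical = probability_based_covariance(identical, 0, 1)
corr_identical = probability_based_correlation(identical, 0, 1)
```

Because the covariance depends on the item marginals through the term p(1 - p), it shrinks as items become very easy or very hard. This is the kind of dependency that, according to the abstract, the weighted congeneric model is meant to keep out of the factor loadings.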


Confirmatory factor analysis; Binary data; Congeneric model; Weighted congeneric model; Tetrachoric correlation; Probability-based covariance; Link function



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Psychology, Goethe University Frankfurt, Frankfurt a. M., Germany
  2. Institute of Psychology, Huazhong University of Science and Technology, Wuhan, China
