Evaluating methods for handling missing ordinal data in structural equation modeling

  • Fan Jia
  • Wei Wu

Abstract

Missing ordinal data are common in studies using structural equation modeling (SEM). Although several methods for handling missing ordinal data are available, they have often not been systematically evaluated in SEM. In this study, we used Monte Carlo simulation to evaluate and compare five existing methods for handling missing ordinal data: one direct robust estimation method and four multiple imputation methods. On the basis of the simulation results, we offer researchers practical guidance on how best to handle missing ordinal data in SEM.

Keywords

Missing ordinal data, Structural equation modeling, Robust estimation, Multiple imputation

Copyright information

© The Psychonomic Society, Inc. 2019

Authors and Affiliations

  1. University of Kansas, Lawrence, USA
  2. Indiana University–Purdue University Indianapolis, Indianapolis, USA
