Robustness of Mixture IRT Models to Violations of Latent Normality

  • Sedat Sen
  • Allan S. Cohen
  • Seock-Ho Kim
Part of the Springer Proceedings in Mathematics & Statistics book series (PROMS, volume 89)


Unidimensional item response theory (IRT) models assume that a single model applies to all people in the population. Mixture IRT models can be useful when subpopulations are suspected. The usual mixture IRT model is estimated under the assumption that latent ability is normally distributed. Research on normal finite mixture models suggests that latent classes can potentially be extracted even in the absence of population heterogeneity if the distribution of the data is nonnormal. Empirical evidence suggests, in fact, that test data may not always be normal. In this study, we examined the sensitivity of mixture IRT models to latent nonnormality. Single-class IRT data sets were generated using different ability distributions and then analyzed with mixture IRT models to determine the impact of these distributions on the extraction of latent classes. Preliminary results suggest that estimation of mixed Rasch models produced spurious latent classes when the ability distribution was bimodal or uniform. Mixture 2PL and mixture 3PL IRT models were found to be more robust to nonnormal latent ability distributions. Two popular information criterion indices, Akaike’s information criterion (AIC) and the Bayesian information criterion (BIC), were used to inform model selection. For most conditions, the BIC performed better than the AIC in selecting the correct model.
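
The simulation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the sample size, test length, item parameters, the particular bimodal mixture, and the function names (draw_ability, simulate_2pl, aic, bic) are all assumptions chosen for illustration. The sketch draws abilities from a normal, a uniform, and a bimodal distribution (each standardized to mean 0 and variance 1), generates single-class dichotomous responses, and defines AIC and BIC in their standard forms.

```python
import math
import numpy as np


def draw_ability(shape, n, rng):
    """Draw n latent abilities; every shape has mean 0 and variance 1."""
    if shape == "normal":
        return rng.standard_normal(n)
    if shape == "uniform":
        # Uniform on [-sqrt(3), sqrt(3)] has mean 0 and variance 1.
        half_width = math.sqrt(3.0)
        return rng.uniform(-half_width, half_width, n)
    if shape == "bimodal":
        # Assumed form: equal mixture of N(-m, s^2) and N(m, s^2)
        # with m^2 + s^2 = 1, so the overall variance is 1.
        m = 0.9
        s = math.sqrt(1.0 - m ** 2)
        signs = rng.choice([-1.0, 1.0], size=n)
        return signs * m + s * rng.standard_normal(n)
    raise ValueError(f"unknown shape: {shape}")


def simulate_2pl(theta, a, b, rng):
    """Simulate 0/1 responses from a single-class 2PL model.

    Setting every discrimination in `a` to 1 gives the Rasch model.
    """
    logits = a[None, :] * (theta[:, None] - b[None, :])
    prob = 1.0 / (1.0 + np.exp(-logits))
    return (rng.random(prob.shape) < prob).astype(int)


def aic(log_lik, n_params):
    """Akaike's information criterion: -2 log L + 2 p."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: -2 log L + p log N."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


rng = np.random.default_rng(2015)   # arbitrary seed
n_persons, n_items = 1000, 20       # illustrative sizes, not the study design
a = np.ones(n_items)                # Rasch-type items (assumption)
b = np.linspace(-2.0, 2.0, n_items) # evenly spaced difficulties (assumption)

data = {
    shape: simulate_2pl(draw_ability(shape, n_persons, rng), a, b, rng)
    for shape in ("normal", "uniform", "bimodal")
}
```

Each simulated data set would then be fit with one-class and multi-class mixture IRT models in whatever estimation software is available, and the solution with the smallest AIC or BIC selected. Because the data contain only one class, any selected solution with two or more classes is spurious. The BIC's per-parameter penalty, log N, exceeds the AIC's penalty of 2 whenever N is 8 or more, which is one reason it tends to favor fewer classes.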



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. University of Georgia, Athens, USA
  2. Harran University, Sanliurfa, Turkey (Sedat Sen) and University of Georgia, Georgia, USA
