Nonparametric Item Response Theory and Mokken Scale Analysis, with Relations to Latent Class Models and Cognitive Diagnostic Models

  • L. Andries van der Ark
  • Gina Rossi
  • Klaas Sijtsma
Part of the Methodology of Educational Measurement and Assessment book series (MEMA)


This chapter focuses on nonparametric item response theory for ordinal person scales, specifically the monotone homogeneity model and Mokken scale analysis, the data-analysis procedure used to investigate whether the monotone homogeneity model is consistent with the data. Next, we discuss the unrestricted latent class model as an even more liberal model for investigating the scalability of a set of items; this model produces nominal scales. We also discuss an ordered latent class model that one can use to investigate assumptions about item response functions in the monotone homogeneity model and other nonparametric item response models. Finally, we discuss cognitive diagnostic models, which are the core of this volume and a further elaboration of latent class models, providing diagnostic information about the people who responded to a set of items. A data analysis example, using item scores of 1210 respondents on 44 items from the Millon Clinical Multiaxial Inventory III, demonstrates how the monotone homogeneity model, the latent class model, and two cognitive diagnostic models can be used jointly to understand one’s data.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • L. Andries van der Ark (1)
  • Gina Rossi (2)
  • Klaas Sijtsma (3)
  1. Research Institute of Child Development and Education, University of Amsterdam, Amsterdam, The Netherlands
  2. Research Group Personality and Psychopathology, Vrije Universiteit Brussel, Brussels, Belgium
  3. Department of Methodology and Statistics, TSB, Tilburg University, Tilburg, The Netherlands
