Abstract
Recent years have seen a wave of new assessment designs, measurement methods, and frameworks that connect psychometrics with cognitive science, driven by the need to enhance both traditional and new assessments so that they provide more information about examinees and about the quality of the assessment tools themselves. The purpose of this chapter is to explore the use of a set of guidelines developed for retrofitting cognitive diagnosis models (CDMs), using data from the 2007 TIMSS test administration as an example. The study addresses three research questions: Is a retrofitting approach feasible with TIMSS data? Does relative model fit improve when CDMs are used instead of IRT models? What additional information about examinees' skills and about the items is gained from CDM retrofitting?
Acknowledgements
Dr. Luna Bazaldua thanks UNAM for the PAPIIT research grant IA303018.
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Lee, YS., Luna-Bazaldua, D.A. (2019). How to Conduct a Study with Diagnostic Models. In: von Davier, M., Lee, YS. (eds) Handbook of Diagnostic Classification Models. Methodology of Educational Measurement and Assessment. Springer, Cham. https://doi.org/10.1007/978-3-030-05584-4_25
Print ISBN: 978-3-030-05583-7
Online ISBN: 978-3-030-05584-4