Abstract
In response to the ever-increasing demand for diagnostic assessments that can provide more informative feedback about students’ knowledge states, assessment design frameworks are needed that help designers incorporate relevant cognitive theories into the development, implementation, and analysis process. In this chapter, we describe one prominent framework for principled diagnostic assessment design, evidence-centered design (ECD) (e.g., Mislevy et al., A brief introduction to evidence-centered design, CSE Technical Report 632, Los Angeles: The National Center for Research on Evaluation, Standards, and Student Testing (CRESST), UCLA, 2004), as well as a class of statistical models called diagnostic classification models (DCMs) (e.g., Rupp et al., Diagnostic measurement: Theory, methods, and applications, New York: The Guilford Press, 2010) that can be used to make inferences about student profiles within this framework. With respect to DCMs, we describe key terminology and concepts along with a unified estimation framework known as the log-linear cognitive diagnosis model (LCDM) (Henson et al., Psychometrika 74(2):191–210, 2009). We present three examples that show how particular DCMs can be specified to reflect different cognitive theories about how knowledge is processed. At the end of the chapter, we use a real data set on arithmetic ability in elementary school to illustrate the kinds of diagnostic inferences about students’ attribute profiles that DCMs support.
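To make the LCDM mentioned in the abstract concrete, the sketch below computes the probability of a correct response for an item measuring two attributes, following the general log-linear form in Henson et al. (2009): an intercept, two main effects, and a two-way interaction on the logit scale. The parameter values are invented purely for illustration.

```python
import math

def lcdm_item_prob(alpha, lambdas):
    """Probability of a correct response under a two-attribute LCDM item.

    alpha   : tuple (a1, a2) of attribute mastery indicators (0 or 1)
    lambdas : dict with the intercept, two main effects, and an interaction
    """
    a1, a2 = alpha
    logit = (lambdas["intercept"]
             + lambdas["main1"] * a1
             + lambdas["main2"] * a2
             + lambdas["interaction"] * a1 * a2)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical parameter values for a single item (not from the chapter's data)
params = {"intercept": -2.0, "main1": 1.5, "main2": 1.0, "interaction": 2.0}

# Response probabilities for all four attribute profiles
for profile in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(profile, round(lcdm_item_prob(profile, params), 3))
```

Constrained versions of this kernel yield the specific DCMs discussed in the chapter: forcing the main effects to zero so that only the interaction raises the success probability gives a DINA-like (conjunctive) item, while dropping the interaction gives a compensatory item.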
References
Agresti, A. (2010). Categorical data analysis (2nd ed.). New York: Wiley.
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723.
Buck, G., & Tatsuoka, K. K. (1998). Application of the rule-space procedure to language testing: Examining attributes of a free response listening test. Language Testing, 15(2), 119–157.
Choi, H.-J., Templin, J. L., Cohen, A. S., & Atwood, C. H. (2010, April). The impact of model misspecification on estimation accuracy in diagnostic classification models (DCMs). Paper presented at the annual meeting of the National Council on Measurement in Education, Denver, CO.
Crocker, L., & Algina, J. (2006). Introduction to classical and modern test theory. Pacific Grove: Wadsworth.
de Ayala, R. J. (2009). The theory and practice of item response theory. New York: Guilford Press.
de la Torre, J. (2009). DINA model and parameter estimation: A didactic. Journal of Educational and Behavioral Statistics, 34(1), 115–130.
DiBello, L., Roussos, L., & Stout, W. (2007). Review of cognitively diagnostic assessment and a summary of psychometric models. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics (Vol. 26, pp. 979–1030). Amsterdam: Elsevier.
Embretson, S. E. (1998). A cognitive design system approach to generating valid tests: Application to abstract reasoning. Psychological Methods, 3, 380–396.
Frey, A., Hartig, J., & Rupp, A. (2009). An NCME instructional module on booklet designs in large-scale assessments of student achievement: Theory and practice. Educational Measurement: Issues and Practice, 28, 39–53.
Gan, G., Ma, C., & Wu, J. (2007). Data clustering: Theory, algorithms, and applications. Alexandria: American Statistical Association.
Gierl, M. J., Tan, X., & Wang, C. (2005). Identifying content and cognitive dimensions on the SAT (Research Rep. No. 2005–11). New York: College Entrance Examination Board.
Hartz, S. M. (2002). A Bayesian framework for the unified model for assessing cognitive abilities: Blending theory with practicality. Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign.
Henson, R. A., Templin, J. L., & Willse, J. T. (2009). Defining a family of cognitive diagnosis models using log-linear models with latent variables. Psychometrika, 74(2), 191–210.
Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Applied Psychological Measurement, 25(3), 258–272.
Kunina-Habenicht, O., Rupp, A. A., & Wilhelm, O. (2009). A practical illustration of multidimensional diagnostic skills profiling: Comparing results from confirmatory factor analysis and diagnostic classification models. Studies in Educational Evaluation, 35, 64–70.
Kunina-Habenicht, O., Rupp, A., & Wilhelm, O. (2010, May). Modelling the latent structure of a diagnostic mathematics assessment within a general log-linear modelling framework. Presented at the annual meeting of the National Council on Measurement in Education (NCME), Denver, CO.
Kunina-Habenicht, O., Rupp, A. A., & Wilhelm, O. (2012). The impact of model misspecification on parameter estimation and item-fit assessment in log-linear diagnostic classification models. Journal of Educational Measurement, 49, 59–81.
Leighton, J., & Gierl, M. (2007). Cognitive diagnostic assessment for education: Theory and applications. Cambridge: Cambridge University Press.
Leighton, J. P., & Gierl, M. J. (2011). The learning sciences in educational assessment: The role of cognitive models. New York: Cambridge University Press.
Levy, R., & Mislevy, R. J. (2004). Specifying and refining a measurement model for a computer-based interactive assessment. International Journal of Testing, 4, 333–369.
Linn, R. L. (1986). Testing and assessment in education: Policy issues. American Psychologist, 41, 1153–1160.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading: Addison-Wesley.
McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah: Erlbaum.
McLachlan, G., & Peel, D. (2000). Finite mixture models. New York: Wiley.
Mislevy, R. J., & Riconscente, M. (2005). Evidence-centered assessment design: Layers, structures, and terminology (PADI Technical Rep. 9). Menlo Park: SRI International.
Mislevy, R. J., Steinberg, L. S., & Almond, R. G. (2003). On the structure of educational assessments. Measurement: Interdisciplinary Research and Perspectives, 1, 3–62.
Mislevy, R. J., Almond, R. G., & Lukas, J. F. (2004). A brief introduction to evidence-centered design (CSE Technical Rep. 632). Los Angeles: The National Center for Research on Evaluation, Standards, and Student Testing (CRESST), UCLA.
Mok, M. M. C. (2010). Self-directed learning oriented assessment: Assessment that informs learning & empowers the learner. Hong Kong: Pace Publications Ltd.
Muthén, L. K., & Muthén, B. O. (1998–2010). Mplus (Version 6) [Computer software]. Los Angeles: Muthén & Muthén.
National Research Council. (2001). Knowing what students know: The science and design of educational assessment. Washington, DC: National Academy Press.
Nugent, R., Ayers, E., & Dean, N. (2009). Conditional subspace clustering of skill mastery: Identifying skills that separate students. In Proceedings from the 2nd international conference on educational data mining (pp. 101–110). Retrieved July 19, 2010, from www.educationaldatamining.org/EDM2009/
Nugent, R., Dean, N., & Ayers, E. (2010). Skill set profile clustering: The empty K-means algorithm with automatic specification of starting cluster centers. In Proceedings from the 3rd international conference on educational data mining (pp. 151–160). Retrieved July 19, 2010, from http://educationaldatamining.org/EDM2010/
O’Reilly, T. P., Sheehan, K. M., & Bauer, M. I. (2008, March). Cognitively based assessments of, for, and as learning: Bridging the gap between research and practice. Presented at the annual meeting of the American Educational Research Association, New York.
Reckase, M. (2009). Multidimensional item response theory. New York: Springer.
Roussos, L. A., DiBello, L. V., Stout, W., Hartz, S. M., Henson, R. A., & Templin, J. L. (2007). The fusion model skills diagnosis system. In J. Leighton & M. Gierl (Eds.), Cognitive diagnostic assessment for education: Theory and applications (pp. 275–318). Cambridge: Cambridge University Press.
Rupp, A., & Templin, J. (2008). The effects of Q-matrix misspecification on parameter estimates and classification accuracy in the DINA model. Educational and Psychological Measurement, 68, 78–96.
Rupp, A. A., Templin, J., & Henson, R. A. (2010). Diagnostic measurement: Theory, methods, and applications. New York: The Guilford Press.
Rupp, A. A., Levy, R., DiCerbo, K., Sweet, S., et al. (in press). Putting ECD into practice: The interplay of theory and data in evidence models within a digital learning environment. Journal of Educational Data Mining.
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461–464.
Skrondal, A., & Rabe-Hesketh, S. (2004). Generalized latent variable modeling: Multilevel, longitudinal and structural equation models. Boca Raton: Chapman & Hall/CRC.
Steinley, D. (2006). K-means clustering: A half-century synthesis. British Journal of Mathematical and Statistical Psychology, 59, 1–34.
Tatsuoka, K. K. (1990). Toward an integration of item response theory and cognitive error diagnosis. In N. Frederiksen, R. Glaser, A. Lesgold, & M. G. Shafto (Eds.), Diagnostic monitoring of skill and knowledge acquisition (pp. 453–488). Hillsdale: Erlbaum.
Templin, J. L. (2004). Generalized linear mixed proficiency models. Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign.
Templin, J. L., & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11(3), 287–305.
Templin, J. L., Henson, R. A., & Douglas, J. (2011). General theory and estimation of cognitive diagnosis models: Using Mplus to derive model estimates.
Thadani, V., Stevens, R. H., & Tao, A. (2009). Measuring complex features of science instruction: Developing tools to investigate the link between teaching and learning. The Journal of the Learning Sciences, 18, 285–322.
Toulmin, S. E. (1958). The uses of argument. Cambridge: Cambridge University Press.
von Davier, M. (2005). A general diagnostic model applied to language testing data (ETS Research Rep. No. RR-05-16). Princeton: Educational Testing Service.
von Davier, M. (2006). Multidimensional latent trait modelling (MDLTM) [Software program]. Princeton: Educational Testing Service.
von Davier, M. (2010). Hierarchical mixtures of diagnostic models. Psychological Test and Assessment Modeling, 52, 8–28.
West, P., Rutstein, D. W., Mislevy, R. J., Liu, J., Levy, R., DiCerbo, K. E., Crawford, A., Choi, Y., & Behrens, J. (2009, June). A Bayes net approach to modeling learning progressions and task performances. Paper presented at the Learning Progressions in Science (LeaPS) conference, Iowa City, IA.
Yen, W. M., & Fitzpatrick, A. R. (2006). Item response theory. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 111–153). Westport: American Council on Education.
Acknowledgements
We would like to thank Olga Kunina-Habenicht for giving us access to the data that was used for the example in the last section in this chapter. The work of Dr. Kunina-Habenicht, including the design, implementation, and analysis, was funded, in part, by grant number RU-424/3–1 from the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) in the Priority Research Program entitled “Models of Competencies.”
Copyright information
© 2012 Springer Science+Business Media Dordrecht
Cite this chapter
Choi, HJ., Rupp, A.A., Pan, M. (2012). Standardized Diagnostic Assessment Design and Analysis: Key Ideas from Modern Measurement Theory. In: Mok, M. (eds) Self-directed Learning Oriented Assessments in the Asia-Pacific. Education in the Asia-Pacific Region: Issues, Concerns and Prospects, vol 18. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-4507-0_4
Print ISBN: 978-94-007-4506-3
Online ISBN: 978-94-007-4507-0