Standardized Diagnostic Assessment Design and Analysis: Key Ideas from Modern Measurement Theory

Part of the book series: Education in the Asia-Pacific Region: Issues, Concerns and Prospects (EDAP, volume 18)

Abstract

In response to the growing demand for diagnostic assessments that provide more informative feedback about students’ knowledge states, assessment design frameworks are needed that help designers incorporate relevant cognitive theories into the development, implementation, and analysis process. In this chapter, we describe one prominent framework for principled diagnostic assessment design, evidence-centered design (ECD) (e.g., Mislevy et al. A brief introduction to evidence-centered design. CSE Technical Report 632. Los Angeles: The National Center for Research on Evaluation, Standards, and Student Testing (CRESST), Center for the Study of Evaluation, UCLA, 2004), as well as a class of statistical models called diagnostic classification models (DCMs) (e.g., Rupp et al. Diagnostic measurement: theory, methods, and applications. The Guilford Press, New York, 2010) that can be used to make inferences about student profiles within this framework. With respect to DCMs, we describe key terminology, concepts, and a unified estimation framework known as the log-linear cognitive diagnosis model (LCDM) (Henson et al. Psychometrika 74(2):191–210, 2009). We present three examples to illustrate how particular DCMs can be specified to reflect different cognitive theories of how knowledge is processed. At the end of the chapter, we use a real data set on arithmetic ability in elementary school to show the types of diagnostic inferences that DCMs support about students’ attribute profiles.

References

  • Agresti, A. (2010). Categorical data analysis (2nd ed.). New York: Wiley.

  • Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723.

  • Buck, G., & Tatsuoka, K. K. (1998). Application of the rule-space procedure to language testing: Examining attributes of a free response listening test. Language Testing, 15(2), 119–157.

  • Choi, H.-J., Templin, J. L., Cohen, A. S., & Atwood, C. H. (2010, April). The impact of model misspecification on estimation accuracy in diagnostic classification models (DCMs). Paper presented at the annual meeting of the National Council on Measurement in Education, Denver, CO.

  • Crocker, L., & Algina, J. (2006). Introduction to classical and modern test theory. Pacific Grove: Wadsworth.

  • de Ayala, R. J. (2009). The theory and practice of item response theory. New York: Guilford Press.

  • de la Torre, J. (2009). DINA model and parameter estimation: A didactic. Journal of Educational and Behavioral Statistics, 34(1), 115–130.

  • DiBello, L., Roussos, L., & Stout, W. (2007). Review of cognitively diagnostic assessment and a summary of psychometric models. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics (Vol. 26, pp. 979–1030). Amsterdam: Elsevier.

  • Embretson, S. E. (1998). A cognitive design system approach to generating valid tests: Application to abstract reasoning. Psychological Methods, 3, 380–396.

  • Frey, A., Hartig, J., & Rupp, A. (2009). An NCME instructional module on booklet designs in large-scale assessments of student achievement: Theory and practice. Educational Measurement: Issues and Practice, 28, 39–53.

  • Gan, G., Ma, C., & Wu, J. (2007). Data clustering: Theory, algorithms, and applications. Alexandria: American Statistical Association.

  • Gierl, M. J., Tan, X., & Wang, C. (2005). Identifying content and cognitive dimensions on the SAT (Research Rep. No. 2005-11). New York: College Entrance Examination Board.

  • Hartz, S. M. (2002). A Bayesian framework for the unified model for assessing cognitive abilities: Blending theory with practicality. Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign.

  • Henson, R. A., Templin, J. L., & Willse, J. T. (2009). Defining a family of cognitive diagnosis models using log-linear models with latent variables. Psychometrika, 74(2), 191–210.

  • Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Applied Psychological Measurement, 25(3), 258–272.

  • Kunina-Habenicht, O., Rupp, A., & Wilhelm, O. (2010, May). Modelling the latent structure of a diagnostic mathematics assessment within a general log-linear modelling framework. Paper presented at the annual meeting of the National Council on Measurement in Education (NCME), Denver, CO.

  • Kunina-Habenicht, O., Rupp, A. A., & Wilhelm, O. (2012). The impact of model misspecification on parameter estimation and item-fit assessment in log-linear diagnostic classification models. Journal of Educational Measurement, 49, 59–81.

  • Kunina-Habenicht, O., Rupp, A. A., & Wilhelm, O. (2009). A practical illustration of multidimensional diagnostic skills profiling: Comparing results from confirmatory factor analysis and diagnostic classification models. Studies in Educational Evaluation, 35, 64–70.

  • Leighton, J., & Gierl, M. (2007). Cognitive diagnostic assessment for education: Theory and applications. Cambridge: Cambridge University Press.

  • Leighton, J. P., & Gierl, M. J. (2011). The learning sciences in educational assessment: The role of cognitive models. New York: Cambridge University Press.

  • Levy, R., & Mislevy, R. J. (2004). Specifying and refining a measurement model for a computer-based interactive assessment. International Journal of Testing, 4, 333–369.

  • Linn, R. L. (1986). Testing and assessment in education: Policy issues. American Psychologist, 41, 1153–1160.

  • Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading: Addison-Wesley.

  • McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah: Erlbaum.

  • McLachlan, G., & Peel, D. (2000). Finite mixture models. New York: Wiley.

  • Mislevy, R. J., & Riconscente, M. (2005). Evidence-centered assessment design: Layers, structures, and terminology (PADI Technical Rep. 9). Menlo Park: SRI International.

  • Mislevy, R. J., Steinberg, L. S., & Almond, R. G. (2003). On the structure of educational assessments. Measurement: Interdisciplinary Research and Perspectives, 1, 3–62.

  • Mislevy, R. J., Almond, R. G., & Lukas, J. F. (2004). A brief introduction to evidence-centered design (CSE Technical Rep. 632). Los Angeles: The National Center for Research on Evaluation, Standards, and Student Testing (CRESST), Center for the Study of Evaluation, UCLA.

  • Mok, M. M. C. (2010). Self-directed learning oriented assessment: Assessment that informs learning & empowers the learner. Hong Kong: Pace Publications Ltd.

  • Muthén, L. K., & Muthén, B. O. (1998–2010). Mplus (Version 6) [Computer software]. Los Angeles: Muthén & Muthén.

  • National Research Council. (2001). Knowing what students know: The science and design of educational assessment. Washington, DC: National Academy Press.

  • Nugent, R., Ayers, E., & Dean, N. (2009). Conditional subspace clustering of skill mastery: Identifying skills that separate students. In Proceedings from the 2nd international conference on educational data mining (pp. 101–110). Retrieved July 19, 2010, from www.educationaldatamining.org/EDM2009/

  • Nugent, R., Dean, N., & Ayers, E. (2010). Skill set profile clustering: The empty K-means algorithm with automatic specification of starting cluster centers. In Proceedings from the 3rd international conference on educational data mining (pp. 151–160). Retrieved July 19, 2010, from http://educationaldatamining.org/EDM2010/

  • O’Reilly, T. P., Sheehan, K. M., & Bauer, M. I. (2008, March). Cognitively based assessments of, for, and as learning: Bridging the gap between research and practice. Presented at the annual meeting of the American Educational Research Association, New York.

  • Reckase, M. (2009). Multidimensional item response theory. New York: Springer.

  • Roussos, L. A., DiBello, L. V., Stout, W., Hartz, S. M., Henson, R. A., & Templin, J. L. (2007). The fusion model skills diagnosis system. In J. Leighton & M. Gierl (Eds.), Cognitive diagnostic assessment for education: Theory and applications (pp. 275–318). Cambridge: Cambridge University Press.

  • Rupp, A., & Templin, J. (2008). The effects of Q-matrix misspecification on parameter estimates and classification accuracy in the DINA model. Educational and Psychological Measurement, 68, 78–96.

  • Rupp, A. A., Templin, J., & Henson, R. A. (2010). Diagnostic measurement: Theory, methods, and applications. New York: The Guilford Press.

  • Rupp, A. A., Levy, R., DiCerbo, K., Sweet, S., et al. (in press). Putting ECD into practice: The interplay of theory and data in evidence models within a digital learning environment. Journal of Educational Data Mining.

  • Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461–464.

  • Skrondal, A., & Rabe-Hesketh, S. (2004). Generalized latent variable modeling: Multilevel, longitudinal and structural equation models. Boca Raton: Chapman & Hall/CRC.

  • Steinley, D. (2006). K-means clustering: A half-century synthesis. British Journal of Mathematical and Statistical Psychology, 59, 1–34.

  • Tatsuoka, K. K. (1990). Toward an integration of item response theory and cognitive error diagnosis. In N. Frederiksen, R. Glaser, A. Lesgold, & M. G. Shafto (Eds.), Diagnostic monitoring of skill and knowledge acquisition (pp. 453–488). Hillsdale: Erlbaum.

  • Templin, J. L. (2004). Generalized linear mixed proficiency models. Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign.

  • Templin, J. L., & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11(3), 287–305.

  • Templin, J. L., Henson, R. A., & Douglas, J. (2011). General theory and estimation of cognitive diagnosis models: Using Mplus to derive model estimates.

  • Thadani, V., Stevens, R. H., & Tao, A. (2009). Measuring complex features of science instruction: Developing tools to investigate the link between teaching and learning. The Journal of the Learning Sciences, 18, 285–322.

  • Toulmin, S. E. (1958). The uses of argument. Cambridge: Cambridge University Press.

  • von Davier, M. (2005). A general diagnostic model applied to language testing data. (ETS Research Rep. No. RR-05–16). Princeton: Educational Testing Service.

  • von Davier, M. (2006). Multidimensional latent trait modelling (MDLTM) [Software program]. Princeton: Educational Testing Service.

  • von Davier, M. (2010). Hierarchical mixtures of diagnostic models. Psychological Test and Assessment Modeling, 52, 8–28.

  • West, P., Rutstein, D. W., Mislevy, R. J., Liu, J., Levy, R., DiCerbo, K. E., Crawford, A., Choi, Y., & Behrens, J. (2009, June). A Bayes net approach to modeling learning progressions and task performances. Paper presented at the Learning Progressions in Science (LeaPS) conference, Iowa City, IA.

  • Yen, W. M., & Fitzpatrick, A. R. (2006). Item response theory. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 111–153). Westport: American Council on Education.

Acknowledgements

We would like to thank Olga Kunina-Habenicht for giving us access to the data used for the example in the last section of this chapter. The work of Dr. Kunina-Habenicht, including the design, implementation, and analysis, was funded, in part, by grant number RU-424/3–1 from the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) in the Priority Research Program entitled “Models of Competencies.”

Author information

Corresponding author

Correspondence to André A. Rupp.

Copyright information

© 2012 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Choi, HJ., Rupp, A.A., Pan, M. (2012). Standardized Diagnostic Assessment Design and Analysis: Key Ideas from Modern Measurement Theory. In: Mok, M. (ed) Self-directed Learning Oriented Assessments in the Asia-Pacific. Education in the Asia-Pacific Region: Issues, Concerns and Prospects, vol 18. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-4507-0_4
