
Utilizing Process Data for Cognitive Diagnosis

  • Hong Jiao
  • Dandan Liao
  • Peida Zhan
Chapter
Part of the Methodology of Educational Measurement and Assessment book series (MEMA)

Abstract

Process data, different from item responses, essentially capture the interactions of test-takers with the item presentation (including stems and options), technology-enhanced help features, and the computer interface. With process data available in addition to product data, auxiliary information from the response process can be utilized to serve different assessment purposes, such as enhancing the accuracy of ability estimation, facilitating cognitive diagnosis, and detecting aberrant responding behavior. Response time (RT) is the most frequently studied type of process data contained in log files in current psychometric modeling, although other process data are available, such as the number of clicks, the frequency of use of help features, the frequency of answer changes, and data collected with eye-tracking devices. Process data are worthy of exploration, and their integration with product data can strengthen the evidence base for assessment purposes. This chapter focuses on the use of RT, one important type of process data, in cognitive diagnostic modeling.
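To make the modeling idea concrete, the following is a minimal sketch of how responses and RTs can be modeled jointly for cognitive diagnosis; it is an illustrative assumption rather than the specific model developed in this chapter. It combines a DINA measurement model for the response \(X_{ij}\) of person \(i\) to item \(j\) (with guessing parameter \(g_j\), slipping parameter \(s_j\), and Q-matrix entries \(q_{jk}\)), a lognormal model for the RT \(T_{ij}\) (with time intensity \(\beta_j\) and person speed \(\tau_i\)), and a higher-order link in which a general ability \(\theta_i\) governs the attributes \(\alpha_{ik}\) and is allowed to correlate with speed, in the spirit of van der Linden's hierarchical speed-accuracy framework and of joint response-time cognitive diagnosis models.

$$
\eta_{ij}=\prod_{k=1}^{K}\alpha_{ik}^{\,q_{jk}}, \qquad
P(X_{ij}=1\mid\boldsymbol{\alpha}_i)=g_j^{\,1-\eta_{ij}}(1-s_j)^{\eta_{ij}},
$$
$$
\log T_{ij}\sim N\!\bigl(\beta_j-\tau_i,\ \sigma_{\varepsilon j}^{2}\bigr), \qquad
P(\alpha_{ik}=1\mid\theta_i)=\frac{\exp(\lambda_{0k}+\lambda_{1k}\theta_i)}{1+\exp(\lambda_{0k}+\lambda_{1k}\theta_i)},
$$
$$
(\theta_i,\tau_i)^{\top}\sim N_2(\mathbf{0},\boldsymbol{\Sigma}).
$$

Under this sketch, the correlation between \(\theta_i\) and \(\tau_i\) in \(\boldsymbol{\Sigma}\) is the channel through which RTs carry collateral information about the attribute profile, so that incorporating \(T_{ij}\) can sharpen the classification of \(\boldsymbol{\alpha}_i\) relative to using responses alone.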


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Human Development and Quantitative Methodology, University of Maryland, College Park, USA
  2. American Institutes for Research, Washington, DC, USA
  3. Department of Psychology, College of Teacher Education, Zhejiang Normal University, Zhejiang, China
