Exploratory Data Analysis for Cognitive Diagnosis: Stochastic Co-blockmodel and Spectral Co-clustering

  • Yunxiao Chen
  • Xiaoou Li
Chapter
Part of the Methodology of Educational Measurement and Assessment book series (MEMA)

Abstract

Exploratory data analysis (EDA) is an essential stage in statistical analysis that extracts information from data to assist confirmatory statistical modeling. Diagnostic classification models (DCMs) are a confirmatory approach to cognitive diagnosis, for which EDA tools need to be developed to assist the design of DCM-based tests. In this chapter, we propose a stochastic co-blockmodel that approximates the structure of many DCMs and an efficient spectral co-clustering algorithm for fitting the model. The proposed approach explores the structure of assessment data by clustering students and items into latent classes and analyzing the relationship between the student classes and the item classes. The performance of the proposed algorithms is evaluated through simulation studies. A real data example is provided to illustrate the use of the proposed method.
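To make the idea of clustering students and items jointly more concrete, the following is a minimal sketch of a generic spectral co-clustering procedure applied to a binary response matrix. It is not the chapter's algorithm: it assumes a plain truncated SVD followed by k-means on the left and right singular vectors, and the data come from a toy two-by-two block model. All names (`kmeans`, `spectral_cocluster`, `simulate`), the block probabilities, and the sample sizes are hypothetical choices made for illustration only.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means with farthest-point initialisation; returns labels."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        # Squared distance of every point to its nearest chosen center.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def spectral_cocluster(Y, k_students, k_items, seed=0):
    """Cluster rows (students) and columns (items) of Y via its leading singular vectors.

    Assumption: no normalisation or regularisation of Y is applied here.
    """
    r = min(k_students, k_items)
    U, s, Vt = np.linalg.svd(Y.astype(float), full_matrices=False)
    student_labels = kmeans(U[:, :r] * s[:r], k_students, seed=seed)
    item_labels = kmeans(Vt[:r].T * s[:r], k_items, seed=seed)
    return student_labels, item_labels


def simulate(n_students=300, n_items=60, seed=1):
    """Toy co-blockmodel: P(correct) depends only on (student class, item class)."""
    rng = np.random.default_rng(seed)
    B = np.array([[0.9, 0.2],
                  [0.2, 0.9]])              # hypothetical block probabilities
    z = rng.integers(0, 2, n_students)      # latent student classes
    w = rng.integers(0, 2, n_items)         # latent item classes
    P = B[z][:, w]                          # n_students x n_items success probabilities
    Y = (rng.random(P.shape) < P).astype(int)
    return Y, z, w


Y, z, w = simulate()
student_labels, item_labels = spectral_cocluster(Y, k_students=2, k_items=2)

# Estimated block means: proportion correct for each (student class, item class) pair.
B_hat = np.array([[Y[np.ix_(student_labels == a, item_labels == b)].mean()
                   for b in range(2)] for a in range(2)])
print(B_hat)
```

The matrix of block means printed at the end is one simple way to inspect the relationship between the recovered student classes and item classes. Cluster labels are only identified up to permutation, so the rows and columns of `B_hat` may appear in a different order from the generating matrix.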

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. London School of Economics and Political Science, London, UK
  2. School of Statistics, University of Minnesota, Minneapolis, USA
