Quality & Quantity, Volume 51, Issue 3, pp 1167–1182

A multidimensional IRT approach for dimensionality assessment of standardised students’ tests in mathematics

  • Michela Gnaldi


Mathematics proficiency involves several content domains and processes at different levels, which means that mathematics ability is a complex latent variable. In standardised testing, the complex and unobserved latent constructs underlying a test are traditionally appraised by expert panels through subjective measures. In the present research, we address the dimensionality of the latent structure behind a test measuring the mathematics ability of Italian students from a statistical and objective point of view, within an item response theory (IRT) framework. The data refer to a national standardised test developed by the Italian National Institute for the Evaluation of the Education System (INVALSI), which also collected the data, and administered to lower secondary school students (grade 8). The model we apply belongs to a class of multidimensional latent class IRT models, which allows us to ascertain the test dimensionality through an explorative approach while concurrently accounting for non-constant item discrimination and a discrete latent variable formulation. Our results show that the latent abilities underlying the INVALSI test mirror the assessment objectives defined at the national level for the mathematics curriculum. We recommend the use of the proposed extended IRT models in the practice of test construction, primarily but not exclusively in the educational field, to support the meaningfulness of the inferences made from test scores about students’ abilities.
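To make the two model features named above more concrete (non-constant item discrimination and a discrete latent variable), the following is a minimal Python sketch of how a response probability could be computed under such a formulation. It is an illustration under stated assumptions, not the estimation procedure used in the study: it assumes binary items, a two-parameter logistic link, and that each item measures exactly one latent dimension. All function names, parameter names and numerical values are hypothetical, and neither parameter estimation nor the clustering of items into dimensions is shown.

```python
import math


def item_prob(theta_c, dim_j, gamma_j, beta_j):
    """Probability of a correct answer to one item, given a latent class.

    theta_c : ability levels of the latent class, one value per dimension
    dim_j   : index of the single dimension the item is assumed to measure
    gamma_j : item discrimination (allowed to differ across items)
    beta_j  : item difficulty
    """
    eta = gamma_j * (theta_c[dim_j] - beta_j)
    return 1.0 / (1.0 + math.exp(-eta))


def manifest_prob(responses, items, class_abilities, class_weights):
    """Marginal probability of a 0/1 response pattern.

    The latent variable is discrete, so the marginal is a finite weighted
    sum over latent classes rather than an integral over a continuous trait.
    """
    total = 0.0
    for theta_c, pi_c in zip(class_abilities, class_weights):
        likelihood = 1.0
        for y, (dim_j, gamma_j, beta_j) in zip(responses, items):
            p = item_prob(theta_c, dim_j, gamma_j, beta_j)
            likelihood *= p if y == 1 else 1.0 - p
        total += pi_c * likelihood
    return total


# Hypothetical toy set-up: three items, two dimensions, two latent classes.
# Each item is (dimension index, discrimination, difficulty).
items = [(0, 1.0, 0.0), (0, 1.5, 0.5), (1, 0.8, -0.3)]
class_abilities = [[-1.0, -0.5], [1.0, 0.8]]  # one ability vector per class
class_weights = [0.4, 0.6]                    # class proportions, sum to 1

pattern_prob = manifest_prob([1, 0, 1], items, class_abilities, class_weights)
```

In this sketch, assigning items to different values of `dim_j` is what encodes a particular dimensional structure. Comparing the fit of models built on alternative assignments is the kind of question a dimensionality assessment addresses.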


Standardised national students’ tests; INVALSI tests; Mathematics ability; Multidimensional latent class IRT models; Hierarchical clustering



Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  1. Department of Political Sciences, University of Perugia, Perugia, Italy
