
Abstract

The dimensionality of a test is the number of latent variables that are measured by the test. An essentially unidimensional test measures predominantly one latent variable, and a multidimensional test measures more than one latent variable. Three types of multidimensionality are described. First, a simple structure: the test measures two or more latent variables and falls apart into two or more essentially unidimensional subtests. Second, a complex structure: each of the items measures the same two or more latent variables. Third, a bi-factor structure: each of the items measures the same general latent variable, while subgroups of items also measure specific latent variables. The interpretation of study results is straightforward if the dependent variable is a simple-structure test, but difficult if it is a complex-structure test. Factor Analysis (FA) and Principal Component Analysis (PCA) of inter-item product-moment correlations (pmcs) and the reliability are often applied to assess the dimensionality of a test. FA and PCA of inter-item pmcs fail, especially if the number of answer categories of the items is small, and high reliability does not guarantee that the test is essentially unidimensional. Appropriate methods to assess test dimensionality are FA of inter-item tetrachoric (dichotomous items) and polychoric (more than two ordered answer categories) correlations, Mokken scale analysis, and full-information FA. The factor-analytic methods make stronger assumptions than Mokken's method, but Mokken's method cannot assess the type of multidimensionality. Measurement invariance of an item with respect to a variable (e.g., E- and C-condition membership) means that the same item response model applies to all values of that variable (e.g., the same model in the E- and C-conditions).
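The claim that high reliability does not guarantee essential unidimensionality (cf. Green, Lissitz, & Mulaik, 1977) can be illustrated with a minimal sketch. The data below are simulated and purely hypothetical: twelve continuous item scores are generated from two uncorrelated latent variables (a clear simple structure with two dimensions), yet Cronbach's alpha comes out high.

```python
# Sketch with simulated (hypothetical) data: a clearly two-dimensional test
# can still have high Cronbach's alpha, so reliability is not evidence of
# essential unidimensionality.
import numpy as np

rng = np.random.default_rng(0)
n, k = 5000, 12                      # respondents, items (6 per factor)
theta = rng.normal(size=(n, 2))      # two uncorrelated latent variables
loadings = np.zeros((k, 2))
loadings[:6, 0] = 0.8                # items 1-6 load on factor 1 only
loadings[6:, 1] = 0.8                # items 7-12 load on factor 2 only
errors = rng.normal(scale=0.6, size=(n, k))
X = theta @ loadings.T + errors      # item scores with simple structure

def cronbach_alpha(scores):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

alpha = cronbach_alpha(X)
print(f"alpha = {alpha:.2f}")        # high, despite two latent variables
```

The high alpha arises simply because each item correlates with the six items of its own cluster; alpha summarizes inter-item covariation, not the number of latent variables behind it.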
Test scores should be measurement invariant to interpret study results; for example, they should be measurement invariant with respect to condition membership to compare E- and C-condition test score means.
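A violation of measurement invariance can be made concrete with a simulated (hypothetical) example in the spirit of matched-group DIF checks. Below, ten dichotomous items follow the same Rasch model in the E- and C-conditions except for item 0, which is one logit harder in the C-condition at the same latent value. Comparing item-0 pass rates of E- and C-respondents matched on their rest score (total score excluding item 0) exposes the non-invariant item; all numbers here are simulated, not from the chapter.

```python
# Sketch with simulated (hypothetical) data: item 0 violates measurement
# invariance with respect to condition membership (uniform DIF of 1 logit),
# visible as an E-C pass-rate gap among respondents matched on rest score.
import numpy as np

rng = np.random.default_rng(1)
n = 20000
group = rng.integers(0, 2, size=n)          # 0 = E-condition, 1 = C-condition
theta = rng.normal(size=n)                  # same latent distribution in both

def rasch_p(theta, b):
    """Rasch probability of a correct/positive response."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

k = 10
b = np.linspace(-1, 1, k)                   # item difficulties
p = rasch_p(theta[:, None], b[None, :])
p[:, 0] = rasch_p(theta, b[0] + group)      # item 0: 1 logit harder for C
X = (rng.random((n, k)) < p).astype(int)    # dichotomous item scores

rest = X[:, 1:].sum(axis=1)                 # matching variable: rest score
gaps = []
for r in range(3, k - 2):                   # skip sparse extreme strata
    e = X[(rest == r) & (group == 0), 0].mean()
    c = X[(rest == r) & (group == 1), 0].mean()
    gaps.append(e - c)
print("mean E-C gap on the non-invariant item:", round(float(np.mean(gaps)), 2))
```

If item 0 were measurement invariant, the gaps within matched strata would hover around zero; the systematic positive gap is the signature of item bias (cf. Mellenbergh, 1989; Swaminathan & Rogers, 1990, who formalize this idea with logistic regression).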


References

  • Bock, R. D., Gibbons, R., & Muraki, E. (1988). Full-information item factor analysis. Applied Psychological Measurement, 12, 261–280.
  • Bock, R. D., Gibbons, R., Schilling, S. G., Muraki, E., Wilson, D., & Wood, R. (2000). TESTFACT 3.0: Test scoring, item statistics, and full-information item factor analysis. Chicago, IL: Scientific Software.
  • Boomsma, A. (1993). On the robustness of LISREL (maximum likelihood estimation) against small sample size and non-normality. Unpublished doctoral dissertation, University of Groningen, The Netherlands.
  • Camilli, G. (2006). Test fairness. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 221–255). Westport, CT: Praeger Publishers.
  • Camilli, G., Prowker, A., Dossey, J. A., Lindquist, M. M., Chiu, T.-W., Vargas, S., et al. (2008). Summarizing item difficulty variation with parcel scores. Journal of Educational Measurement, 45, 363–389.
  • Carroll, J. B. (1945). The effect of difficulty and chance success on correlations between items or between tests. Psychometrika, 10, 1–19.
  • Dolan, C. V. (1994). Factor analysis of variables with 2, 3, 5, and 7 response categories: A comparison of categorical variable estimators using simulated data. British Journal of Mathematical and Statistical Psychology, 49, 309–326.
  • Ettema, T. P. (2007). The construction of a dementia-specific Quality of Life instrument rated by professional caregivers in residential settings: The QUALIDEM. Unpublished doctoral dissertation, Free University at Amsterdam, The Netherlands.
  • Ettema, T. P., Dröes, R. M., de Lange, J., Mellenbergh, G. J., & Ribbe, M. W. (2007). QUALIDEM: Development and evaluation of a dementia specific Quality of Life instrument. Scalability, reliability, and internal structure. International Journal of Geriatric Psychiatry, 22, 549–556.
  • Green, S. B., Lissitz, R. W., & Mulaik, S. A. (1977). Limitations of coefficient alpha as an index of test unidimensionality. Educational and Psychological Measurement, 37, 827–838.
  • Hoogland, J. J., & Boomsma, A. (1998). Robustness studies in covariance structure modeling. Sociological Methods & Research, 26, 329–367.
  • Houtkoop, B. L., & Plak, S. (2015). Nonparametric item response theory for dichotomous item scores. In H. J. Adèr & G. J. Mellenbergh (Eds.), Advising on research methods: Selected topics 2015 (pp. 51–67). Huizen, The Netherlands: van Kessel.
  • Jöreskog, K. G., & Sörbom, D. (2001). PRELIS: A program for multivariate data screening and data summarization. Chicago, IL: Scientific Software.
  • McLeod, L. D., Swygert, K. A., & Thissen, D. (2001). Factor analysis for items scored in two categories. In D. Thissen & H. Wainer (Eds.), Test scoring (pp. 189–216). Mahwah, NJ: Erlbaum.
  • Mellenbergh, G. J. (1989). Item bias and item response theory. International Journal of Educational Research, 13, 127–143.
  • Mellenbergh, G. J. (2011). A conceptual introduction to psychometrics: Development, analysis, and application of psychological and educational tests. The Hague, The Netherlands: Eleven International Publishing.
  • Millsap, R. E. (2011). Statistical approaches to measurement invariance. New York, NY: Routledge.
  • Mokken, R. J. (1971). A theory and procedure of scale analysis with applications in political research. Berlin, Germany: De Gruyter.
  • Mokken, R. J. (1997). Nonparametric models for dichotomous responses. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 351–367). New York, NY: Springer.
  • Molenaar, I. W. (1991). A weighted Loevinger H-coefficient extending Mokken scaling to multicategory items. Kwantitatieve Methoden, 12, 97–117.
  • Molenaar, I. W. (1997). Nonparametric models for polytomous responses. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 369–380). New York, NY: Springer.
  • Molenaar, I. W., & Sijtsma, K. (2000). User’s manual MSP5 for Windows. Groningen, The Netherlands: iec ProGAMMA.
  • Muraki, E. (1993). POLYFACT [Computer program]. Princeton, NJ: Educational Testing Service.
  • Muraki, E., & Carlson, J. E. (1995). Full-information factor analysis for polytomous item responses. Applied Psychological Measurement, 19, 73–90.
  • Muthén, L. K., & Muthén, B. O. (1998–2015). Mplus user’s guide (Version 7.11). Los Angeles, CA: Author.
  • Oort, F. J. (1993). Theory of violators: Assessing unidimensionality of psychological measures. In R. Steyer, K. F. Wender, & K. F. Widaman (Eds.), Psychometric methodology, Proceedings of the 7th European Meeting of the Psychometric Society in Trier (pp. 377–381). Stuttgart, Germany: Fischer.
  • Shealy, R., & Stout, W. (1993). A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika, 58, 159–194.
  • Sijtsma, K. (2009). On the use, misuse, and the very limited usefulness of Cronbach’s alpha. Psychometrika, 74, 107–120.
  • Sijtsma, K., & Molenaar, I. W. (2002). Introduction to nonparametric item response theory. Thousand Oaks, CA: Sage.
  • Stout, W. F. (1990). A new item response theory modeling approach with applications to unidimensionality assessment and ability estimation. Psychometrika, 55, 293–325.
  • Swaminathan, H., & Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361–370.
  • Swygert, K. A., McLeod, L. D., & Thissen, D. (2001). Factor analysis for items or testlets scored in more than two categories. In D. Thissen & H. Wainer (Eds.), Test scoring (pp. 217–250). Mahwah, NJ: Erlbaum.
  • Thissen, D. (2001). IRTLRDIF [Computer program]. University of North Carolina at Chapel Hill: L. L. Thurstone Psychometric Laboratory.
  • Thissen, D., Steinberg, L., & Gerrard, M. (1986). Beyond group mean differences: The concept of item bias. Psychological Bulletin, 99, 118–128.
  • Wicherts, J. M., Dolan, C. V., & Hessen, D. J. (2005). Stereotype threat and group differences in test performance: A question of measurement invariance. Journal of Personality and Social Psychology, 89, 696–716.
  • Wicherts, J. M., Dolan, C. V., Hessen, D. J., Oosterveld, P., van Baal, G. C. M., Boomsma, D. L., et al. (2004). Are intelligence tests measurement invariant over time? Investigating the nature of the Flynn effect. Intelligence, 32, 509–537.


Author information


Corresponding author

Correspondence to Gideon J. Mellenbergh.


Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Mellenbergh, G.J. (2019). Test Dimensionality. In: Counteracting Methodological Errors in Behavioral Research. Springer, Cham. https://doi.org/10.1007/978-3-030-12272-0_10

