
Testing Equality of Functions Across Multiple Experimental Conditions for Different Ability Levels in the IRT Context: The Case of the IPRASE TLT 2016 Survey

  • Fabrizio Maturo
  • Francesca Fortuna
  • Tonio Di Battista

Abstract

In education research, test data are commonly analyzed with item response theory (IRT) models. In this context, item characteristic curves (ICCs) and item information curves (IICs) play a key role. Practitioners often want to know whether certain factors significantly influence the probability of answering items correctly. In the literature, this problem has been addressed with the standard analysis of variance (ANOVA) model applied to total scores or to the proportion of correct responses. However, this method relies on strong assumptions and ignores information typical of IRT, such as the shapes of the ICCs and IICs, which provide insights at different ability levels. To overcome these issues, this research proposes the functional analysis of variance (FANOVA) approach and a novel functional tool in the IRT context. The main advantages of this approach are that it is distribution-free and that it allows us to check the degree of consistency with the hypothesis of equality among mean curves at different ability levels. Specifically, the proposed method is applied to ICCs and IICs to improve on existing techniques in educational studies. A real dataset drawn from the IPRASE Trentino Language Testing Survey 2016 is considered. The ultimate purpose of this study is to provide additional tools for scholars and practitioners in defining specific educational plans.
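As a rough illustration of the idea, the sketch below evaluates ICCs and IICs on a grid of ability values and compares group mean curves with a pointwise permutation test. It is not the authors' procedure: the abstract does not state the IRT model or the test statistic, so the two-parameter logistic (2PL) form, the pointwise F statistic, the function names, and the simulated item parameters are all assumptions made for illustration.

```python
import numpy as np

def icc_2pl(theta, a, b):
    """ICC under an assumed 2PL model: P(theta) = 1 / (1 + exp(-a (theta - b)))."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def iic_2pl(theta, a, b):
    """Item information under the 2PL model: a^2 * P * (1 - P)."""
    p = icc_2pl(theta, a, b)
    return a ** 2 * p * (1.0 - p)

def pointwise_f(curves, groups):
    """One-way ANOVA F statistic at each grid point.

    curves: array (n_curves, n_grid); groups: array (n_curves,) of labels.
    """
    labels = np.unique(groups)
    n, k = curves.shape[0], labels.size
    grand = curves.mean(axis=0)
    ss_between = np.zeros(curves.shape[1])
    ss_within = np.zeros(curves.shape[1])
    for g in labels:
        block = curves[groups == g]
        mean_g = block.mean(axis=0)
        ss_between += block.shape[0] * (mean_g - grand) ** 2
        ss_within += ((block - mean_g) ** 2).sum(axis=0)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def permutation_pvalues(curves, groups, n_perm=500, seed=0):
    """Distribution-free p-value at each ability level, obtained by shuffling labels."""
    rng = np.random.default_rng(seed)
    observed = pointwise_f(curves, groups)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        exceed += pointwise_f(curves, rng.permutation(groups)) >= observed
    return (exceed + 1) / (n_perm + 1)

# Simulated (hypothetical) items: two conditions that differ in difficulty.
theta = np.linspace(-4, 4, 81)
rng = np.random.default_rng(1)
groups = np.repeat([0, 1], 10)
a = rng.uniform(0.8, 1.5, size=20)
b = np.where(groups == 0, -1.0, 1.0) + rng.normal(0, 0.3, size=20)
curves = icc_2pl(theta[None, :], a[:, None], b[:, None])
pvals = permutation_pvalues(curves, groups)
```

Here `pvals` holds one p-value per ability level, so it can be plotted against `theta` to see where along the ability scale the group mean curves differ. Passing IICs built with `iic_2pl` instead of ICCs gives the analogous comparison for item information.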

Keywords

IRT, ICC, IIC, FANOVA, P-Statistic

Copyright information

© Springer Science+Business Media B.V., part of Springer Nature 2018

Authors and Affiliations

  • Fabrizio Maturo (1)
  • Francesca Fortuna (2)
  • Tonio Di Battista (2)
  1. Department of Management and Business Administration, “G. d’Annunzio” University, Pescara, Italy
  2. DISFPEQ, “G. d’Annunzio” University, Pescara, Italy
