Abstract
In certain educational assessments, there are many administrations of test forms of the same assessment over a specific period. The issue of equating these test forms from long series of test administrations is complicated, because the statistical properties of the items and the student populations can be volatile. This study demonstrates the possibilities of time series modeling for monitoring item and population characteristics in the context of test equating. More specifically, seasonal effects, trends, sudden breaks (or jumps), and outliers in population means and item parameters are studied in a series of simulations by making use of the frameworks of item response theory (IRT) and state space modeling. Three different state space models are used: the local level model, the linear trend model, and a seasonal model with linear trend. The goal is to capture peculiarities in the data in real time, so that, if necessary, immediate action can be taken. Preliminary results of the simulations indicate that many of the effects, as well as combinations, are well captured with the developed approach.
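To make the simplest of the three models concrete, the following is a minimal sketch of a Kalman filter for the local level model, in which an observed series y_t equals a slowly drifting level plus noise and the level follows a random walk. It is an illustration under stated assumptions, not the authors' implementation: the function name `local_level_filter`, the series `pop_means`, the fixed variances, and the rule of flagging a standardized one-step-ahead prediction error larger than 3 are all hypothetical choices made for this example.

```python
import math
import random


def local_level_filter(y, var_obs, var_level, a0=0.0, p0=1e6, z_crit=3.0):
    """Kalman filter for the local level model (variances assumed known).

    y_t      = mu_t + eps_t,   eps_t ~ N(0, var_obs)
    mu_{t+1} = mu_t + eta_t,   eta_t ~ N(0, var_level)

    Returns the filtered level per time point and a flag that is True when
    the standardized one-step-ahead prediction error exceeds z_crit.
    """
    a, p = a0, p0  # predicted level and its variance (large p0: vague prior)
    filtered, flags = [], []
    for obs in y:
        v = obs - a              # one-step-ahead prediction error
        f = p + var_obs          # variance of the prediction error
        z = v / math.sqrt(f)     # standardized prediction error
        k = p / f                # Kalman gain
        a = a + k * v            # filtered level (also the next prediction)
        # Filtered variance p * (1 - k), written as p * var_obs / f for
        # numerical stability, plus the level noise for the next step.
        p = p * var_obs / f + var_level
        filtered.append(a)
        flags.append(abs(z) > z_crit)
    return filtered, flags


# Hypothetical series of population means from 40 administrations,
# stable around 0.5 with one aberrant administration at index 25.
rng = random.Random(0)
pop_means = [0.5 + rng.gauss(0.0, 0.05) for _ in range(40)]
pop_means[25] += 1.0

levels, suspicious = local_level_filter(pop_means, var_obs=0.05 ** 2, var_level=1e-4)
flagged = [t for t, is_flagged in enumerate(suspicious) if is_flagged]
```

Because each flag depends only on data up to the current administration, a check of this kind can run as soon as a new administration is scored, which is the real-time use the abstract describes. In practice the variances would be estimated rather than fixed, and the linear trend and seasonal models would add further state components.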
Acknowledgments
The authors would like to thank Frank Rijmen and Lili Yao for helpful comments on an earlier draft of the manuscript. In addition, we are obliged to Kim Fryer for editorial assistance.
© 2013 Springer Science+Business Media New York
About this paper
Cite this paper
Wanjohi, R.G., van Rijn, P.W., von Davier, A.A. (2013). A State Space Approach to Modeling IRT and Population Parameters from a Long Series of Test Administrations. In: Millsap, R.E., van der Ark, L.A., Bolt, D.M., Woods, C.M. (eds) New Developments in Quantitative Psychology. Springer Proceedings in Mathematics & Statistics, vol 66. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-9348-8_8
DOI: https://doi.org/10.1007/978-1-4614-9348-8_8
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-9347-1
Online ISBN: 978-1-4614-9348-8
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)