Abstract
This paper describes a procedure for constructing valid and fair fixed-length tests whose items are drawn at random from an item bank. The procedure provides guidelines for setting up a typical achievement test, covering the number of items in the bank and the number of items available for each position in a test. It also proposes a way to calculate the relative difficulty of each individual test and to correct a student's obtained score based on the mean difficulty across all students and the difficulty of that student's particular test. Two further procedures are proposed for calculating the reliability of tests with randomly drawn items. They rely on specific interpretations of commonly used methods for calculating Cronbach’s alpha and KR20, and of the Spearman-Brown prediction formula. A simulation in R illustrates the accuracy of the calculation procedures and their effects on pass-fail decisions.
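For readers unfamiliar with the two reliability formulas named in the abstract, the following is a minimal sketch of their textbook forms, assuming a complete matrix of dichotomous (0/1) item scores in which every student answered the same items. It is not the authors' adapted procedure for randomly drawn items, which the abstract does not spell out. The function names `kr20` and `spearman_brown` are illustrative, and Python is used here only for convenience; the paper's own simulation is in R.

```python
from statistics import pvariance


def kr20(responses):
    """Textbook KR-20 for a complete 0/1 score matrix.

    `responses` is a list of rows, one row per student, one column per item.
    Assumption: every student answered the same items (no sparse data).
    """
    n_students = len(responses)
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("KR-20 needs at least two items")

    # Proportion of students answering each item correctly.
    p = [sum(row[j] for row in responses) / n_students for j in range(n_items)]

    # Sum of item variances p * (1 - p).
    item_var_sum = sum(pj * (1 - pj) for pj in p)

    # Population variance of the students' total scores.
    total_var = pvariance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return n_items / (n_items - 1) * (1 - item_var_sum / total_var)


def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by `length_factor`."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)
```

For example, `spearman_brown(kr20(scores), 2)` would estimate the reliability of a test twice as long as the one that produced `scores`, under the usual assumption that the added items are parallel to the existing ones.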
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Draaijer, S., Klinkenberg, S. (2015). A Practical Procedure for the Construction and Reliability Analysis of Fixed-Length Tests with Randomly Drawn Test Items. In: Ras, E., Joosten-ten Brinke, D. (eds) Computer Assisted Assessment. Research into E-Assessment. TEA 2015. Communications in Computer and Information Science, vol 571. Springer, Cham. https://doi.org/10.1007/978-3-319-27704-2_6
DOI: https://doi.org/10.1007/978-3-319-27704-2_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27703-5
Online ISBN: 978-3-319-27704-2
eBook Packages: Computer Science, Computer Science (R0)