Holland’s Advice for the Fourth Generation of Test Theory: Blood Tests Can Be Contests

Dorans, Neil J.

doi:10.1007/978-1-4419-9389-2_14

Conference paper
First Online: 01 January 2011

Part of the book series: Lecture Notes in Statistics (LNSP, volume 202)

Abstract

In The First Four Generations of Test Theory, Holland (2008) notes that testing as a scientific enterprise is no more than 120 years old. Holland divides this enterprise into four overlapping generations. The first generation, which was influenced by concepts such as error of measurement and correlation that were developed in other fields, focused on test scores and saw developments in the areas of reliability, classical test theory, generalizability theory, and validity. This generation began in the early twentieth century and continues today, but most of its major developments were achieved by 1970. The second generation, which focused on models for item-level data, began in the 1940s and peaked in the 1970s but continues into the present as well. The third generation started in the 1970s and continues today. It is characterized by the application of statistical ideas and sophisticated computational methods to item-level models, as well as to models of sets of items.

References

Alderman, D. L., & Holland, P. W. (1981). Item performance across native language groups on the Test of English as a Foreign Language (ETS Research Rep. No. RR-81-16). Princeton, NJ: ETS.
Cattell, J. M. (1890). Mental tests and measurements. Mind, 15, 373–381.
College Board. (2005). 2005 college bound seniors: Total group profile report. New York, NY: Author.
Dorans, N. J. (1982). Technical review of item fairness studies: 1975–1979 (ETS Statistical Rep. No. SR-82-90). Princeton, NJ: ETS.
Dorans, N. J. (2002). Recentering the SAT score distributions: How and why. Journal of Educational Measurement, 39(1), 59–84.
Dorans, N. J., & Holland, P. W. (1993). DIF detection and description: Mantel-Haenszel and standardization. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 35–66). Hillsdale, NJ: Lawrence Erlbaum Associates.
Dorans, N. J., & Holland, P. W. (2000). Population invariance and the equatability of tests: Basic theory and the linear case. Journal of Educational Measurement, 37(4), 281–306.
Dorans, N. J., & Kulick, E. (1983). Assessing unexpected differential item performance of female candidates on SAT and TSWE forms administered in December 1977: An application of the standardization approach (ETS Research Rep. No. RR-83-09). Princeton, NJ: ETS.
Dorans, N. J., & Kulick, E. (1986). Demonstrating the utility of the standardization approach to assessing unexpected differential item performance on the Scholastic Aptitude Test. Journal of Educational Measurement, 23, 355–368.
Dorans, N. J., & Kulick, E. (2006). Differential item functioning on the MMSE: An application of the Mantel-Haenszel and standardization procedures. Medical Care, 44 S3, S107–S114.
Dorans, N. J., Lyu, C. F., Pommerich, M., & Houston, M. (1997). Concordance between ACT assessment and recentered SAT I sum scores. College and University, 73, 24–34.
Dorans, N. J., Pommerich, M., & Holland, P. W. (Eds.). (2007). Linking and aligning scores and scales. New York, NY: Springer.
Feuer, M. J., Holland, P. W., Green, B. F., Bertenthal, M. W., & Hemphill, F. C. (Eds.). (1999). Uncommon measures: Equivalence and linkage among educational tests (Report of the Committee on Equivalency and Linkage of Educational Tests, National Research Council). Washington, DC: National Academy Press.
Gulliksen, H. (1950). Theory of mental tests. New York, NY: Wiley.
Hackett, R. K., Holland, P. W., Pearlman, M., & Thayer, D. T. (1987). Test construction manipulating score differences between Black and White examinees: Properties of the resulting tests (ETS Research Rep. No. RR-87-30). Princeton, NJ: ETS.
Holland, P. W. (1994). Measurements or contests? Comments on Zwick, Bond and Allen/Donoghue. Proceedings of the Social Statistics Section of the American Statistical Association, 1994, 27–29.
Holland, P. W. (2008, March). The first four generations of test theory. Paper presented at the Association of Test Publishers on Innovations in Testing, Dallas, TX.
Holland, P. W., & Dorans, N. J. (2006). Linking and equating. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 187–220). Westport, CT: American Council on Education/Praeger.
Holland, P. W., & Hoskens, M. (2003). Classical test theory as a first-order item response theory: Application to true-score prediction from a possibly nonparallel test. Psychometrika, 68, 123–149.
Holland, P. W., & Rubin, D. B. (Eds.). (1982). Test equating. New York, NY: Academic Press.
Holland, P. W., & Wainer, H. (Eds.). (1993). Differential item functioning. Hillsdale, NJ: Lawrence Erlbaum Associates.
Kelley, T. L. (1927). Interpretation of educational measurements. New York, NY: World Book.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Meredith, W., & Millsap, R. E. (1992). On the misuse of manifest variables in the detection of measurement bias. Psychometrika, 57(2), 289–311.
Mosteller, F., & Tukey, J. W. (1977). Data analysis and regression: A second course in statistics. Reading, MA: Addison-Wesley.
Schmitt, A. P., Holland, P. W., & Dorans, N. J. (1993). Evaluating hypotheses about differential item functioning. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 281–315). Hillsdale, NJ: Lawrence Erlbaum Associates.
Shealy, R. T., & Stout, W. F. (1993). A model based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika, 58, 197–239.
Tucker, L. R. (1971). Relations of factor score estimates to their use. Psychometrika, 36(4), 427–436.
von Davier, A. A., Holland, P. W., & Thayer, D. T. (2004). The kernel method of test equating. New York, NY: Springer.
Wainer, H. (2007). The world’s most dangerous equation. American Scientist, 95, 249–256.

Acknowledgements

The author thanks Paul Holland for being the mentor, colleague, and friend who had the most impact on his career. Tim Moses provided valuable advice. Any opinions expressed here are those of the author and not necessarily of Educational Testing Service.

Author information

Authors and Affiliations

Educational Testing Service, Rosedale Road, Princeton, NJ, 08541, USA
Neil J. Dorans

Corresponding author

Correspondence to Neil J. Dorans.

Editor information

Editors and Affiliations

Research and Development, Educational Testing Service, MS 12T, Rosedale Road, Princeton, 08541, New Jersey, USA
Neil J. Dorans
Research and Development, Educational Testing Service, MS 12T, Rosedale Road, Princeton, 08541, New Jersey, USA
Sandip Sinharay

Cite this paper

Dorans, N.J. (2011). Holland’s Advice for the Fourth Generation of Test Theory: Blood Tests Can Be Contests. In: Dorans, N., Sinharay, S. (eds) Looking Back. Lecture Notes in Statistics, vol 202. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9389-2_14

DOI: https://doi.org/10.1007/978-1-4419-9389-2_14
Published: 02 June 2011
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-9388-5
Online ISBN: 978-1-4419-9389-2
eBook Packages: Mathematics and Statistics (R0)
