Abstract
Educational assessment in the Western world has a long but very irregular history. Two distinct threads are woven together: the first is the variety of settings in which testing itself came to have practical use, while the second is the incorporation of increasingly rigorous methods for making sense of the results of that testing. This chapter sets out some of the key developments in each of these two areas, from their origins until the dawn of contemporary psychometrics. For extended periods, even the simplest improvements in either testing or statistics had to fight long and hard against tradition and inertia. It took many generations for the two threads to finally merge into a full-fledged science of educational measurement.
Copyright information
© 1987 Kluwer Academic Publishers
About this chapter
Cite this chapter
McArthur, D.L. (1987). Educational Assessment: A Brief History. In: McArthur, D.L. (eds) Alternative Approaches to the Assessment of Achievement. Evaluation in Education and Human Services, vol 16. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-3257-9_1
DOI: https://doi.org/10.1007/978-94-009-3257-9_1
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-010-7961-7
Online ISBN: 978-94-009-3257-9
eBook Packages: Springer Book Archive