Validity-Versus-Reliability Tradeoffs and the Ethics of Educational Research

Educational Research: Ethics, Social Justice, and Funding Dynamics

Part of the book series: Educational Research (EDRE, volume 10)

Abstract

In educational research that calls itself empirical, the relationship between validity and reliability is one of tradeoff: the stronger the bases for validity, the weaker the bases for reliability (and vice versa). Validity and reliability are widely regarded as basic criteria for evaluating research; however, the tradeoff between the two has ethical implications. The paper traces a brief history of the concepts and then describes four ethical issues associated with the validity-reliability tradeoff in educational research: bootstrapping, stereotyping, dehumanization, and determinism. The paper closes by describing emerging trends in social science research that have the potential to displace the validity-reliability tradeoff as a central concern for the evaluation of educational research: the introduction of translational sciences, a shift from significance to replicability, a move from inference to Big Data, and the increasing importance of consequential validity.

Notes

  1. There is some conceptual fuzziness in this paper between educational research and educational testing. For purposes of this paper, I think the distinction is not very important; much empirical educational research is conducted on the basis of educational test results, and testing instruments constitute the data-collection instruments of much empirical research in education. The validity-reliability tradeoff pertains in empirical educational research whether or not tests are involved.

  2. Thanks to Jeff Bale for pointing this out.

  3. I don’t know why scare quotes appear around the term “truth value” but not around the other terms on the list.

  4. I have never understood how research methods or findings could be extrapolated from animals to humans. I just don’t get how it could have occurred to researchers (such as Thorndike) to imagine that findings from experiments on lab rats could be applied to teaching and learning for the people in Teachers College. But we humans can be taught, and apparently we have learned to behave like rats when we are treated as such.

  5. The other purposes specified by Biesta (2010) are qualification and socialization. Biesta uses the term subjectification very differently from the way Foucault uses it.

References

  • American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1985). Standards for educational and psychological testing. Washington, DC: American Psychological Association.

  • Baker, E. L. (2013). The chimera of validity. Teachers College Record, 115(9), 1–26. http://www.tcrecord.org. ID Number: 17106. Accessed 23 Apr 2015.

  • Biesta, G. (2010). Good education in an age of measurement: Ethics, politics, democracy. Boulder: Paradigm Publishers.

  • Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.

  • Castel, R. (1991). From dangerousness to risk. In G. Burchell, C. Gordon, & P. Miller (Eds.), The Foucault effect: Studies in governmentality (pp. 281–298). Chicago: University of Chicago Press.

  • Cherryholmes, C. H. (1988). Power and criticism: Poststructural investigations in education. New York: Teachers College Press.

  • Cizek, G. J. (2007, August). Introduction to modern validity theory and practice. Invited presentation to the National Assessment Governing Board, McLean, VA. Available: https://www.nagb.gov/content/nagb/assets/documents/naep/cizek-introduction-validity.pdf

  • Cochran-Smith, M., & Lytle, S. L. (1999). The teacher research movement: A decade later. Educational Researcher, 28, 15–25. https://doi.org/10.3102/0013189X028007015.

  • Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. Orlando: Harcourt Brace.

  • Cronbach, L. J. (1969). Validation of educational measures. In P. H. H. DuBois (Ed.), Proceedings of the invitational conference on testing problems (pp. 35–52). Princeton: Educational Testing Service.

  • Cronbach, L. J. (1971). Test validation. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 443–507). Washington, DC: American Council on Education.

  • Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281–302. https://doi.org/10.1037/h0040957. PMID 13245896.

  • Eubanks, D. (2012, June 8). Bad reliability, part two. http://highered.blogspot.com/2012/06/bad-reliability-part-two.html

  • Fendler, L. (2006). Why generalisability is not generalisable. Journal of Philosophy of Education, 40(4), 437–449.

  • Fendler, L., & Muzaffar, I. (2008). The history of the bell curve: Sorting and the idea of normal. Educational Theory, 58(1), 63–82.

  • Fenstermacher, G. (1994). The knower and the known: The nature of knowledge in research on teaching. In L. Darling-Hammond (Ed.), Review of research in education (Vol. 20, pp. 3–56). Washington, DC: American Educational Research Association.

  • Fiske, D. W. (2002). Validity for what? In H. I. Braun, N. Jackson, & D. Wiley (Eds.), The role of constructs in psychological and educational measurement (pp. 169–178). Hillsdale: Lawrence Erlbaum.

  • Gigerenzer, G., & Marewski, J. N. (2015, February). Surrogate science: The idol of a universal method for scientific inference. Journal of Management, 41(2), 421–440. https://doi.org/10.1177/0149206314547522.

  • Gould, S. J. (1981). The mismeasure of man. New York: W.W. Norton.

  • Hacking, I. (1995). The looping effects of human kinds. In D. Sperber, D. Premack, & A. J. Premack (Eds.), Causal cognition: A multidisciplinary debate (pp. 351–394). Oxford: Clarendon Press.

  • Heilbron, J., Magnusson, L., & Wittrock, B. (Eds.). (1998). The rise of the social sciences and the formation of modernity: Conceptual change in context, 1750–1850. Boston: Kluwer Academic Publishers.

  • Jenkins, J. G. (1946). Validity for what? Journal of Consulting Psychology, 10, 93–98.

  • Kadir, K. A. (2008). Framing a validity argument for test use and impact: The Malaysian public service experience (esp. chapter 2 on history of validity p. 29). Dissertation.

  • Karson, M. (2007). Nomothetic versus idiographic. In N. J. Salkind & K. Rasmussen (Eds.), Encyclopedia of measurement and statistics. New York: Sage. https://doi.org/10.4135/9781412952644.

  • Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Newbury Park: Sage.

  • Lippmann, W. (1922, November 8). The reliability of intelligence tests. The New Republic (pp. 275–277).

  • MacKenzie, S. B. (2003). The dangers of poor construct conceptualization. Journal of the Academy of Marketing Science, 31(3), 323–326.

  • Matters, G., & Pitman, J. A. (1994). The validity–reliability trade-off. 20th annual conference of the International Association for Educational Assessment (IAEA). Wellington.

  • Messick, S. (1980). Test validity and the ethics of assessment. American Psychologist, 35(11), 1012–1027.

  • Messick, S. (1998). Test validity: A matter of consequence. Social Indicators Research, 45(1–3), 35–44.

  • Moss, P. A. (1992). Shifting conceptions of validity in educational measurement: Implications for performance assessment. Review of Educational Research, 62(3), 229.

  • NIH [National Institutes of Health]. (2007, October 15). National center for advancing translational sciences. Available: https://ncats.nih.gov/. Accessed 31 Oct 2015.

  • Nuzzo, R. (2014, February 13). Scientific method: Statistical errors. Nature, 506, 150–152. https://doi.org/10.1038/506150a. http://www.nature.com/news/scientific-method-statistical-errors-1.14700

  • Palomba, C. A., & Banta, T. W. (1999). Assessment essentials: Planning, implementing, improving. New York: Jossey-Bass.

  • Reliability vs. validity. (2005, September 26). Bloomberg business. Online version available: http://www.bloomberg.com/bw/stories/2005-09-28/reliability-vs-dot-validity

  • Schwartz, D. L., & Arena, D. (2013). Measuring what matters most: Choice-based assessments for the digital age. Cambridge, MA: MIT Press.

  • Shepard, L. A. (2013). Validity for what purpose? Teachers College Record, 115(9), 1–12. http://www.tcrecord.org. ID Number: 17116. Accessed 14 Oct 2015.

  • Shultz, M. M., & Zedeck, S. (2008). Identification, development, and validation of predictors for successful lawyering. Berkeley Law School Research Grant Report. https://www.law.berkeley.edu/files/LSACREPORTfinal-12.pdf

  • Siegfried, T. (2015, July 2). Science is heroic, with a tragic (statistical) flaw. Science News Online. https://www.sciencenews.org/blog/context/science-heroic-tragic-statistical-flaw

  • Steele, C. M., & Aronson, J. (1995). Stereotype threat and the intellectual test performance of African-Americans. Journal of Personality and Social Psychology, 69(5), 797–811.

  • Terman, L. M. (1916). The measurement of intelligence: An explanation of and a complete guide for the use of the Stanford revision and extension of the Binet-Simon intelligence scale. Boston: Houghton Mifflin.

  • Thorndike, E. L. (1898). Animal intelligence: An experimental study of the associative processes in animals. The Psychological Review: Monograph Supplements, 2(4), i–109. https://doi.org/10.1037/h0092987.

  • Westen, D., & Rosenthal, R. (2003). Quantifying construct validity: Two simple measures. Journal of Personality and Social Psychology, 84(3), 608–618. https://doi.org/10.1037/0022-3514.84.3.608. Accessed 23 Oct 2015.

Author information

Corresponding author

Correspondence to Lynn Fendler.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Fendler, L. (2018). Validity-Versus-Reliability Tradeoffs and the Ethics of Educational Research. In: Smeyers, P., Depaepe, M. (eds) Educational Research: Ethics, Social Justice, and Funding Dynamics. Educational Research, vol 10. Springer, Cham. https://doi.org/10.1007/978-3-319-73921-2_10

  • DOI: https://doi.org/10.1007/978-3-319-73921-2_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73920-5

  • Online ISBN: 978-3-319-73921-2

  • eBook Packages: Education, Education (R0)
