Cause or Effect? Validating the Use of Tests for High-Stakes Inferences in Education

  • Conference paper
  • In: Looking Back

Part of the book series: Lecture Notes in Statistics (LNSP, volume 202)

Abstract

A good aphorism can, in a few words, capture an essential truth. Of the many good aphorisms Paul Holland has coined over the years, I have found myself invoking the one above frequently enough to worry that I should be paying out royalty fees, so it is only fitting that I use it as the starting point for some ideas I wish to explore in this paper.

Notes

  1. It would also be possible to compare a single tutoring program to a control condition of no tutoring, but this comparison would introduce a clear source of bias in the sense that students enrolled in tutoring are likely to be more motivated than those who are not.

Author information

Correspondence to Derek C. Briggs.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this paper

Cite this paper

Briggs, D.C. (2011). Cause or Effect? Validating the Use of Tests for High-Stakes Inferences in Education. In: Dorans, N., Sinharay, S. (eds) Looking Back. Lecture Notes in Statistics, vol 202. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9389-2_8
