Quality & Quantity, Volume 47, Issue 4, pp 2225–2257

Merging the accountability and scientific research requirements of the No Child Left Behind Act: using cohort control groups

Jean Stockard


This article shows how assessment data such as those mandated by the No Child Left Behind Act can be used to examine the effectiveness of educational interventions and to meet the Act’s mandate for “scientifically based research.” Drawing on the classic research design literature, two internally valid analyses are suggested: a cohort control group design and a cohort control group design with historical comparisons. The logic of the “grounded theory of generalized causal inference” is used to develop externally valid results. The procedure is illustrated with published data on the Reading Mastery curriculum. The empirical results are comparable to those obtained in meta-analyses of the curriculum, with effect sizes surpassing the usual criterion for educational importance. Implications for school officials and policy makers are discussed.
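The abstract reports effect sizes from comparisons between a treated cohort and an earlier (control) cohort. The article's own computations are not reproduced here; the following is a minimal sketch, assuming the effect size is a standardized mean difference (Cohen's d with a pooled standard deviation), which is a common choice in educational evaluation. The data values and the function name are illustrative, not from the article.

```python
import math

def cohens_d(treatment, control):
    """Standardized mean difference between two cohorts' scores,
    using the pooled standard deviation (an assumed, conventional
    effect-size metric; the article's exact formula may differ)."""
    n1, n2 = len(treatment), len(control)
    m1 = sum(treatment) / n1
    m2 = sum(control) / n2
    # Sample variances (n - 1 denominator).
    v1 = sum((x - m1) ** 2 for x in treatment) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in control) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Hypothetical reading scores: a post-intervention cohort compared with
# a prior cohort at the same school (illustrative numbers only).
post_cohort = [52, 58, 61, 55, 60, 57]
prior_cohort = [48, 50, 53, 47, 51, 49]
d = cohens_d(post_cohort, prior_cohort)
```

A positive d favors the intervention cohort; values of roughly 0.25 or above are often treated as educationally important, which is the kind of criterion the abstract refers to.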


Research designs · Cohort control groups · Assessment data · Evaluation research





Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  1. Department of Planning, Public Policy and Management, University of Oregon, Eugene, USA
  2. National Institute for Direct Instruction, Eugene, USA
