The revised Society for Prevention Research (SPR) standards of evidence are an exciting advance in the field of prevention science. We appreciate the committee's vision that the standards represent goals to aspire to rather than benchmarks of where prevention science currently stands. The discussion about the standards highlights how much has changed in the field over the last 10 years; as knowledge, theory, and methods continue to advance, the new standards push the field toward increasing rigor and relevance. The discussion also acknowledges the shifting landscape of current and ongoing prevention needs, which is influenced by changes in culture, technology, and other unpredictable forces (e.g., cyberbullying, refugee resettlement, natural disasters).

The authors of this commentary work for the Administration for Children and Families (ACF), a division of the Department of Health and Human Services, which aims to serve vulnerable children and families through a range of prevention programs (e.g., child maltreatment prevention, home visiting). Within ACF, the Office of Planning, Research, and Evaluation (OPRE) is the primary research and evaluation office, whose mission is to inform the improvement of ACF programs through research and evaluation. In this commentary, the authors consider implications of the new SPR standards for research, policy, and practice through the lens of their work in OPRE.

ACF leadership needs timely, relevant, and trustworthy information to inform decision-making. OPRE accomplishes its work by sponsoring and overseeing evaluations and by using and translating the broader empirical literature to inform policy and practice. SPR's standards have important implications for this work. The updated standards of evidence are critical to ensuring that the research of the prevention science community can advance ACF's mission by supporting high-quality evaluation design and execution, encouraging a continuum of evidence building, and highlighting the importance of implementation for outcomes and scale-up.

Translating High-quality Evaluations to Support Evidence-based Policy

OPRE aims to play a key role in translating research to inform program and policy decision-making. OPRE accomplishes this goal through a variety of activities. One example is our systematic reviews of evidence. The purpose of these reviews is to provide information to policy-makers, program administrators, and others in the practice community about the amount and quality of impact evidence currently available to support decision-making. The utility of our systematic reviews for decision-making is driven by the quality of the available impact research on prevention programs. Because decision-makers must make timely decisions about investments, their decisions will be made using the information available, even if that information is limited (Haskins and Baron 2011). The revised standards set forward by SPR are critical to supporting the generation of high-quality evidence on prevention programs (e.g., Efficacy standards 3.d, 4, 5, 6, and 7) for inclusion in systematic reviews. For example, in executing one OPRE-funded systematic review, large gaps were identified in the quality and reporting of factors related to internal validity, making it challenging to assess the quality of the evidence (Avellar and Paulsell 2011). Work on this systematic review highlighted factors in the SPR standards that are not consistently reported in the empirical literature, leaving the systematic review team to contact authors to obtain the information or to exclude research whose quality could not be determined. For example, Avellar and Paulsell (2011) noted that authors often do not pre-specify the outcomes of interest to make the number of tests conducted transparent, nor do they regularly adjust for multiple comparisons. In addition, this work revealed that many impact studies either do not report effect sizes or do not report how effect sizes were calculated (Avellar and Paulsell 2011). For evaluations to be useful for policy-making, greater transparency in design, execution, and analysis is paramount.
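To make the two reporting gaps above concrete, the following is a minimal sketch in Python, using hypothetical numbers (the p-values, means, standard deviations, and sample sizes are invented for illustration), of a simple Bonferroni adjustment for multiple comparisons and an explicitly documented effect-size calculation (Cohen's d with a pooled standard deviation). It is one of several defensible approaches, not a prescribed method.

```python
# Illustration only: hypothetical numbers, not data from any cited study.
import math

def bonferroni_adjust(p_values):
    """Bonferroni correction: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(p * m, 1.0) for p in p_values]

def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Cohen's d using the pooled standard deviation of treatment and control groups."""
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))
    return (mean_t - mean_c) / pooled_sd

# Five pre-specified outcomes: adjusting makes the number of tests transparent.
raw_p = [0.012, 0.048, 0.003, 0.21, 0.09]
print(bonferroni_adjust(raw_p))  # 0.012 becomes 0.06 once five tests are accounted for

# One outcome: reporting the formula and inputs lets reviewers reproduce the effect size.
d = cohens_d(mean_t=52.4, sd_t=9.8, n_t=120, mean_c=49.1, sd_c=10.2, n_c=118)
print(round(d, 2))
```

Reporting exactly this kind of information (the number of tests, the adjustment used, and the formula behind each effect size) is what allows systematic reviewers to assess study quality without contacting authors.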

In addition to having rigorous standards regarding internal validity, the revised SPR standards make a clear statement about the importance of external validity (Efficacy standard 2.c and Effectiveness standards 3.a, 3.b, and 3.c). ACF programs serve diverse populations in terms of culture, race, ethnicity, language, income, gender, region or state, and other factors. In our work disseminating empirical evidence to policy and practice, questions are continually raised about how far the results of the available impact trials can be generalized to new populations, settings, and points in time. As decision-makers choose among evidence-based programs, they want to know whether a given program will work with their population in their community, even when their community differs from the population in the impact trial. OPRE-sponsored work has identified significant gaps in the empirical literature on external validity for evidence-based programs (Avellar and Paulsell 2011). To our knowledge, no standards currently exist to assess external validity, making it difficult to support decision-makers in this respect. Unless the field reports information related to external validity, systematic reviews will be constrained in their ability to help decision-makers understand the external validity of prevention programs.

We appreciate the task force's acknowledgement of the non-linear progression of evidence building on prevention programs, and the emphasis that this portfolio of work should be collaborative and practice-oriented. In some cases where the evidence for particular populations is not established, ACF is supporting a continuum of evidence building within a community-based participatory framework. In one example of this work, through a provision of the Affordable Care Act of 2010, funding for home visiting was greatly expanded for states, territories, and tribes to establish evidence-based home visiting programs for at-risk pregnant women and children from birth to 5 years old (Supplee et al. 2013). This program, the Maternal, Infant, and Early Childhood Home Visiting (MIECHV) program, is jointly administered by the Health Resources and Services Administration (HRSA) and ACF. Three percent of these funds were set aside for tribes, tribal organizations, and urban Indian organizations. Initially, the systematic review of the effectiveness of home visiting did not find any home-visiting model with evidence of effectiveness for tribal populations (Del Grosso et al. 2012). Therefore, ACF required each tribal grantee to conduct a small but rigorous evaluation of its home-visiting program, its implementation, or both. To facilitate success with these activities, OPRE funded the Tribal Home Visiting Evaluation Institute (TEI) to provide technical assistance to grantees and their evaluators in designing and carrying out locally driven, rigorous evaluations of home visiting. The TEI aims to provide individualized, culturally relevant technical assistance that empowers grantees to conduct research and evaluation that is meaningful for the tribe and meets the program's requirement that grantee evaluations demonstrate rigor (i.e., credibility, applicability, consistency, and neutrality). The TEI builds on ACF's past and continuing efforts to build evaluation and research capacity within American Indian/Alaska Native (AI/AN) communities so that those communities can support their own well-being (Tribal Evaluation Workgroup 2013).

Readiness to Scale: Implementation and Scale-Up

The potential of evidence-based policy to improve outcomes for children and families is achievable only if evidence-based programs are implemented with quality (Supplee and Metz 2015). Currently, the available literature does not include enough information about implementation to understand what it really takes to execute evidence-based programs at scale (e.g., Paulsell et al. 2014). Over the past 5 years, work on scaling up many evidence-based programs has revealed that developers of evidence-based programs face serious challenges meeting the needs of communities and users (Supplee 2014). We applaud the SPR standards for emphasizing the importance of documenting implementation, empirically studying implementation, and creating the knowledge and materials needed for implementation and replication (e.g., Efficacy standards 2.a, 2.d, 3.b and D; Effectiveness standards 2, 5, and 6.d; and Dissemination standards 2, 3, 4, and 5). For example, in scaling up teen pregnancy prevention programs, the lack of articulation of the models' core components made it challenging to determine which adaptations to local context were positive because they improved fit and which were negative because they altered a core component related to efficacy (Margolis and Roper 2014). The availability of high-quality materials, training, and ongoing technical assistance has been central not only to informing the selection of one program over another but also to informing the quality of implementation (Dworkin, Pinto, Hunter, Rapkin, and Remien 2008). We support the inclusion of these important factors in the SPR standards.

Policy-makers need to know how much a program may cost to implement, including start-up and ongoing costs, in addition to its potential to yield a return on investment. To produce accurate estimates to guide communities' investments, the field needs information on the cost of implementation in the real world, outside of efficacy trials. The SPR standards' emphasis on generating cost information at each stage of evidence building is critical to the success of evidence-based policy.
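As a rough, purely hypothetical illustration of the kind of cost information this implies, the sketch below combines invented start-up and ongoing cost figures into a per-family cost and a simple benefit-cost ratio; none of the numbers or variable names come from ACF data or the SPR standards.

```python
# Hypothetical cost sketch: every figure below is invented for illustration only.

start_up_cost = 150_000           # one-time training, materials, and hiring
annual_operating_cost = 400_000   # staff, supervision, ongoing technical assistance
years = 3
families_served_per_year = 200

total_cost = start_up_cost + annual_operating_cost * years
cost_per_family = total_cost / (families_served_per_year * years)

estimated_benefit_per_family = 3_500   # hypothetical monetized improvement in outcomes
benefit_cost_ratio = estimated_benefit_per_family / cost_per_family

print(f"Cost per family served: ${cost_per_family:,.0f}")   # $2,250 in this example
print(f"Benefit-cost ratio: {benefit_cost_ratio:.2f}")       # 1.56 in this example
```

Even a simple accounting of this kind, if reported consistently across trials and at scale, would give communities a basis for comparing investments.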

Articulation of Theory and Theory Testing

The revised standards acknowledge and specify the key role of articulating and testing theory in building knowledge and understanding about effective approaches for prevention (Efficacy standards 2.b and 4). The process of testing theory to build our knowledge of effective prevention is an important contribution to meeting OPRE's mission. One example of this type of work supported by OPRE is the Buffering Toxic Stress Consortium, a set of six University-Head Start Partnership projects evaluating promising parenting interventions in Early Head Start settings (Consortium Principal et al. 2013). While all six projects test the compelling conceptual theory that changes in parents' warmth and responsiveness mediate the impact of adversity on child well-being (standard 2.d), each tests a separate intervention with its own strategies and "action theory" for changing those parent mediators. Beyond building knowledge about a limited set of interventions that may have promise for changing developmental trajectories, this approach to theory testing can facilitate the development of well-informed interventions that target parent warmth and responsiveness.

Considerations for the Future

To conclude this commentary, we would like to raise some additional factors we view as gaps in the current standards. We hope the field continues to address these issues and future iterations of the SPR standards will consider their inclusion. In particular, we highlight two areas where we believe the standards could be strengthened: the accumulation of multiple types of evidence, particularly at the dissemination stage, and the importance of transparency in research and evaluation.

Evidence Building at Scale

It is unfortunate that the new standards constrain the types of evidence to be collected at scale. We wholeheartedly agree that simply handing off an evidence-based program and assuming it continues to produce the desired outcomes is not a tenable position. We also acknowledge that in the kinds of programs supported by ACF, it is rarely feasible to continually embed impact tests. We agree there is value in studying the program after a few years of implementation to build the knowledge base about impacts of that program at scale. However, there is equal value, from our perspective, in building a culture of continuous improvement with shared responsibility between the community-based organization, the funder, and the developer of the evidence-based program (Chambers, Glasgow, and Stange 2013; Metz and Albers 2014). In order for prevention to meet its promise for population impact, the community-based organizations that adopt the preventive interventions need tools to help them monitor the quality of implementation and outcomes obtained (Spoth et al. 2013). The funders of those programs need data to show whether the investments made in a particular program are paying off. The developer of the evidence-based program benefits from regular data from implementers to support ongoing program improvement.

To ensure a full portfolio of evidence, we support the integrated use of appropriate performance measurement, the use of continuous quality improvement techniques, and feedback loops between stakeholder groups, along with embedding rigorous impact trials at scale when appropriate. We see value in a close partnership between the research and practice community to build this evidence and develop a shared ownership of data and performance monitoring for sustainability. A close partnership includes the research community designing measures and monitoring systems for communities to assess outcomes. To enable success, a close partnership may require researchers to provide technical assistance to communities on effective means to use data to support high-quality implementation. Finally, a close partnership between research and practice may provide opportunities to embed tests of both implementation and outcomes at scale.

Transparency

The second missed opportunity we see in the current standards relates to a growing movement demanding greater transparency and accountability around research and evaluation (Humphreys et al. 2013; Miguel et al. 2014; Nosek et al. 2015). As the focus on the use of evidence in decision-making has increased, so has the importance of ensuring that the range of stakeholders using evaluation feel confident they can trust the design, analysis, and interpretation of the research. We appreciate that the standards make a recommendation related to independent evaluation of evidence-based programs as part of scale-up. However, transparency in research includes more than just the independence of the evaluator. We believe the standards would be enhanced by including a recommendation to engage in practices such as pre-registering clinical trials, pre-specifying analytic plans (or at least pre-specifying exploratory versus confirmatory empirical tests), and sharing data to allow for replication of findings by independent analysts. Multiple registries are emerging to support investigators in this work (e.g., clinicaltrials.gov, the American Economic Association registry for clinical trials, and the Registry of Clinical Trials on the What Works Clearinghouse). Many of these registries also include an option for pre-specifying analytic plans. We recognize there are pros and cons to pre-specifying analytic plans, and such plans may not be appropriate in every trial. However, these issues should be catalysts for thinking critically about methods to address transparency head on. For example, problems such as a lack of adjustment for multiple comparisons and the "file drawer problem" (i.e., the non-publication of non-significant findings) compromise our ability to build and maintain trust in science. Regarding replication of findings, there has been rapid growth in the submission of data to archives to allow for replication of analyses. Practices that enhance research transparency are important to ensure the work of SPR remains trusted by the policy and practice communities.

Conclusion

In conclusion, we are very excited to see the revised standards of evidence. We believe these standards will advance the important work of the prevention science community, and they complement key principles that guide OPRE's planning, conduct, and use of evaluation to support ACF programs. Ideally, the evidence OPRE creates will be rigorous, relevant, independent, transparent, and ethical, resulting in policies that support children and families (Administration for Children and Families Evaluation Policy 2014).