Introduction

Principles of effective correctional treatment

A growing body of research over the past two decades has established that treatment programs in correctional settings tend to be ineffective when key features are lacking and tend to be effective when those features are present. Those key features have become known as the “principles of effective correctional treatment.” These principles were first presented in 1990 in a widely cited meta-analysis by Don Andrews and colleagues (1990b), in which they summarized the findings of 154 studies of correctional programs for juvenile and adult offenders. The average effect size for programs rated as providing “appropriate” service was relatively large (phi coefficient = .30); by contrast, for programs rated as providing “inappropriate” service, the treatment groups had poorer outcomes than did the comparison groups (phi = −.06); similarly for programs using criminal sanctions only, outcomes favored the comparison groups (phi = −.07). “Appropriate” service was defined as programs that were designed and delivered in accord with the principles of risk, need, and responsivity (described below), which Andrews and colleagues had developed in earlier work (see Andrews et al. 1990a).

Andrews and his colleagues subsequently published other articles and meta-analyses supporting these principles of effective correctional treatment (e.g., Andrews and Bonta 1998; Andrews and Dowden 2006; Dowden and Andrews 2000; Gendreau et al. 1996), and other researchers have also found that the principles hold up when examined using various approaches (e.g., Jolley and Kerbs 2010; Lowenkamp et al. 2006; Ogloff and Davis 2004; Thanner and Taxman 2003). Although the principles have not escaped criticism (Byrne 2012; Ward et al. 2007), they have become generally accepted to the point that they are included in criminology textbooks (e.g., Andrews and Bonta 2010; Bartol and Bartol 2011; McGuire 2004) and have been adopted in many correctional systems (e.g., Canada, see Correctional Service Canada 2012; California, see California Department of Corrections and Rehabilitation 2007; Colorado, see Colorado Division of Probation Services 2010). Furthermore, these principles may be applied at nearly every stage of correctional processing, from determining release and supervision decisions to treatment eligibility and service provisions (Andrews 2006; Hannah-Moffat 2005; MacKenzie 2006; Ogloff and Davis 2004). (For brevity, we will subsequently refer to “Andrews” or to the “Andrews Principles,” with the understanding that Andrews’ colleagues—Zinger, Hoge, Bonta, Gendreau, Cullen, Dowden, and others—made substantial contributions to the development and validation of the risk, need, and responsivity principles.)

Risk principle

Andrews and colleagues described the risk principle in their 1990 article as follows:

The risk principle suggests that higher levels of service are best reserved for higher risk cases and that low-risk cases are best assigned to minimal service.… In brief, … the effects of treatment typically are found to be greater among higher risk cases than among lower risk cases. This is expected unless the need and/or responsivity principles are violated (Andrews et al. 1990b: 374).

The risk principle consists of two elements: (1) offenders who are assessed as being at higher risk for recidivism will show a greater reduction in recidivism following treatment than will lower-risk clients; and (2) higher-risk clients should receive high-intensity services and lower-risk clients should receive low-intensity services, with “intensity” referring either to frequency and duration of service contact or to the variety of services provided, or both. Lower-risk offenders will benefit from low-intensity services or even from no services, but they may do worse if they participate in high-intensity services along with high-risk clients (Cécile and Born 2009; Dodge et al. 2006; Lowenkamp and Latessa 2004).

Need principle

Andrews described the need principle as follows:

The most promising intermediate targets [for correctional rehabilitation programs] include changing antisocial attitudes, feelings, and peer associations; promoting … [family bonds]…; promoting identification with anticriminal role models; increasing self-control and self management skills; replacing the skills of lying, stealing, and aggression with other, more prosocial skills; reducing chemical dependencies; and generally shifting the density of rewards and costs for criminal and noncriminal activities in familial, academic, vocational, and other behavioral settings… Less-promising targets include increasing self-esteem without touching antisocial propensity…, increasing the cohesiveness of antisocial peer groups…, improving neighborhood-wide living conditions without reaching high-risk families…, and attempts to focus on vague personal/emotional problems that have not been linked with recidivism…. (Andrews et al. 1990b: 375).

The needs principle rests on the assumption that the primary goal of correctional treatment programs is to reduce subsequent criminal behavior, thereby enhancing public safety. Accordingly, the focus of the needs principle is that correctional treatment should be directed toward dynamic offender behaviors and attitudes (called by Andrews “criminogenic needs”) that are strongly associated with recidivism and that are capable of change through intervention. In subsequent articles, Andrews and colleagues went on to describe the Big Four (history of antisocial behavior, antisocial personality pattern, antisocial cognition, and antisocial associates) and the Central Eight (the Big Four, plus family and/or marital, school and/or work, leisure and/or recreation, and substance abuse; Andrews et al. 2006) to be assessed for all offenders within correctional settings. Often within specific programs, needs are determined using validated instruments (e.g., the LSI-R or COMPAS). Not all dynamic offender needs are associated with recidivism, however (e.g., personal and/or emotional distress, major mental disorder, physical health issues, fear of official punishment; Andrews et al. 2006). These “noncriminogenic needs” are individual characteristics that research has found to have no or only a weak association with recidivism. As a result, they should not be the focus of correctional treatment, or at most should receive minimal attention.

Responsivity principle

Andrews described the responsivity principle as follows:

The responsivity principle has to do with the selection of styles and modes of service that are (a) capable of influencing the specific types of intermediate targets that are set with offenders and (b) appropriately matched to the learning styles of offenders…. Specifically, they include modeling, graduated practice, rehearsal, role playing, reinforcement, resource provision, and detailed verbal guidance and explanations (making suggestions, giving reasons, cognitive restructuring) (Andrews et al. 1990b: 375).

The responsivity principle is concerned with the types of treatment that are most appropriate for offenders. In its full application, the responsivity principle would require comprehensive assessment of offenders to determine appropriate treatment approaches that matched each offender’s learning style. In practice, however, Andrews posits that the treatment approaches most appropriate to the learning styles of the general offender population include those that would fall under cognitive-behavioral and social learning theory and practice (see Andrews et al. 1990a for theoretical and empirical support). In addition, according to the responsivity principle, there are certain types of treatment that do not fit the learning styles of most offenders; these include deterrence and labeling approaches, intermediate punishments, unstructured peer-based group and residential programs, counseling approaches that are nondirective and client centered, and unstructured psychodynamic therapy.

In later work, Andrews divided the responsivity principle into two parts: “general responsivity,” referring to the use of a cognitive social learning approach to change behavior for all types of offenders, and “specific responsivity,” referring to the tailoring of cognitive learning interventions to take into consideration particular client characteristics, such as motivation, gender, and ethnicity (Andrews et al. 2006). This meta-analysis examined only general responsivity.

Relevance of the Andrews principles to drug abuse treatment programs

The Andrews principles have typically been examined in meta-analyses of correctional treatment programs, which have primarily been focused on crime outcomes or criminal justice involvement. But if the principles hold up in correctional treatment programs, they may also hold up in drug treatment programs, in which all clients have drug problems and many are also involved in the criminal justice system. Also, the outcome of interest with respect to the Andrews principles has been recidivism. But would the principles still hold true when the outcome is relapse to drug use? That is, do drug treatment programs that exhibit one or more of the risk, need, and responsivity principles have better drug use outcomes than programs that do not? A number of researchers have focused on the Andrews principles within the context of offender treatment programs that address drug abuse problems (Belenko 2006; Guastaferro 2012; Taxman and Thanner 2006; Thanner and Taxman 2003), but research has not examined whether the principles hold up in programs that serve more general samples of drug abuse clients.

The purpose of the present meta-analysis was to answer the question: Do the Andrews principles of risk, needs, and responsivity apply to programs that treat drug abusers, many of whom are, or have been, involved in the criminal justice system? Although there have been critiques of the Andrews principles (Byrne 2012; Ward et al. 2007), we tested the principles as described in Andrews’ original formulation in order to determine whether they applied to drug abuse treatment programs. The study involved the efforts of two teams experienced in meta-analysis, National Development and Research Institutes (NDRI) and Integrated Substance Abuse Programs (ISAP) at UCLA. The teams collaborated on all aspects of the project, including determining selection criteria, developing literature search strategies, screening for relevant research documents, creating a codebook, coding the research reports, conducting an inter-coder reliability study, and designing analysis plans. The Institutional Review Boards of the two organizations approved exemption of the meta-analysis from human subjects review.

Methods

Criteria for inclusion and exclusion of studies

The criteria for studies to be included in the meta-analysis were that the study (1) was written in English; (2) appeared between January 1965 and April 2007; (3) was published or unpublished; (4) was on a treatment program located in the United States or Canada; (5) evaluated a drug abuse treatment or intervention program in which drug use was one of the outcomes, even if not the primary outcome; (6) used a treatment-comparison group design (either randomized or non-randomized); and (7) reported sufficient statistical data on outcomes to permit calculation of an effect size. Because our concern was with programs that treated clients with drug use problems, we excluded studies in which less than 80 % of the clients were users of illicit drugs at the start of the study. Also excluded were clinical trials of medications for addiction that had not received approval from the Food and Drug Administration to treat drug dependence (e.g., buspirone for cocaine dependence) at the time of the literature search. Comparison groups included those that involved no treatment or treatment as usual, but studies in which two treatments were of roughly equal strength (i.e., comparative effectiveness studies) were not eligible. Also excluded were studies in which clients were assigned to different “dosages” of the same treatment.

Literature search

We conducted online bibliographic searches using the ISI Web of Knowledge, an integrated platform for citation searching and retrieval that includes four bibliographic databases: PsycINFO, Current Contents, Web of Science (which contains the Social Sciences Citation Index), and Medline. We also searched Criminal Justice Abstracts and Dissertation Abstracts. In each of these databases, searches were conducted for the period from 1995 to 2007. (In previous meta-analysis projects, each of the research organizations had identified relevant literature for the period before 1995.) Terms used in the search strategy appear in Appendix 1. The search results included complete reference information and abstracts, which were downloaded to bibliographic software for further processing (e.g., elimination of duplicates). In addition to database searches, we also reviewed the reference lists of eligible articles. In total, more than 10,000 citations were retrieved for the given time period; these were divided between the NDRI team and the ISAP team to evaluate whether each citation met study eligibility criteria. Full documents were retrieved when the abstract alone was not sufficient to determine eligibility. The citations that met eligibility criteria were combined with studies from the previous meta-analyses conducted by NDRI and ISAP for coding. Figure 1 depicts the flow of decisions in the search and selection process.

Fig. 1 Flow of screening studies for the EPT project. *The ratio of documents to studies was about 1.1 to 1, since some studies had more than one document reporting the study’s outcomes. **These 232 studies produced 243 independent comparisons of E and C groups that were coded.

Coding

The detailed codebook was adapted from the codebooks developed by NDRI and ISAP for their previous meta-analyses, with the addition of questions relevant to the Andrews principles. Questions covered study context, methodology, participant characteristics, treatment characteristics, dependent variable characteristics, and effect size statistics. The studies were coded by M.A.- or Ph.D.-level coders who received training at the beginning of the study and subsequently met regularly with senior research staff to discuss coding problems and to establish policies on particular coding issues. Within each team, each study was coded by two coders, who resolved differences through discussion; unresolved issues were decided by senior staff. When findings from a study were reported in more than one document, we drew information from all the documents associated with that study. For some studies, missing information on key variables was obtained from the study authors.

In order to determine the consistency of coding decisions, NDRI and ISAP conducted a formal inter-coder reliability study by independently double-coding 16 studies. The reliability checks of effect sizes yielded a Cronbach’s α of .90, and the various nominal level codes we checked yielded agreement rates ranging from .88 to 1.00.
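As an illustration of the reliability check, Cronbach’s α for two coders’ effect size codes can be computed by treating the studies as items and the coders as raters. The sketch below is ours, not the authors’ actual procedure, and the toy data are invented:

```python
from statistics import pvariance

def cronbach_alpha(ratings):
    """Cronbach's alpha across coders.
    ratings: one list per coder, each listing that coder's
    effect size codes for the same set of studies."""
    k = len(ratings)
    # Total score for each study, summed across coders
    study_totals = [sum(col) for col in zip(*ratings)]
    total_var = pvariance(study_totals)
    sum_coder_vars = sum(pvariance(r) for r in ratings)
    return k / (k - 1) * (1 - sum_coder_vars / total_var)

# Two coders in perfect agreement yield alpha = 1.0
print(cronbach_alpha([[0.2, 0.5, 0.8, 1.1], [0.2, 0.5, 0.8, 1.1]]))  # → 1.0
```

Agreement rates for the nominal codes, by contrast, are simple proportions of matching codes and need no special formula.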

Statistical approach

Before discussing specific analytic procedures, it is important to clarify that we do not present each meta-analysis of the Andrews principles as a null hypothesis significance test (NHST). Instead, we adopt an “effect size with confidence interval” approach, which makes clear that the precision of the confidence interval depends on the number of cases in the analysis, not just on the effect size. Since there are likely to be sources of variability other than the random variability captured in the confidence interval, we do not claim that the confidence interval is a sufficiently accurate measure of the true variation that may exist in future replications. The confidence interval should be interpreted as a rough indicator of the minimal variability that may exist in the effect size.

In turning to the specifics of the analysis, most of the studies examined reported only independent comparisons in which outcomes were compared between one experimental group and one comparison group. However, in some studies, one experimental group was compared with more than one comparison group, or one comparison group was compared with more than one experimental group. In these cases, the comparisons are not statistically independent. When multiple comparisons were present in a given study, we used our best judgment to pick the fairest, most pertinent comparison group relative to each particular experimental group (e.g., a comparison that the treatment group would naturally be expected to surpass, such as treatment as usual). We use the phrase “independent comparison” to denote that the sample of research participants involved in that comparison does not overlap with other samples included in the meta-analysis. Only independent comparisons were used in the meta-analyses reported here, but for brevity and convenience we refer to them as “studies.”

During coding, the following types of information were extracted from eligible studies: (1) illicit drug use based on self-report or non-self-report measures (e.g., urinalysis results), (2) criminal behavior or criminal justice involvement based on self-report or records, (3) information on up to three waves of measures of drug use or crime collected during and/or following treatment, (4) quantitative data on outcomes for drug use or crime, and (5) information relevant for coding risk, needs, and responsivity. In a given study, it was usually possible to code multiple effect sizes for a given outcome, but, to ensure independence among effect sizes used in analysis, the effect sizes were averaged to create a study-level effect size for drug use outcomes and (if applicable) a study-level effect size for crime outcomes within a given study.

Our index of effect size was the standardized mean difference, familiar as Cohen’s d (calculated as the mean of the E group minus the mean of the C group, divided by the pooled standard deviation). However, following accepted practice, we use a refined version, Hedges’ g, which corrects for a bias in d that exists when sample sizes are small (Hedges and Olkin 1985). Because studies with larger samples provide more precise and stable estimates of the population effect size than do studies with smaller samples, each effect size was weighted by the inverse of its variance. Some studies report outcomes for the E and C groups in terms of means, standard deviations, and Ns (the standard ingredients of Cohen’s d and Hedges’ g); others report different statistics that can be converted to standardized mean difference effect sizes. For example, percentages successful and Ns can be converted to numbers of successes and failures, which can be converted to an odds ratio, then to a log odds ratio, then to d, using the formula \( d = \mathrm{LogOddsRatio} \times \sqrt{3}/\pi \) (Borenstein et al. 2009: 46–47). Calculations of d from other available statistics (e.g., t test values) used formulas provided in Lipsey and Wilson (2001) or Borenstein et al. (2009).
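The two computations just described — the standardized mean difference with a small-sample correction, and the conversion of success proportions to d via the log odds ratio — can be sketched as follows. This is a simplified illustration, not the authors’ code; the correction factor shown is the common approximation to Hedges’ J:

```python
import math

def hedges_g(mean_e, mean_c, sd_e, sd_c, n_e, n_c):
    """Cohen's d with the Hedges small-sample bias correction."""
    sd_pooled = math.sqrt(((n_e - 1) * sd_e ** 2 + (n_c - 1) * sd_c ** 2)
                          / (n_e + n_c - 2))
    d = (mean_e - mean_c) / sd_pooled
    j = 1 - 3 / (4 * (n_e + n_c) - 9)  # approximate correction factor
    return j * d

def d_from_proportions(p_e, p_c):
    """Convert success proportions in the E and C groups to d
    via the log odds ratio (Borenstein et al. 2009)."""
    log_odds_ratio = math.log((p_e / (1 - p_e)) / (p_c / (1 - p_c)))
    return log_odds_ratio * math.sqrt(3) / math.pi
```

For instance, a study reporting 60 % success in the E group and 40 % in the C group corresponds to d of roughly .45 under this conversion.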

In some studies, the effect size for an outcome is so atypical that there is a risk of distorting the overall results. A practical criterion for detecting such outliers is a value that is more than 1.5 times the interquartile range above the third quartile or below the first quartile (Crawley 2005; Walfish 2006). Calculating the quartiles for the effect sizes and applying this criterion led us to Winsorize: (1) for the illicit drug use outcomes, 8 of the 243 effect sizes that were greater than g = +1.20 were set to the value of +1.20, and 2 of the 243 that were less than g = −.48 were set to the value of −.48; and (2) for the crime outcomes, 1 of the 51 effect sizes, which was greater than g = 1.00, was set to the value of 1.00 (there were no outlier values at the negative end of the crime effect size distribution). We did not impute missing values.
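The outlier criterion amounts to capping values at the Tukey fences Q1 − 1.5·IQR and Q3 + 1.5·IQR. A minimal sketch follows; the choice of quartile method and the decision to cap at the fence itself are our assumptions, since the paper reports only the resulting cutoffs:

```python
from statistics import quantiles

def winsorize_iqr(effect_sizes, k=1.5):
    """Cap effect sizes lying more than k * IQR beyond the
    quartiles at the fence values (Tukey's criterion)."""
    q1, _, q3 = quantiles(effect_sizes, n=4)  # default 'exclusive' method
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [min(max(g, lo), hi) for g in effect_sizes]
```

Applied to a distribution of g values, any extreme effect size is pulled in to the nearest fence while all interior values are left unchanged.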

To test the relationship between each of the Andrews principles and the effect sizes for crime and drug use outcomes, we used meta-regression, which provides a linear regression that predicts effect sizes from a predictor variable. The meta-regressions use a random effects model, which allows for the likelihood that there may be more than one true effect size underlying different types of studies. The unstandardized meta-regression coefficient indicates the change in average effect size associated with a one-unit change in the predictor, e.g., from 0 to 1. The standardized meta-regression coefficient (usually termed “beta”) indicates the change in average effect size (in standard deviation units) associated with a one standard deviation increase in the predictor. In a bivariate linear regression model, because the value of the standardized regression coefficient is the same as the correlation coefficient (r), it provides a correlation as a measure of the strength of association. By convention (Cohen 1988), a correlation of .10 is considered small, .30 medium, and .50 large. In more practical terms, r is approximately equal to the percentage difference in outcome between the treatment group and the comparison group.
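In a fixed-effect simplification (ignoring the between-study variance component that the random-effects model adds to each study’s weight), the bivariate meta-regression slope reduces to an inverse-variance weighted least squares estimate. A sketch, with function and variable names of our own choosing:

```python
def weighted_meta_slope(effect_sizes, variances, predictor):
    """Inverse-variance weighted slope of effect size on a single
    predictor (fixed-effect simplification of meta-regression)."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    xbar = sum(wi * x for wi, x in zip(w, predictor)) / sw
    ybar = sum(wi * y for wi, y in zip(w, effect_sizes)) / sw
    num = sum(wi * (x - xbar) * (y - ybar)
              for wi, x, y in zip(w, predictor, effect_sizes))
    den = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, predictor))
    return num / den
```

With a 0/1 predictor such as lower versus higher risk, the slope is the weighted difference in average effect size between the two groups of studies, which is how the unstandardized coefficients reported below should be read.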

An important consideration in assessing meta-analytic findings is whether the strength of the relationships found varies according to particular moderators across studies. In addition to the main analyses, we conducted a set of auxiliary analyses using two moderators: methods quality and publication status (journal/book vs. unpublished report). For the measure of methods quality, we selected two methods variables that are typically included in meta-analyses because they may be (inversely) correlated with effect size: (1) random assignment (of the 243 independent comparisons, 74.5 % used random assignment), and (2) use of intent-to-treat analysis (used in 82.7 % of the studies). We combined these two variables into a single variable that had a value of 1 if a given study used both random assignment and intent-to-treat analysis; otherwise, the value was 0. For publication status, published studies were coded as 1 and unpublished as 0.

In meta-regression (as in regression analysis more generally), outliers and influential data points should be investigated to assess their effect on the adequacy of the model (Belsley et al. 1980; Viechtbauer and Cheung 2010). Among the statistical indicators of excessively influential studies (called case diagnostics) are externally standardized residuals (labeled “rstudent”), DFFITS values, Cook’s distance values, and covariance ratios. Meta-regression is not a simple statistical approach, and assessing overly influential studies should not be oversimplified by relying on any one statistical indicator of influential data points or on a particular cutoff level of an indicator. Belsley et al. (2004: 27, 29–30) suggest a cautious approach to the use of diagnostics of influential data points and recommend use of “judgment and intuition in choosing reasonable cutoffs most suitable for the problem at hand.” We examined plots of each of these indicators to see whether any studies stood out as outliers with large and statistically significant values on one or more indicators. If extreme values were found, we re-estimated the meta-regression with that study (or studies) excluded. In presenting the findings, we then discuss both the initial meta-regression and the revised meta-regression that excluded the overly influential study or studies.

The analyses were conducted using Comprehensive Meta-analysis™ software (Borenstein et al. 2005), David Wilson’s meta-analysis SPSS macros (Wilson 2006), and R statistical software 2.10.0 (R Development Core Team 2009), specifically, the package “metafor” version 0.5-5 (Viechtbauer 2009).

Results

Study characteristics

We coded 232 primary research studies; some of these had multiple experimental (E) and/or comparison (C) groups, which allowed us to code 243 independent comparisons (“studies”). About 78 % of the coded studies were funded by the National Institute on Drug Abuse (NIDA); other sources of funding included the National Institute of Justice, the Center for Substance Abuse Treatment, the Department of Veterans Affairs, the National Institute on Alcohol Abuse and Alcoholism, and the National Institute of Mental Health. The vast majority of the studies (95 %) were dated between 1980 and 2006. Unpublished studies constituted 13 % of the total. The studies took place in a variety of locations: community-based programs, 50 %; university research centers, 24 %; correctional settings, 11 %; Veterans Administration medical centers, 7 %; and other locations, 8 %. Nearly three-quarters (74 %) of the studies assigned subjects randomly. In just over three-fifths (61 %) of the studies, the comparison group received standard treatment; in 5 %, delayed treatment; in 10 %, no treatment or minimal contact; and in 24 %, some other type of treatment.

In terms of gender, 13 % of the study samples were exclusively male, 10 % were exclusively female, and the remainder included both males and females. The percentages of Black, White, and Hispanic participants in study samples varied widely; averages over all of the studies were 42 % Black, 42 % White, and 16 % Hispanic. The mean age of study samples ranged from 15 to 45, with an overall mean of 33 years of age. When the study focused on a specific illicit drug, the drug most commonly mentioned was heroin, cocaine, or crack. In 17 % of the studies, all or nearly all of the participants in drug treatment were involved in the criminal justice system. Within the full set of coded studies, the point estimate for crime outcomes was g = .06 (k = 51; 95 % CI: −.02, .13) and for drug use outcomes, g = .21 (k = 243; 95 % CI: .17, .24).

Risk principle

The hypothesis tested with respect to the risk principle was: Studies of programs in which the subjects were at comparatively higher risk for negative post-treatment outcomes (either criminal recidivism or illicit drug use) will have a larger average effect size than will studies of programs in which the subjects were at comparatively lower risk for negative post-treatment outcomes.

The coders used information reported in each study to rate the risk level for crime or drug use among the E group and the C group participants. The assessment of risk included scores from validated risk instruments used in a study, but, since many studies did not report such scores, we also used baseline behavioral measures that are associated with subsequent crime or drug use to code for risk level (see Appendix 2 for the guidelines for coding risk level). The original coding had three risk levels (low, medium, and high), but because so few studies were coded as low risk for either crime (k = 9) or drug use (k = 0), the low and medium risk levels were collapsed for analysis purposes, making risk a dichotomous variable of lower versus higher risk. Given that clients typically seen in drug abuse treatment programs have long histories of drug use and often of crime, it is not surprising that few or no studies had samples rated as low risk. In determining the risk level of a given study sample, we used the risk level of the E group (which, in nearly all cases, matched the risk level of the C group).

Meta-regression analysis of the risk principle: crime

Of the 51 studies with a crime outcome measure, 7 studies were classified as “lower risk” for committing crime following treatment, 41 as “higher risk,” and 3 did not report adequate information to code risk level. We first report a meta-regression of the effect size (g) for the dependent variable (here, crime outcomes) on the predictor variable of interest (here, crime risk). The bivariate unstandardized meta-regression coefficient for a unit increase in crime risk was .22 (k = 48; 95 % CI: .01, .44). That is, the overall linear relationship estimate indicated that, on average, a higher crime risk compared with a lower crime risk was associated with .22 more Hedges’ g units in effect size. The standardized regression coefficient (beta) was .27; in this bivariate analysis, this means that the correlation, r, between crime risk and effect size was .27. Thus, in these studies, drug abuse treatment programs that treat clients with higher risk for crime have better crime outcomes than those that treat lower-risk clients, which is consistent with findings by Andrews from studies of correctional programs.

We next investigated a meta-regression of effect size for crime that added the two moderators of methods quality (randomization and intent-to-treat analysis) and publication status along with crime risk. The unstandardized coefficient for crime risk remained about the same, but the introduction of those two moderator variables into the model resulted in a slightly wider confidence interval that includes the zero point: .22 (k = 48; 95 % CI: −.01, .45) (see Table 1).

Table 1 Meta-regression results of Andrews principles on crime outcomes

We examined statistical indicators of excessively influential studies. Plots of these indicators showed that all the studies were reasonably well clustered, with none showing clearly outlying values in the meta-regression.

Meta-regression analysis of the risk principle: drug use

Of the 243 studies with measures of illicit drug use outcomes, 231 reported adequate information to code a baseline measure of risk of drug relapse for the E group. Of those 231, 15 studies were coded as “lower risk” and 216 as “higher risk.”

For drug use outcomes, the bivariate unstandardized meta-regression coefficient for a unit increase in risk of drug use was −.08 (k = 231; 95 % CI: −.21, .06); the standardized (correlation) coefficient was −.06. In a meta-regression including the two moderator variables along with risk of drug use, the unstandardized coefficient was −.08 (k = 231; 95 % CI: −.22, .05) (see Table 2). Thus, higher risk of drug use relapse was, if anything, weakly and inversely associated with drug use outcomes in evaluations of drug abuse treatment programs.

Table 2 Meta-regression results of Andrews principles on drug use outcomes

Plots of the indicators of excessively influential studies showed that two studies appeared to be outliers with atypical values (rstudent > 3.5; DFFITS > 3.0; Cook’s distance > .10; covariance ratio < .95). We re-estimated the meta-regression with those two studies excluded. The unstandardized coefficient for drug risk was about −.002 (k = 229; 95 % CI: −.13, .13). Risk of drug relapse was virtually unrelated to drug use outcomes.

Needs principle

The hypothesis tested with respect to the needs principle was: Studies in which the provided services address criminogenic needs will have a larger average effect size than will studies in which the services did not address criminogenic needs.

Because most studies do not provide comprehensive measures of client needs or do not indicate whether clients’ needs (i.e., those beyond the specific focus of treatment) are addressed, the needs principle was operationalized in terms of the types of services that a program was described as providing. For Andrews, the main services provided in a program should be those that are related to recidivism (i.e., services that address criminogenic needs).

Determining whether a particular program adhered to the needs principle involved a two-step process. First, we coded each study in terms of which of 38 types of services were reported in the study description. Each service was coded 1 if it was reported in the study or 0 if it was not reported; this was done separately for the E group and the C group. Although these services are typically found in drug treatment programs, most of them do not meet Andrews’ criteria for services addressing criminogenic needs, that is, those behaviors and attitudes that are strongly associated with recidivism. Thus, in the second step, we identified, from the list of 38 services, 10 services that we judged to address the need for structure in the client’s life and to promote prosocial behaviors, attitudes, and values, reflecting the theoretical structure of Andrews’ criminogenic needs (see Appendix 3).

For each criminogenic needs-related service, the code for C was subtracted from the code for E. A difference of 0 indicated either that the service was present in both groups (E’s 1 minus C’s 1 = 0) or absent in both groups (E’s 0 minus C’s 0 = 0). A difference of 1 indicated that the service was provided in the E group but not in the C group.Footnote 9 These differences were summed to create a “needs” score indicating the degree to which a particular program or intervention delivered services in accordance with the needs principle. For example, one study had a sum of service differences equal to 3; in that study, the E group offered three services (conforming to the needs principle) not found in the C group. The needs score was Winsorized at 7 in order to reduce the effect of outliers. Thus, in accordance with the needs principle, studies with higher needs scores would be expected to have larger effect sizes than studies with lower scores, because the contrast in criminogenic needs-related services between the E group and the C group is greater.
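The two-step scoring described above reduces to simple arithmetic on presence/absence codes. A sketch in Python follows; the service indicators are hypothetical placeholders, not the Appendix 3 list, and the article’s handling of services present only in the C group (its Footnote 9) is not reproduced here.

```python
# Needs score: for each criminogenic needs-related service, subtract the
# C-group presence code (0/1) from the E-group code, sum the differences,
# and Winsorize the sum at 7 to limit the influence of outliers.
# The ten 0/1 indicators below are hypothetical, not the Appendix 3 list.

def needs_score(e_services, c_services, winsor_cap=7):
    """Sum of E-minus-C service codes, capped (Winsorized) at winsor_cap."""
    diffs = [e - c for e, c in zip(e_services, c_services)]
    return min(sum(diffs), winsor_cap)

# E group offers 4 of the coded services; C group offers 1 of the same set.
e = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
c = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(needs_score(e, c))  # 3: three needs-related services unique to E
```

A study whose E group offered all ten coded services and whose C group offered none would sum to 10 but be Winsorized down to 7.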

Meta-regression analysis of the needs principle: crime

For crime outcomes, the unstandardized meta-regression coefficient for the needs score as a predictor of outcome was .05 (k = 51; 95 % CI: −.02, .11); the standardized (correlation) coefficient was .17. As shown in Table 1, when including covariates for methods quality and publication status, the meta-regression coefficient was .04 (k = 51; 95 % CI: −.02, .10). Case diagnostics of excessively influential studies showed that all the studies were reasonably well clustered, with none showing clearly outlying values in the meta-regression. In this set of studies of drug abuse treatment programs, the number of services related to criminogenic needs had a small association with crime outcomes.
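A bivariate meta-regression of this kind regresses each study’s effect size on its needs score, weighting studies by their precision. The minimal fixed-effect sketch below, in Python, illustrates the computation with hypothetical data; the article’s models are more elaborate, including random effects and the moderator covariates.

```python
# Weighted least squares slope of effect size (g) on a moderator (x),
# with inverse-variance weights. All data are hypothetical illustrations.

def weighted_slope(xs, gs, vs):
    """Precision-weighted regression slope of g on x."""
    ws = [1.0 / v for v in vs]
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    gbar = sum(w * g for w, g in zip(ws, gs)) / sw
    num = sum(w * (x - xbar) * (g - gbar) for w, x, g in zip(ws, xs, gs))
    den = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    return num / den

xs = [0, 1, 2, 3, 5, 7]                     # needs scores (Winsorized at 7)
gs = [0.00, 0.04, 0.10, 0.15, 0.26, 0.34]   # hypothetical effect sizes
vs = [0.02] * 6                             # equal sampling variances
slope = weighted_slope(xs, gs, vs)
# slope = .05 in this toy example: each additional needs-related service
# in E relative to C predicts a .05 larger effect size.
```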

Meta-regression analysis of the needs principle: drug use

For drug use outcomes, the unstandardized meta-regression coefficient for the needs score as a predictor of outcome was .02 (k = 240; 95 % CI: −.02, .05); the standardized (correlation) coefficient was .05. As shown in Table 2, the unstandardized meta-regression coefficient with the two moderators included was .02 (k = 240; 95 % CI: −.02, .05). In conducting case diagnostics, two studies appeared to be outliers with highly influential values (rstudent > 3.5; DFFITS > 3.0; Cook’s distance > .10; covariance ratio < .95). These were the same two studies mentioned earlier. When we re-estimated the meta-regression with those studies excluded, the unstandardized coefficient for the needs score was about .01 (k = 238; 95 % CI: −.02, .05), indicating very little relationship between the number of services addressing criminogenic needs and drug use outcomes.

Responsivity principle

The hypothesis tested with respect to the responsivity principle was: Studies in which the treatment delivered was appropriate for (responsive to) the learning style of clients will have a larger average effect size than will studies in which the treatment was not (or less) appropriate. For Andrews, programs that use behavioral/cognitive-behavioral approaches with offender populations meet the responsivity principle, while programs that use other approaches do not.

In assessing the responsivity of the intervention used in the E and C groups, the coders assigned 0 if the comparison group received no treatment; 1 if the treatment was delivered exclusively or mainly by evocative, relationship-dependent, self-reflective, verbally interactive, and/or insight-oriented methods; 2 if the treatment was delivered by a (roughly equal) combination of evocative, relationship-dependent, self-reflective, verbally interactive, and insight-oriented methods and of behavioral, social learning, or cognitive-behavioral methods; and 3 if the treatment was delivered exclusively or mainly by behavioral, social learning, or cognitive-behavioral methods.

Table 3 shows the coded responsivity of the treatment provided to E group subjects and to C group subjects. The coding can be considered an ordinal scale in which the interventions rated 3 were most consistent with the responsivity principle, those rated 1 were least consistent, and those rated 2 fell in the middle. For a given study, the responsivity rating for the E group minus the responsivity rating for the C group is an approximate measure of the level of responsivity in the E group relative to that in the C group. For example, if the E and the C groups of a study are both coded 3 on responsivity, the responsivity difference is 0 and both groups are considered equally responsive; if E is coded 3 and C is coded 1, the responsivity difference in favor of E is 2.Footnote 10 The fact that there were twice as many C groups as E groups for which coders could not determine responsivity suggests that study authors tend to report more detailed information on the elements of the experimental condition than of the comparison condition.

Table 3 Coded responsivity of the treatment provided to E group and to C group subjects
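The responsivity contrast is simply the E-group code minus the C-group code on the 0–3 scale described above. A small Python sketch, with hypothetical study labels and codes:

```python
# Responsivity difference: E-group rating minus C-group rating on the
# 0-3 ordinal scale. Study labels and codes are hypothetical examples.

def responsivity_difference(e_code, c_code):
    """Positive values mean E received the more responsive treatment."""
    return e_code - c_code

studies = {
    "Study A": (3, 3),  # both exclusively behavioral/cognitive-behavioral
    "Study B": (3, 1),  # E behavioral, C insight-oriented
    "Study C": (2, 3),  # C more responsive than E
}
diffs = {name: responsivity_difference(e, c)
         for name, (e, c) in studies.items()}
# Study A: 0 (equally responsive); Study B: +2; Study C: -1
```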

Meta-regression analysis of the responsivity principle: crime

In this analysis, we treated the ordinal-level responsivity values (0, 1, 2, and 3) as rough estimates of an interval scale and conducted a meta-regression of the effect size for crime on differential responsivity in order to provide an approximate estimate of the effect of responsivity. As seen in Table 1, for crime outcomes, the unstandardized meta-regression coefficient for responsivity (i.e., the E group having been rated “1 unit” greater in responsivity than the C group) was .09 (k = 22; 95 % CI: −.02, .20), while the standardized (correlation) coefficient was .33. In this analysis, a 1-unit difference could reflect the E group having a responsivity code of 3 and the C group a code of 2, or the E group having a code of 2 and the C group a code of 1, and so on for the other combinations of responsivity. In a model including the moderator variables of methods quality and publication status, the coefficient was .08 (k = 22; 95 % CI: −.03, .20). No excessively influential studies were found. For this set of drug treatment evaluation studies, the magnitude of the relationship between responsivity and crime outcomes was medium (r = .33), consistent with Andrews’ findings for correctional treatment programs, although the confidence interval includes zero, owing to the relatively small number of studies.

Additional analysis of the responsivity principle: crime

We supplemented the meta-regression of responsivity with an ANOVA-analog meta-analysis in which each “group” corresponds to a difference in responsivity between the E group and the C group. The results, summarized in Table 4, support the finding of the meta-regression: increased responsivity in the E group relative to the C group is associated with a larger effect size, with effect sizes steadily increasing from −.07 to .22 as the level of responsivity increases.
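An ANOVA-analog meta-analysis pools effect sizes within each moderator category rather than fitting a slope across categories. The fixed-effect sketch below, in Python, illustrates the computation; the effect sizes and variances are hypothetical, constructed only to mirror the rising pattern reported in Table 4.

```python
# ANOVA-analog meta-analysis: inverse-variance weighted mean effect size
# within each moderator category. All (g, v) pairs are hypothetical.

def group_means(groups):
    """groups maps category -> list of (g, v); returns pooled g per category."""
    means = {}
    for cat, studies in groups.items():
        ws = [1.0 / v for _, v in studies]
        num = sum(w * g for w, (g, _) in zip(ws, studies))
        means[cat] = num / sum(ws)
    return means

# Hypothetical effect sizes (g) and sampling variances (v), keyed by the
# E-minus-C responsivity difference.
groups = {
    0: [(-0.10, 0.02), (-0.04, 0.03)],
    1: [(0.05, 0.02), (0.11, 0.02)],
    2: [(0.20, 0.03), (0.24, 0.02)],
}
means = group_means(groups)
# Pooled effect sizes rise with the responsivity contrast, mirroring the
# pattern (from -.07 up to .22) reported for crime outcomes in Table 4.
```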

Table 4 ANOVA-analog meta-analysis of Hedges’ g for crime by the difference in responsivity between the E group and the C group (k = 22)

Meta-regression analysis of the responsivity principle: drug use

For drug use outcomes, the unstandardized meta-regression coefficient for responsivity was .02 (k = 124; 95 % CI: −.03, .06); the standardized (correlation) coefficient was .05 (see Table 2). When including the two moderator covariates, the coefficient was .02 (k = 124; 95 % CI: −.03, .07). The analyses found no excessively influential studies. According to this analysis, responsivity, as operationalized above, does not appear to be a predictor of drug use outcomes within studies of drug abuse treatment programs.

Additional analysis for the responsivity principle: drug use

When effect sizes for drug use at each level of responsivity are calculated using ANOVA-analog meta-analysis, the increase in effect sizes was not linear, as it was for crime outcomes (see Table 4). Furthermore, in the one study in which responsivity was higher in the C group than in the E group, the effect was negative and large (−.48). Although too much should not be made of a single study, the C group was rated as being more responsive than the E group and it had better outcomes, which is what the responsivity principle would predict. Overall, it appears from the studies included in the analysis that the responsivity principle is less strongly supported for drug use outcomes than for crime outcomes.

The Andrews principles considered together

Andrews’ original statement of the risk principle indicated that “the effects of treatment typically are found to be greater among higher risk cases than among lower risk cases. This is expected unless the need and/or responsivity principles are violated” (Andrews et al. 1990b: 374). This implies that a strong test of the Andrews principles would include the three principles together in assessing whether programs are considered an “appropriate correctional service” (Andrews et al. 1990b: 379).Footnote 11 We conducted a meta-regression analysis, for crime outcomes and for drug use outcomes, in which studies were categorized by the degree to which they exhibited all or some of the three Andrews principles. The classification of studies by the number of principles present required both that each study have a rating for each of the three principles and that each study have an effect size for either drug use or crime, or both. These restrictions meant that the number of studies in each meta-regression was relatively small.

For crime outcomes, the unstandardized meta-regression coefficient for appropriate service was .18 (k = 12; 95 % CI: −.05, .41); the standardized (correlation) coefficient was .46. Since only 12 studies were available for this analysis, it is not surprising that the confidence interval for the coefficient includes zero, but the point estimate, a standardized (correlation) coefficient of .46, is large. As shown in Table 1, the analysis that included the two moderator variables had a meta-regression coefficient of .09 (k = 12; 95 % CI: −.17, .36). The case diagnostic analysis did not reveal any excessively influential studies.

For drug use outcomes (see Table 2), the unstandardized meta-regression coefficient for appropriate service was .06 (k = 59; 95 % CI: −.02, .13); the standardized (correlation) coefficient was .16. When including the moderator variables, the coefficient was .06 (k = 59; 95 % CI: −.02, .13). The case diagnostic analysis found no excessively influential studies.

For both outcomes, as would be expected from Andrews’ work, the greater the number of principles exhibited by a program, the larger the effect size. However, since in neither analysis was the coefficient statistically different from zero, this finding should be considered suggestive.

Discussion

Andrews and colleagues developed and validated their principles of effective correctional treatment through previous research and meta-analyses of correctional treatment programs (e.g., Andrews and Bonta 1998; Andrews et al. 1990b; Gendreau et al. 1996). Their outcome was crime or recidivism among offenders, and they did not look specifically at drug abuse treatment programs, at least in their early studies. We tested the Andrews principles using studies of drug abuse treatment programs, with both crime and drug use as outcomes. In other words, we sought to determine whether the Andrews principles regarding effective correctional treatment hold up for programs designed to treat clients with drug use disorders, many of whom also have been involved in the criminal justice system.

Within the full set of studies, drug treatment programs generally have a greater impact on drug use outcomes than on crime outcomes, which is consistent with previous meta-analyses of drug treatment (Marsch 1998; Prendergast et al. 2002). But when outcomes from evaluations of drug abuse treatment are examined in relation to the Andrews principles, the impact of the principles is greater on crime than on drug use.

In the bivariate meta-regressions for crime outcomes, the standardized coefficients (equivalent to r) are small to moderate: for risk, .27; for need, .17; for responsivity, .33; and for appropriate service (a combination of risk, need, and responsivity), .46. The relatively large effect size for appropriate service is consistent with what Andrews and colleagues found in correctional programs (Andrews et al. 1990b). These findings should not be surprising, since the principles were developed to increase the likelihood that correctional treatment programs would have a positive impact on recidivism. It appears that the principles also apply to drug treatment programs with respect to crime outcomes.

We anticipated that programs that served drug abuse clients and exhibited the principles of risk, needs, and/or responsivity would show effects for drug use that were similar to those for crime. This was not the case, however. In the bivariate meta-regressions for drug use outcomes, the standardized coefficients indicated a very small relationship: for risk, −.06; for needs, .05; and for responsivity, .05. However, for appropriate service, the standardized coefficient of .16 suggests that when examined together, the principles have a considerably larger effect on drug use outcomes than when examined separately. When considering all of the bivariate meta-regressions, it should be noted that, with one exception (crime outcomes by crime risk), the 95 % confidence intervals for the coefficients included zero, in some cases probably due to the small number of studies included in the analysis.

The inclusion of the two moderators (methods quality and publication status) did little to change the results. The one exception was the meta-regression for appropriate service with crime as the outcome (k = 12), where the addition of the moderators to the model reduced the unstandardized coefficient by half. Also, case diagnostics analysis indicated that influential data points had an effect on the results of only one of the analyses. In the meta-regression of drug use by drug use risk level, removing two studies with atypical values from the model reduced the effect to nearly zero.

Most of the clients in the studies included in the meta-analysis were rated as having a high risk of drug use relapse; E groups were generally provided more services that addressed criminogenic needs than were C groups; and the treatments evaluated generally fell within the recommendations of responsive treatment (i.e., mainly behavioral/cognitive behavioral). Yet these principles of effective correctional treatment had at best a weak impact on drug use outcomes when examined in a different type of program (focused on drug abuse) and with a different outcome (drug use) compared with the various Andrews meta-analyses.

The Andrews principles were developed in relation to programs intended for criminal offenders and were described in terms of the risks, needs, and responsivity considerations of offenders. Risk is defined with respect to the likelihood of committing crime. Services should address needs (traits, attitudes, and behaviors) that are at least moderately associated with recidivism. At the same time, needs that are only weakly associated with recidivism should receive little or no attention in programming. Responsive programs are those that follow social learning or cognitive-behavioral principles because, according to Andrews, they are most appropriate for the learning styles of offenders. Thus, the Andrews principles are a set of specific recommendations for how services for a particular population—criminal offenders—should be designed and delivered.

When examined in the context of programs that treat drug abuse clients (many of whom have been or are involved in the criminal justice system), however, it appears that the principles have a negligible relationship to drug use outcomes. As to why this might be the case, we can only speculate at this point. Drug dependence may have a different etiology and be associated with different psychological and social consequences than criminal behavior. Treatments to reduce drug use and treatments to reduce crime focus on different processes and mechanisms (even though the general orientation may be behavioral). Criminal behaviors do not possess the pharmacological addictive properties of drug abuse/dependence; hence, for drug dependence, the process of change is substantially more difficult and the probability of treatment success, as shown by the findings presented here, is lower. It may be that risk, needs, and responsivity do provide a sound foundation for establishing principles of effective treatment, but they may need to be adapted and refined by clinicians and researchers for persons with drug use disorders.Footnote 12 Perhaps the principles’ effect on crime outcomes within drug programs reduces drug use indirectly, by reducing the propensity to engage in criminal behavior, thus contributing to the overall success of drug programs. Testing this hypothesis would require a meta-analytic path analysis, which is beyond the scope of this study.

Limitations

In considering the findings from this meta-analysis, a number of limitations need to be considered. First, meta-analysts are sometimes criticized for combining a heterogeneous set of studies (the “apples and oranges” problem; Sharpe 1997). We attempted to address this problem by operationalizing the Andrews principles and selecting studies accordingly.

A second concern is publication bias, that is, the tendency of researchers to submit, and of journals to publish, articles that show statistically significant effects while being much less inclined to submit or publish findings that are not statistically significant (Begg 1994; Rosenthal 1979). The search strategy included identifying and retrieving both published and unpublished studies, and 13 % of coded studies were unpublished. Publication status was included in the moderator analyses.

Third, since there is usually concern about mixing studies of high and low methodological quality, we only coded studies that used an experimental or quasi-experimental design, and we included a measure of methods quality in the analyses.

Fourth, meta-analysis must rely on the nature of the extant literature for its data and findings. Since studies seldom experimentally test “principles of effective treatment,” the meta-analysis was dependent on the reported characteristics of the studies, and the conclusions should be considered correlational and not necessarily causal. We note one study (Thanner and Taxman 2003) that used a randomized design with participants stratified by risk level to test the effectiveness of a “seamless model” of probation compared with traditional probation among offenders with drug problems. Consistent with the risk principle, high-risk clients did better than moderate-risk clients on most crime and drug use outcomes, although the differences were not statistically significant.

Fifth, questions about the construct validity of our methods for classifying studies with respect to the three principles may limit the strength of our conclusions about the applicability of the Andrews principles within drug abuse treatment programs. The general limitation, of course, is that we could only use the information made available in the authors’ reports of the findings of their experiments. Regarding the risk for crime and the risk for drug use in the (aggregate) experimental group and control group, very few of the documents reported within-sample risk levels using validated risk assessment instruments. Consequently, we instructed our coders to use a broader set of indicators of risk (see Appendix 2). This approach is similar to that used in other meta-analyses of the Andrews principles (e.g., Dowden and Andrews 2000; Koehler et al. 2013), and, while it does involve coder judgment, it is unlikely to be improved upon until primary researchers include sample breakdowns by risk based on standardized measures.

As in other studies, our coding of needs was based on whether the program being evaluated offered services that addressed criminogenic needs, following guidelines for criminogenic needs provided by Andrews and colleagues (e.g., Andrews et al. 2006; see Appendix 3). There was some coder judgment involved in determining which of the services typically provided in drug abuse treatment programs matched the Andrews list of criminogenic needs and then in determining whether the treatment program in a given study should be coded as offering one or more of these services. Application of the coding scheme was enhanced by double coding of each study. Future work should extend the concept of dynamic needs in the Andrews theory to the topic of drug use relapse and determine whether programs that provide services addressing those needs have a larger impact on drug use following treatment than those that do not.

Finally, with respect to the responsivity principle, we followed Andrews and colleagues in coding programs that included behavioral or social learning principles as responsive. We assumed that treatment approaches that match the learning styles of offenders are the same as those that match the learning styles of drug abusers. Others may question this approach and decide that some other definition of responsivity is needed for drug abusers. If so, a different operational definition of responsivity might lead to different findings in a future meta-analysis.

Conclusion

The purpose of this study was to use meta-analysis to empirically assess whether the Andrews principles of effective correctional treatment were applicable in drug abuse treatment programs. The results are consistent with the principles of risk, needs, and responsivity with respect to crime outcomes, but they provide limited or no support for the principles when drug use was the outcome. With that in mind, policy makers and treatment providers interested in improving programs for drug abusers should understand that the largest impact for both drug use and crime is found among interventions incorporating all three of the principles. As for future research, at the study level, providing a stronger test of the Andrews principles in drug treatment programs would benefit from greater use of standardized risk instruments, reporting of outcomes by risk level, more explicit specification of dynamic needs related to drug use outcomes, and designs that involve assignment to different levels of the principles (e.g., high risk vs. low risk). Given that the principles were weakly related to drug use outcomes, researchers might revise the principles to be more appropriate to the characteristics and circumstances of drug abusers and then examine the revised principles within drug treatment programs. Finally, meta-analysis of treatment programs in terms of specific responsivity (e.g., gender, ethnicity, motivation) would be an important extension of our work.