From Disordered Eating Behavior to Eating Disorders

Eating-related health problems may result in serious eating disorders (ED), such as anorexia nervosa (AN), bulimia nervosa (BN), or binge eating disorder (BED), especially among children and adolescents (American Psychiatric Association 2000). ED are difficult to treat (Steinhausen 2002), and particularly AN is associated with a high mortality rate (>15 %; e.g., Zipfel et al. 2000). Accordingly, ED have received great attention from the public as well as the scientific community. Recent epidemiological data indicate that not only females are suffering from ED, but also an increasing number of males. Prevalence estimations for females/males are 0.9/0.3 % for AN, 1.5/0.5 % for BN, and 3.5/2 % for BED (Treasure et al. 2010). It is worth mentioning that more than 50 % of ED diagnoses are categorized as “ED not otherwise specified” (EDNOS; within the ICD-10, BED also belongs to this category). The highest incidence of ED occurs between the ages of 10 and 19 years. Especially among adolescents, several symptoms of ED are common, constituting a disordered eating behavior. Of American girls and boys, 14.3 and 7.1 %, respectively, aged 9 to 14 years reported problematic eating behavior (Treasure et al. 2010).

Although it remains unclear which of the high-risk children will develop a clinically relevant ED, several screening studies have shown a strong relationship between risky eating behavior and ED. Mintz and O’Halloran (2000) found a 90 % concordance between self-reported eating problems measured with the Eating Attitudes Test (EAT-26; Garner and Garfinkel 1979) and the DSM-IV criteria for ED. In a nationwide ED screening survey in US high schools in 2000 (Austin et al. 2008), 15 % of the 3,252 participating girls and 4 % of the 2,315 boys, with a mean age of 15.9 years, showed risky eating behavior (i.e., scored 20 or more points on the EAT-26). These rates clearly exceed reported ED prevalence rates, since many of the girls and boys with conspicuous ED symptoms are not in psychological treatment.

Facing the epidemiological facts, several research teams have developed programs to prevent the onset and sequelae of ED. Most of these programs are implemented in a school setting (Levine and Smolak 2006; for a meta-analytical review, see Stice et al. 2007).

How to Prevent Eating Disorders?

Given the large number of ED prevention programs developed in the past two decades (81 of them were represented in the meta-analysis of Stice et al. 2007), one may ask: Why didn’t we use one of the already successfully evaluated programs instead of developing and evaluating a new one? Our first reason was the lack of transferability of the existing programs to the structural conditions of the German school system, which not only differs from those of other countries but also varies within Germany, since each of the 16 German states has its own school system. Therefore, in order to implement a program in the school setting in Thuringia, we had to cooperate with the responsible political organization, the Thuringian Ministry of Education (now TMBWK). Unlike in other countries, the health ministries are not responsible for the health situation in schools. The TMBWK insisted that we consider program implementation from a practical point of view rather than from a scientific or general health standpoint. This implied the following compromises:

  • The idea of preventing ED in schools should not focus on a single project such as prevention of AN in 15-year-old girls, but on developing cornerstones for a comprehensive program of prevention and health promotion in the realm of ED and related problems, such as overweight and obesity, for girls and boys up to the age of 12 years (Haines and Neumark-Sztainer 2006).

  • To secure sustainability and to strengthen empowerment of all stakeholders (politicians, parents, teachers, children, and scientific community), teachers rather than external experts had to implement Torera (this promotes the organizational structure of the school, but contradicts the recommendation of Stice et al. 2007). That is, not only the teaching manuals but also the workbooks for teachers and students had to be created, since they were not available from existing programs. Nevertheless, if we consider every school to be a self-responsible organization, teachers should be free to deviate from the manual and fit the program to the organizational structures of their schools (we checked the adherence in the final evaluation).

  • In order to not discourage interested schools, we had to refrain from a randomized group allocation.

In 2003, we developed a program called “Primary prevention of anorexia nervosa in preadolescent girls” (“PriMa”) as a first module of a prevention package for (pre)adolescents focusing on eating-related problems (Wick et al. 2011). The PriMa program was designed for sixth grade girls only, since they have a three times higher risk for AN than boys and since AN incidence peaks at the age of 15 years (Treasure et al. 2010). Although boys are not as much affected by ED as girls, they also are concerned with body-related issues. Instead of developing an ideal of thinness, boys often are doubtful if they will gain enough muscle to be attractive for their (female) peers (Cash and Smolak 2011). To provide a healthy way to enhance physical fitness for boys, we developed the “Teenage Obesity Prevention Program” (TOPP) for sixth grade boys (see Berger et al. 2011), which was implemented in schools parallel to the PriMa program. Our evaluation studies clearly highlight the advantages of these gender-specific programs (Berger et al. 2008).

Nevertheless, it is obvious that some aspects of the etiology of ED are closely related to experiences with the opposite sex. Teasing, weight bullying, and nasty remarks about one’s appearance from peers are well-known reasons for accelerating a vicious circle of gaining weight by withdrawal, frustration, and subsequent comfort eating. Swinburn and Egger (2004) called this circle the “runaway weight gain train” because it seems impossible to stop this development for an increasing number of children. In this state, girls are especially willing to explore every method proposed to get rid of the additional pounds, such as self-induced vomiting following a binge eating attack (vomiting to control weight in the past 3 months was reported by 12.2 % of the girls and 3.9 % of the boys in the study of Austin et al. 2008).

Encouraged by the positive feedback following our sixth grader programs and their satisfying outcomes (Berger et al. 2011; Wick et al. 2011), with Torera we created an additional program for seventh grade boys and girls. The central idea of this program was to focus on binge eating attacks as an extreme expression of losing control over one’s eating behavior. Such attacks are central, both in BN and BED, and were not addressed in our former programs. Since the Greek origin of the term bulimia relates to “hunger of a bull,” we decided to name this program “Torera.” From a clinical point of view, the name for this kind of ED is misleading because the pivotal diagnostic criterion is not an overwhelming hunger, but regularly uncontrollable binge attacks with subsequent purging, e.g., self-induced vomiting (American Psychiatric Association 2000). For preventive purposes, Torera was supposed to serve as a symbol for the fight against the threat of ED in general (for an overview and short description of the program, see Table 1).

Table 1 Overview of the nine Torera lessons

Following the primary preventive approach (Caplan 1964), Torera aims to decrease ED risk factors such as weight concerns, negative body image, dieting, and low social support (Jacobi et al. 2004) as well as to strengthen protective factors such as body self-esteem, knowledge about ED, healthy eating, and competence in using media. As derived from theoretical models describing how health behavior could be changed, we assumed body self-esteem as a mediator for changing disordered eating behavior (for more details, see the “Intervention” section).

Method

Study Design and Sample

Schools from Thuringia, Germany, that had participated in PriMa and TOPP 1 year before were invited to participate in the Torera program (for the sample flow, see Fig. 1). The assignment to the intervention group (IG) was based on self-selection: The IG consisted of 10 schools that had agreed to participate in the Torera program (including teacher trainings, questionnaires, and a telephone interview), whereas 12 schools that declined to participate in Torera served as controls. Because of the expressed reservations of the Ministry of Education against randomized controlled trials, we had to apply a quasi-experimental pre–post design (pre = baseline measurement at sixth grade; post = measurement after the Torera intervention at seventh grade) with two control groups. One control group consisted of schools which had run the PriMa program and the TOPP program, respectively, 1 year before, but did not want to participate in Torera (pretreated control group = CG2). These schools had therefore not refused ED prevention in general, since they had taken part in the sixth grade programs. The other control group consisted of schools that did not want to take part in any of our programs, but agreed to fill out the questionnaires at the two measurement points (untreated control group = CG1). A positive impact of the Torera intervention in both the untreated control group design and the pretreated control group design could be seen as an argument against a possible selection bias. Before starting the program, we obtained informed consent from the parents and the participating students. The study was approved by the ethics committee of the Jena University Hospital (#1655-11/05).

Fig. 1

Study flow diagram. PriMa program for the primary prevention of anorexia nervosa for sixth grade girls (see Wick et al. 2011), TOPP teenage obesity prevention program for sixth grade boys (see Berger et al. 2011), CG1 untreated control group, CG2 control group with participation in PriMa or TOPP, IG intervention group with participation in PriMa or TOPP at sixth grade and participation in Torera at seventh grade, t 1 baseline measurement, t 2 post intervention measurement

Since Torera was integrated into the regular curriculum, participation was obligatory for all students. The sample consisted of 533 girls and boys with a mean age of 13.1 years at the second measurement point, representing 24 % of the entire potentially eligible sample. Seven percent had to be excluded due to missing data or for not meeting the age range for typical sixth graders at baseline. One hundred eighty-eight students participated in the intervention study, while 345 served as controls (for details, see Fig. 1). The first measurement point (t 1) was between March 2007 and January 2008, while the second measurement point (t 2) was between February 2008 and July 2009. The interval between t 1 and t 2 ranged from 189 to 616 days (394 days on average).

Intervention

The Torera program started with a half-day training session for the project teachers. All of them were experienced with the application of the PriMa or TOPP program from 1 year before. After the training session, the teachers were able to independently conduct the lessons with the help of a 76-page teaching manual and special workbooks. The Torera intervention consists of nine 90-min sessions. The temporal sequence of the program and its basic contents are shown in Table 1.

The concept of Torera follows a multitheoretical approach. First, Torera addresses the Health Action Process Approach (HAPA; Schwarzer 1992) which describes how to transform risky behavior into healthy behavior following a time line from a motivational phase to an action phase. The HAPA model integrates the empirically most valid components of the Theory of Planned Behavior (Ajzen 1991) and the Health Belief Model (Rosenstock et al. 1988) such as risk perception, outcome expectancies, barriers, and resources of the behavioral change process. The central variable of the HAPA model is self-efficacy (see also Bandura 1977), influencing most other variables such as intention, action planning, and action control. To be self-efficient, a high self-esteem is necessary. In our case, body-related parts of the self-esteem are especially relevant for a change of risky eating behavior. Table 1 shows several examples of program elements which are supposed to strengthen the body self-esteem (e.g., practicing healthy eating and physical activity with concrete instructions for homework and positive feedback). Furthermore, applying the Stages of Change Model (Prochaska et al. 1992), Torera contains systematic information about eating-related issues and step-by-step exercises for behavioral change. To extend the provision of theoretical knowledge, several role plays were integrated into the program. Based on social learning theories as well as neuropsychological findings related to so-called mirror neurons (Rizzolatti and Craighero 2004), we considered role plays to be efficient strategies to try out learned behavior in a test mode without the risk of being teased or bullied. When working with children at school, it is essential to use methods that fit not only to the children’s developmental state but also to the teaching culture in general. 
Therefore, we derived many elements of Torera including the didactic methods from teacher interviews (e.g., presenting information to others in lessons 1–3). Other topics were inspired by patient reports (e.g., aggression against oneself and others in lesson 8). Every lesson addresses typical topics, such as dieting, attempting to reduce risk factors, and strengthening protective factors (Table 1). As in the programs PriMa and TOPP, a central method used in the Torera program is dissonance induction (see, e.g., Stice et al. 2003). This method is based on the theory of cognitive dissonance (Heider 1958) and assumes that, if one has two contradicting cognitions, he or she feels a strong need to resolve the dissonance by acquiring additional information. Three lessons begin with a specially designed poster showing an ED-related situation. For example, the poster from lesson 4 depicts a smiling Barbie-like doll offering a view inside the doll’s brain with pictures of different foods dominating all other thoughts like meeting friends, leisure activities, or romantic fantasies. Next to the doll's face is a quote from an ED patient describing suffering from omnipresent weight and eating concerns. The contradiction between the smiling face and inner suffering should provoke dissonant cognitions, which in turn should be resolved in a subsequent group discussion. Although the manual contains some guiding questions for the discussion, teachers are asked to refrain from trying to resolve the cognitive dissonance and not to provoke reactance as described in the reactance theory (Brehm 1966).

Sample Size

After stratification for gender and risk group, our smallest unit of analysis contained 91 individuals (IG compared to CG2 on SCOFF, risk group only; see Table 3), while our largest unit had 399 subjects (IG compared to CG1 on EAT-26D; see Table 3). Hence, a medium effect size of d = 0.5 by Cohen’s convention (1988) could be detected with a power of 65 % (1 − beta error) and 99 %, respectively (alpha error = significance level of p < 0.05, two-tailed). The power computation was carried out using G*Power 3.0 (Faul et al. 2007). Hierarchical effects could be ruled out because the intraclass correlation coefficients (ICC) at baseline were small: 0.024 for the SCOFF measurement, 0.003 for the EAT measurement, and 0.014 for the body self-esteem measurement. Raudenbush and Bryk (2002) recommended an ICC of at least 0.10 for the computation of hierarchical analyses.
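The reported power values can be roughly cross-checked with a normal approximation to the two-sample test. The following is only a sketch and not the G*Power computation itself; the function name and the near-even split of each analysis unit into two groups are assumptions made for illustration (the actual cell sizes are given in Table 3).

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n1, n2, alpha=0.05):
    """Normal-approximation power of a two-tailed two-sample test for effect size d."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    noncentrality = d * sqrt(n1 * n2 / (n1 + n2))
    # Probability of exceeding the critical value in the expected direction;
    # the negligible contribution of the opposite tail is ignored.
    return z.cdf(noncentrality - z_crit)


# Group splits below are ASSUMED (near-even), not taken from the study.
power_smallest = approx_power(0.5, 45, 46)    # smallest unit, 91 subjects
power_largest = approx_power(0.5, 199, 200)   # largest unit, 399 subjects
print(round(power_smallest, 2), round(power_largest, 2))
```

Under these assumptions, the approximation gives a power of roughly two thirds for the smallest unit and close to 1 for the largest, which is in line with the reported 65 % and 99 %. Unequal group sizes would lower both values somewhat.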

Measures

The entire questionnaire comprised 50 items including demographic information (grade, age, height, and weight) and the measures of the primary outcomes:

  • Body self-esteem was measured using a subscale from the German Body Experience Questionnaire (FBeK; Strauss and Richter-Appelt 1996). It contains 15 items and asks about negative and aggressive reactions towards one’s own body and the feeling of attractiveness and identification with one’s own body (response categories “agree” or “disagree,” scale range 0–15, internal consistency in the present study Cronbach’s α = 0.90).

  • Eating behavior was assessed using two questionnaires, the SCOFF test and the short version of the German version of the EAT (Tuschen-Caffier et al. 2005; original EAT: Garner and Garfinkel 1979).

The SCOFF test was developed by Morgan et al. (1999) containing five yes–no questions with a scale range of 0–5:

  • Do you make yourself Sick because you feel uncomfortably full?

  • Do you worry you have lost Control over how much you eat?

  • Have you recently lost more than One stone in a 3-month period?

  • Do you believe yourself to be Fat when others say you are too thin?

  • Would you say that Food dominates your life?

To calculate the test score, every “yes” answer is counted: “A score of ≥2 indicates a likely case of anorexia nervosa or bulimia” (Morgan et al. 1999, p. 1467). Because the scale contains only five items, the internal consistency is rather low (in the present study, Cronbach’s α = 0.43).
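The SCOFF scoring rule can be sketched in a few lines. The item keys, the function names, and the handling of missing answers (counted as “no”) are assumptions for illustration and are not part of the original instrument.

```python
# Hypothetical keys for the five SCOFF items (Sick, Control, One stone, Fat, Food).
SCOFF_ITEMS = ("sick", "control", "one_stone", "fat", "food")


def scoff_score(answers):
    """Count the 'yes' answers (True) across the five SCOFF items.

    Missing items are counted as 'no'; this is an assumption of the sketch.
    """
    return sum(1 for item in SCOFF_ITEMS if answers.get(item, False))


def scoff_positive(answers, cutoff=2):
    """Apply the cutoff of two or more 'yes' answers (Morgan et al. 1999)."""
    return scoff_score(answers) >= cutoff


example = {"sick": False, "control": True, "one_stone": False, "fat": True, "food": False}
print(scoff_score(example), scoff_positive(example))
```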

The 26 items of the EAT describe conspicuous to pathological eating behavior. The items are answered on a six-point scale. The summation of the items was performed according to the manual of the EAT with 0 points for the response categories “never,” “rarely,” and “sometimes” and 1 to 3 points for the response categories “often,” “very often,” and “always” (scale range 0–78; internal consistency in the present study, Cronbach’s α = 0.86). According to Garner et al. (1982), an EAT score <10 indicates asymptomatic, scores between 10 and 19 indicate moderate, and EAT scores ≥20 indicate a high degree of eating symptomatology (cf. also Buddeberg-Fischer and Reed 2001).
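The EAT scoring and the three symptom categories can be illustrated as follows. This is a minimal sketch: the English response labels, the assignment of 1, 2, and 3 points to “often,” “very often,” and “always” in that order, and the function names are assumptions based on the description above, and any reverse-keyed items are not handled.

```python
# Points per response category, as read from the scoring description above.
EAT_POINTS = {
    "never": 0, "rarely": 0, "sometimes": 0,
    "often": 1, "very often": 2, "always": 3,
}


def eat_score(responses):
    """Sum the item points over all answered items (26 items give a range of 0-78)."""
    return sum(EAT_POINTS[response] for response in responses)


def eat_category(score):
    """Classify a sum score using the cutoffs of 10 and 20 points."""
    if score >= 20:
        return "high"
    if score >= 10:
        return "moderate"
    return "asymptomatic"


# Hypothetical answer sheet with 26 responses.
sheet = ["never"] * 20 + ["often"] * 4 + ["always"] * 2
total = eat_score(sheet)
print(total, eat_category(total))
```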

To determine the internal validity, the scales were correlated with each other as well as with the self-reported body mass index (BMI in kilograms per square meter). These correlations were as follows: Eating behavior was negatively correlated with body self-esteem (SCOFF: r = −0.56, p < 0.01; EAT: r = −0.49, p < 0.01). BMI was positively correlated with eating behavior (SCOFF: r = 0.27, p < 0.01; EAT: r = 0.23, p < 0.01) and negatively with body self-esteem (r = −0.47, p < 0.01). Together, the reliability and internal validity of the measures were satisfactory to good.

Statistical Analyses

All calculations were performed using SPSS (version 17.0). To test the program’s effectiveness, we conducted analyses of covariance (ANCOVA) to control for potential baseline differences with “group” (IG vs. CG1, IG vs. CG2) as independent factor, the baseline measurement (t 1) as covariate, and the post measurement (t 2) as dependent variable (Table 3). Comparability with other studies was secured by reporting raw means in Table 3 instead of the adjusted means of the ANCOVA. To estimate the relevance of the results, the statistical effect size d (mean difference divided by pooled standard deviation) was calculated. Subgroup analyses were stratified for risk group and gender. Based on the results from former studies, we expected differential effects on these factors and, therefore, did not simply treat them as covariates. Mediation effects were estimated following the steps of Baron and Kenny (1986). Effect sizes were reported as suggested by D.A. Kenny on http://davidakenny.net/cm/mediate.htm. The indirect effect of body self-esteem (FBeK) on eating behavior (EAT) was tested with the Sobel test using the interactive web page by K.J. Preacher (http://quantpsy.org/sobel/sobel.htm; see also Preacher and Hayes 2004). Implementation evaluation using confidence intervals of the means of differences between t 1 and t 2 was carried out to investigate the effects of adherence and mode of application (Table 4). Additionally, the effectiveness was estimated based on the chi-square analysis of the percentage of subjects who showed a risk group shift from t 1 to t 2 (Table 5).
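The effect size d named above (mean difference divided by pooled standard deviation) can be computed as in the following sketch. The function name and the two score lists are hypothetical; whether the original analysis pooled the standard deviations of post scores or of change scores is not stated here, so the sketch simply pools the sample variances of two groups.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled standard deviation of two groups."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical body self-esteem scores, for illustration only.
intervention = [12, 10, 11, 13, 9]
control = [9, 8, 10, 11, 7]
d = cohens_d(intervention, control)
print(round(d, 2))
```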

Results

Demographic characteristics of the participants are summarized in Table 2. Confidence intervals indicate no significant baseline differences with respect to age and BMI. Table 3 lists the means of all primary outcomes from the first to second measurement point and the results of the ANCOVAs.

Table 2 Sample characteristics at baseline (Thuringia, Germany, March 2007–January 2008)
Table 3 Means (M denotes raw means, not adjusted by the ANCOVA), standard error of means (SE) and number of subjects (n denotes smallest valid cell count within subgroups) for baseline measures (t 1 denotes the period March 2007–January 2008) and post intervention measures (t 2 denotes the period February 2008–July 2009) for treatment groups, stratified for gender and risk group with an analysis of the covariance results and effect sizes

Significant differences for body self-esteem occurred in both IG vs. CG1 and IG vs. CG2. Body self-esteem was higher at post measurement than at pre measurement for the IG (+1.02 points) but not for CG1 (+0.01 points) or CG2 (−0.05 points).

Changes in eating behavior were significantly related to program participation, as seen both by the SCOFF and the EAT score. Students belonging to the IG showed a greater decrease in risky eating behavior (SCOFF, −0.31 points; EAT, −1.98 points) from the first to the second measurement than the students belonging to CG1 (SCOFF, −0.17 points; EAT, −0.55 points) or CG2 (SCOFF, −0.18 points; EAT, −0.68 points).

Subgroup Analyses

As shown in Table 3, risk group members (EAT ≥ 10 at baseline) showed significant improvements in the eating behavior measures within the IG compared to CG1. For the nonrisk group (EAT < 10), all measures for the IG vs. CG2 revealed significant effects. For IG vs. CG1, this was true for the FBeK measure. Torera participants with an initial risk improved their body self-esteem by 1.49 points and their eating behavior by 0.73 SCOFF points and 6.97 EAT points.

The gender stratification showed that the boys improved their eating behavior only on EAT compared to CG2, whereas the girls received more benefit from their program participation according to almost all measures (except for SCOFF IG vs. CG2). Girls improved by 1.42 points on body self-esteem, 0.28 points on SCOFF, and 3.36 points on EAT. A direct comparison between girls’ and boys’ baseline values using confidence intervals reflects significant differences on all measures: the boys had higher scores on body self-esteem (M = 11.28 [10.86, 11.70]) and a less conspicuous eating behavior (SCOFF: M = 0.74 [0.63, 0.85], EAT: M = 6.08 [5.44, 6.72]) compared to the girls (FBeK: M = 8.90 [8.40, 9.40], SCOFF: M = 1.29 [1.15, 1.43], EAT: M = 10.18 [8.96, 11.40]).

Mediation Analysis

As described in the “Intervention” section, self-esteem was considered to possibly mediate the effect of the intervention on eating behavior. We tested this hypothesis by conducting the Baron and Kenny steps in three regression analyses to achieve the coefficients of the mediation path diagram depicted in Fig. 2.

Fig. 2

Mediation analysis. a raw (unstandardized) regression coefficient for the association between IV and mediator, b raw coefficient for the association between the mediator and the DV including IV as a predictor of the DV, c correlation between IV and DV with mediator, controlled for covariates, c′ correlation between IV and DV without mediator, controlled for covariates, CG1 untreated control group, IG intervention group, t 1 baseline measurement, t 2 post intervention measurement. *p < 0.05, values are standardized regression coefficients (beta) for girls (n = 200) and boys (n = 202)

Because girls have a higher risk for ED, we tested the mediation separately for both sexes. The treatment (IG vs. CG1) served as the independent variable, the post intervention measure of the FBeK as the mediator, and the post intervention measure of the EAT as the dependent variable; the type of school and the baseline measures of FBeK and EAT were controlled for. The analysis revealed a significant mediation effect for girls only (for the coefficients, see Fig. 2). For girls, the t values of coefficients a and b were 3.415 and 4.116, yielding a Sobel test statistic of 2.628, p = 0.009, with an effect size of a × b = 0.06 (convention: small = 0.01, medium = 0.09, large = 0.25). For boys, the t values of coefficients a and b were 0.925 and 4.982, yielding a Sobel test statistic of 0.909, p = 0.363.
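The Sobel statistics reported above can be reproduced from the two t values given for each sex. The sketch below uses the t-value form of the Sobel test, z = t_a · t_b / √(t_a² + t_b²), which is algebraically equivalent to a·b / √(b²·s_a² + a²·s_b²); the function name is hypothetical, and a two-tailed normal p value is assumed.

```python
from math import sqrt
from statistics import NormalDist


def sobel_from_t(t_a, t_b):
    """Sobel z and two-tailed p value from the t values of paths a and b."""
    z = (t_a * t_b) / sqrt(t_a ** 2 + t_b ** 2)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


z_girls, p_girls = sobel_from_t(3.415, 4.116)
z_boys, p_boys = sobel_from_t(0.925, 4.982)
print(round(z_girls, 3), round(p_girls, 3))
print(round(z_boys, 3), round(p_boys, 3))
```

With these inputs, the sketch yields values that agree with the reported statistics (about 2.63 with p ≈ 0.009 for girls and about 0.91 with p ≈ 0.36 for boys).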

Implementation Evaluation

Following the intervention, we asked the teachers (all female) about their experiences with the Torera program in a semistructured telephone interview. Specifically, we wanted to know how they implemented the program (within normal school lessons or as a special project workshop = application mode) and how strictly they adhered to the teachers’ guiding manual (= adherence). As Table 4 shows, the application mode had no influence: both conditions, implementation as a regular lesson and implementation as a workshop, yielded significant differences between baseline and post measures of all dependent variables (all differences indicating improvements). Questions related to adherence (following the manual or not) showed that eight out of ten teachers decided not to follow the manual accurately. More precisely, four teachers shortened the first three lessons because of a lack of time for applying the entire program within normal school lessons. Although stricter adherence in the adherence group revealed higher effects, statistical tests of the interaction between adherence and repeated measure (t 1 to t 2) showed no significant effects (FBeK: F (1, 167) = 3.11, p = 0.080; SCOFF: F (1, 160) = 0.20, p = 0.652; EAT: F (1, 171) = 0.46, p = 0.501).

Table 4 Evaluation of implementation process: does adherence (i.e., following manual) and application mode matter?

Program Efficiency

As it is known from other ED prevention programs, sometimes they do “more harm than good” (Carter et al. 1997). To rule out potentially damaging side effects, we analyzed up and down risk shifts within subjects. A risk shift was defined as a change from the risk group to the nonrisk group (downward shift = positive effect = “good”), i.e., from 10 or more EAT points at baseline to <10 points at post measurement, and vice versa, from the nonrisk group to the risk group (upward shift = negative effect = “harm”), i.e., from <10 EAT points at baseline to 10 or more points at post measurement (see Table 5).
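The risk shift definition can be expressed as a small classification rule. The sketch below is illustrative only; the function name and the labels are assumptions, while the cutoff of 10 EAT points follows the definition above.

```python
def risk_shift(eat_t1, eat_t2, cutoff=10):
    """Classify the change in risk group membership between baseline and post measurement."""
    at_risk_t1 = eat_t1 >= cutoff
    at_risk_t2 = eat_t2 >= cutoff
    if at_risk_t1 and not at_risk_t2:
        return "downward"  # risk group -> nonrisk group: positive effect ("good")
    if not at_risk_t1 and at_risk_t2:
        return "upward"    # nonrisk group -> risk group: negative effect ("harm")
    return "none"          # group membership unchanged


# Hypothetical (baseline, post) EAT scores of three students.
for t1_score, t2_score in [(14, 6), (4, 12), (11, 15)]:
    print(t1_score, t2_score, risk_shift(t1_score, t2_score))
```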

Table 5 Risk shifts at baseline and post measurement, based on individual EAT points

The risk shift analysis (Table 5) shows a maximum of positive shifts in the IG (19.6 %) compared to CG1 (11.8 %) and CG2 (10.8 %). In the IG, there were significantly more positive shifts than negative shifts (χ 2 = 16.95, p < 0.001). An absolute risk reduction (ARR) of 7.8 % could be calculated, which resulted in a number needed to treat (NNT) of 13. In the IG, the negative shifts were lower than in the two control groups. Therefore, the chance of damaging side effects from the intervention could be excluded. In other words, participating in the Torera program leads to more positive effects and less harm than participating in either only PriMa/TOPP or only filling out questionnaires. The costs for Torera are analogous to the costs of the other two programs, PriMa and TOPP (€2.50 per student in the long run; Berger et al. 2011).
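The ARR and NNT can be retraced from the percentages of positive shifts. The sketch below uses the IG and CG1 rates, which reproduce the reported 7.8 %; the function name is hypothetical, and rounding the NNT up to the next whole person is the usual convention assumed here.

```python
from math import ceil


def arr_and_nnt(rate_intervention, rate_control):
    """Absolute difference in positive-shift rates and the resulting number needed to treat."""
    arr = rate_intervention - rate_control
    # NNT is the reciprocal of the ARR, rounded up to whole persons.
    return arr, ceil(1 / arr)


arr, nnt = arr_and_nnt(0.196, 0.118)  # positive shifts: IG 19.6 %, CG1 11.8 %
print(round(arr, 3), nnt)
```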

Discussion

This quasi-experimental effectiveness study with pre–post assessments aimed to verify the effects of the Torera program addressing the risk factors for ED among preadolescent seventh grade boys and girls with an average age of 13.1 years. In line with our hypotheses, we found significant intervention effects on body self-esteem and eating behavior compared to both a pretreated and an untreated control group (CG). Effect sizes were small according to Cohen’s convention (1988). However, stratifying for risk status at baseline and gender revealed a more detailed picture: Subjects from the IG with 10 or more EAT points at baseline improved their eating behavior significantly with at least medium effect sizes. In the analyses stratified by gender, only the girls improved their self-esteem and lowered risky eating behavior with small to medium effect sizes, whereas boys were almost unaffected by the intervention apart from a lowered risky eating behavior compared to the pretreated CG (measured with the EAT). A floor effect or a ceiling effect could explain this result, since the boys had significantly lower values for eating behavior and higher values for body self-esteem at baseline compared to the girls (Table 3). In other words, the potential for changes is much greater in girls than in boys, and girls are generally at higher risk for ED.

To examine how the intervention works in terms of causal relations between the measured variables, a mediation analysis was conducted. Following the HAPA of Schwarzer (1992), self-efficacy and self-esteem, respectively, could be proposed as a mediator variable when trying to change unhealthy behavior. For girls, our analysis confirmed this hypothesis. Therefore, in line with other research on ED prevention programs (see Stice et al. 2007), our results show that increasing self-esteem changes eating behavior. For boys, there is also a strong relationship between body self-esteem and eating behavior, but the mediation effect was nonsignificant. A possible explanation could be the generally more positive body self-esteem of boys compared to girls.

In addition to the effectiveness, we checked some aspects of the program’s efficiency, which are known to be more relevant for the stakeholders’ acceptance and broader dissemination. First, we addressed the question of whether adherence to the manual or application mode (normal school lessons vs. workshop) was of influence. We found that both application modes worked. Eight out of the ten project teachers did not follow the manual strictly. At first glance, they reached better results than those teachers who followed the manual accurately. However, a closer examination of the effects showed that the two conditions differed significantly only for the body self-esteem measure. Another analysis related to potential damaging side effects of the Torera program. As known from previous experiences in the field, trying to prevent unhealthy behavior may sometimes lead to adverse effects, especially in adolescents. This was recently described by the authors of a “Youth Development Programme” in England, where teenage pregnancies and other unwanted behavior increased dramatically in the group of program participants (Wiggins et al. 2009). Several studies also reported such side effects for ED prevention (e.g., Carter et al. 1997). In our case, the analysis of downward and upward risk shifts within subjects showed more positive effects and less harm, yielding a maximum ARR of 7.8 %. Accordingly, 13 students would have to take part in Torera for one of them to benefit from the intervention (= NNT).

Study Limitations and Strengths

As Fig. 1 reveals, only 24 % of the baseline sample agreed to participate in Torera or to serve as controls at the second assessment. From a practical point of view, this participation rate seems to be encouraging because we did not offer any incentives. But in methodological terms, this selection of the sample could have biased the results. Since our project partner, the Thuringian Ministry of Education, “sacrificed” randomized group assignments (see Berger et al. 2008), the motivation to participate could have varied between IG and CG schools. To get some information about the possible influence of self-selection, we conducted two analyses, one with an untreated CG and one with a pretreated CG. “Pretreated” means that the schools were already engaged in ED prevention 1 year before, whereas “untreated” means that the schools had never engaged in ED prevention. The fact that the results were nearly the same under both conditions can serve as a hint that selection bias was only minor. Additionally, Torera was implemented within the regular curriculum, so that participation was independent of the individuals’ motivation. Furthermore, all major analyses controlled for baseline differences between the groups.

Further methodological limitations of our study relate to the separate analyses of the “mediators of moderation effects” because our sample size did not allow for simultaneous testing with sufficient statistical power (for details, see Fairchild and MacKinnon 2009). Following the main ANCOVA, we conducted two ANCOVAs to test whether gender (moderator 1) and risk status (moderator 2) moderated the effect of our program. In addition, we simply compared the outcome means for the two adherence conditions (moderator 3) and the two application modes (moderator 4). Our analysis, therefore, provides no information about possible interactions of different program contents.

Compared to other programs (cf. Stice et al. 2007), our effect sizes seem to be lower than expected. As far as we know, other programs have rarely been widely disseminated, i.e., they were usually evaluated under the ideal conditions of pilot studies. Although the previously mentioned limitations of our study might not be satisfying, the Torera implementation process represents the real-world situation as defined by the Society for Prevention Research (SPR; Wick et al. 2011). Following the recommendations of the SPR (Flay et al. 2005), we first tested the program under ideal conditions (level 1 “efficacy,” within an unpublished master’s thesis by Gerhard 2006) and improved it before implementing the Torera program under real-world conditions as described in this article (level 2 “effectiveness”). Simultaneously, we established organizational structures for program acquisition (level 3 “broad dissemination”; see Berger et al. 2011). Furthermore, many programs in the meta-analysis of Stice et al. (2007) were carried out by external professionals, who were only temporarily available. Moreover, they often targeted older girls (15 years old) or preselected high-risk groups. These procedures may indeed increase effect sizes, as the authors stated, but they do not secure sustainability or meet the stakeholders’ interests. In medical terms, they ignore the epidemiological fact that the incidence of AN peaks at age 15 (Hoek and van Hoeken 2003). Hence, ED prevention has to start earlier. As we learned from our project partner, the Thuringian Ministry of Education, as well as from the project teachers, intending to “catch” all students increases the likelihood that a primary prevention approach will be implemented rather than a program only for risk groups. However, as a compromise, it is worth considering how a primary prevention approach could be enriched with additional secondary preventive actions (e.g., special add-ons for high-risk students and their parents).

Lessons Learned and Practical Relevance

Disordered eating behavior is an increasing health problem in children and adolescents. One clinical endpoint of disordered eating is ED, which could be seen as the tip of the iceberg of the psychological (and physiological) consequences of behaviors such as regular restrained eating or frequent purging after overeating. Another endpoint is progressive overweight and obesity, which is regarded as a global epidemic (World Health Organization 2000). Until now, no single model has sufficiently described the etiology of EDs and obesity, and even the relationship between the two diseases seems to be unclear (Wardle 2009). Hence, no single preventive approach could stop the development of eating-related health problems. On the other hand, a variety of promising approaches exist and have an impact on variables that are associated with these problems (Stice et al. 2007). Research from the past decades has yielded valid knowledge about risk factors and protective factors for EDs (Jacobi et al. 2004) as well as basic principles of the mechanisms of health-related behavioral changes (e.g., Schwarzer 1992). This offers the possibility to combine successful strategies in new programs that could be customized for stakeholders. In our case, we cooperated with the Thuringian Ministry of Education to secure sustainability and program dissemination, focusing first on the most threatening ED, AN (see Wick et al. 2011). We had to start our preventive actions with 12-year-old girls because of the epidemiological background of AN. In our current work, we went further to the prevention of BN and BED for both 13-year-old girls and boys. Based on the described theoretical background, we focused on strengthening body self-esteem and reducing risky eating behavior. Others may set their focus on overweight and obesity and, therefore, focus on younger children and the mediation of variables like global self-esteem and group assertiveness.
Given the increasing knowledge about the shared risk factors of ED and overweight, even combining programs seems to be possible in the future (Haines et al. 2010). We recommend customizing not only the program content but also concepts of evaluation. Our study shows that it is possible to change risky eating behavior systematically through a global prevention approach, even beyond the stage of a pilot study, but there is a wide range of opportunities left to increase practical relevance. In our opinion, neither Torera nor any other primary preventive intervention in the field can solve the problem of EDs, but such interventions could be an important (first) step within the school setting to promote conditions for better body self-esteem and, subsequently, healthier eating behavior in adolescents. Further research is needed to simultaneously test mediators and moderators and to test the program’s effectiveness on secondary outcomes (i.e., ED prevalence). The “innovation” of Torera was to transfer successful elements of ED prevention, developed in the past two decades, to preadolescent children (most former programs were developed for children up to the age of 15), to establish an ED prevention approach in a whole state (Thuringia, Germany), and to secure sustainability by empowering teachers to apply the program by themselves, i.e., to realize broad dissemination under real-world conditions.

Conclusion

The Torera program provides a primary prevention approach that significantly reduces risky eating behavior and strengthens body self-esteem, especially in the target group of preadolescent girls. Generally, students at risk at baseline improved more, although on average all Torera participants improved after program application without damaging side effects. It is not obligatory to strictly follow the teachers’ manual in order to achieve these effects (more precisely, the first three lessons could be shortened, if necessary). Furthermore, the program yielded similar results when offered during normal school lessons or in a workshop mode. Considering the small effect sizes, additional efforts seem to be necessary to prevent eating-related problems. However, the school setting provides fair, low-threshold access to the program. The relatively low costs of €2.50 per pupil as well as the program’s application by in-house teachers instead of external professionals could secure its sustainability. In terms of the Cochrane Collaboration, the study presented here reaches evidence level II. Limitations of the external validity of the results compared to level I (randomized controlled trial) are a possible self-selection bias at the school level and a selection bias due to a relatively low participation rate.