Introduction

Interest in after-school programs for school-age children is at an all-time high. Substantial numbers of children in the United States attend programs; in the most recent nationally representative survey, 23% of children in kindergarten and Grades 1–5 who had nonparental care after school attended a school- or center-based program for an average of 7.7 h per week (Carver and Iruka 2006). A primary goal of these programs traditionally has been to provide supervision to children while their parents work. However, in response to federal, state, and local policy initiatives, as well as philanthropic investments, the roles and functions of after-school programs are expanding to include services targeted to low-income children and adolescents with the aim of improving academic achievement and narrowing the achievement gap.

Accompanying the expansion of after-school programming has been an interest in documenting whether the programs influence children’s academic performance and other measures of adjustment. Much of the research to date has compared children who attend programs with children who participate in other after-school contexts such as maternal care and self-care, or with children who did not attend the studied programs. Some investigators have detected no effects or, in some cases, even negative associations between program participation and children’s functioning (e.g., James-Burdumy et al. 2005; NICHD Early Child Care Research Network [ECCRN] 2004; Pettit et al. 1997), whereas others have found participation in after-school programs to be linked positively with academic and social outcomes (e.g., Huang et al. 2000; Mahoney et al. 2005; Posner and Vandell 1994; Reisner et al. 2004). These discrepant findings may be due to differences in the quality of children’s experiences in the programs. Not all after-school programs are the same.

A central tenet of ecological systems theory (Bronfenbrenner 1989) is that processes within settings may influence child developmental outcomes. In their synthesis of the after-school literature, primarily reports by expert panels and from workshops convened to identify best practices in after-school programs, Beckett et al. (2001) identified three setting characteristics that receive strong (vs. moderate or limited) endorsement as practices comprising high-quality programming with the potential to yield positive child outcomes: (a) positive staff–child relationships, (b) a diverse array of developmentally appropriate activities that provide opportunities to build skills, and (c) flexible programming that allows for student choice and autonomy in the selection of activities. Preliminary empirical support for the import of these features in after-school programs is found in reports of links between global program quality composites and child engagement in the programs as well as developmental outcomes. For example, participation in high-quality programs (characterized by positive staff–child relationships, a variety of enrichment activities, and student choice and input into program activities), in comparison to participation in lower quality programs, is positively associated with student engagement in the programs (Eccles and Gootman 2002; Grossman et al. 2007; Mahoney et al. 2007) and with children’s report card grades, work habits, and peer relations (Vandell et al. 2005b).

Scholars have called for examination of how specific after-school program features, rather than overall program quality, may be uniquely associated with child developmental outcomes (Durlak and Weissberg 2007; Farber 2007; Granger et al. 2007). Most of the limited research to date has examined a single feature without consideration of other program characteristics. For example, results of an examination of staff–child relations in the NICHD Study of Early Child Care and Youth Development (Vandell et al. 2005a) indicated that less conflictual relationships between children and after-school program staff were associated with children’s higher reading and math achievement, controlling for prior child functioning, child and family characteristics, and the instructional and emotional quality of the children’s school classrooms. Durlak and Weissberg (2007) focused on the activities feature in their meta-analysis of studies of programs targeting personal and social skills. They reported that a sequenced set of activities that encouraged active forms of learning was associated with improvements in children’s school performance and social adjustment, whereas participation in programs using less organized approaches to activity implementation and didactic instruction was not associated with child outcomes.

Research is still needed to examine multiple after-school program features simultaneously as unique and distinct components of program quality. In an earlier report (Pierce et al. 1999), we examined three program features (positive staff–child relations, diverse activities, flexible programming) and their concurrent associations with child developmental outcomes when children were in Grade 1. We determined that boys who attended programs where staff were positive and supportive had higher reading and math grades and fewer behavior problems according to their Grade 1 teachers. Availability of a larger number of age-appropriate activities was linked to poorer reading and math grades, poorer work habits, and more behavior problems for boys. This was an unexpected finding given the importance of diverse activities to older children’s (Grades 3–5) positive perceptions of program quality (Rosenthal and Vandell 1996). We speculated that children in the early years of elementary school (6- and 7-year-olds) may need a more tightly structured program that offers a limited array of activities, and that a larger array of activities might be overwhelming, but that a broader array of activity choices might become more important as the children developed. Finally, boys who attended more flexible after-school programs that allowed children greater autonomy and choice in selecting their activities had better social skills with peers at school.

In the current study, we follow the same group of children to Grade 2 and then Grade 3, and ask if similar associations between program quality features and child functioning are detected at these older ages. Children’s needs may vary as their skills develop rapidly from one year to the next in early primary school, so we examine links between children’s program experiences and child outcomes in consecutive school years. Our examination ends at Grade 3 due to the small number of children continuing their program enrollment into later grades and the associated loss of statistical power to detect effects. This attrition in the higher primary grades is consistent with that found in national surveys of children’s after-school arrangements (Kleiner et al. 2004).

Given the current policy focus on child participation in after-school programs as a means of improving school performance, we consider academic and social functioning in the school classroom. In particular, we examine academic performance in two content areas, reading and math, reported in multiple studies to be associated with program participation (Durlak and Weissberg 2007; Huang et al. 2000; Lord and Mahoney 2007; Mahoney et al. 2005; Posner and Vandell 1994; Reisner et al. 2004). We also examine children’s classroom work habits, a “leading indicator” of academic performance (Grossman et al. 2002), and social skills with peers, reported to facilitate inclusion in the social and learning milieu of the classroom and consequently involvement in learning activities (Ladd et al. 1999). We specifically examine whether boys continue to be more sensitive to variations in program quality, or if program quality effects emerge for girls as well as boys in later middle childhood.

In our earlier work when the children were in Grade 1, we observed that White children and children from families in which incomes were higher, mothers were partnered and had more education, and parenting was more sensitive were more likely to be enrolled in higher quality programs, similar to findings in early child care settings (Burchinal and Nelson 2000; NICHD ECCRN 2006). We control for these child and family characteristics and for an additional covariate, child prior functioning, which could not be included in the Grade 1 analyses. This additional control further reduces sample selectivity bias and allows us to take a “value-added” approach in our analyses. We ask, for example, if features of program quality (staff positive regard, number and diversity of age-appropriate activities, programming flexibility) in Grade 2 are related to children’s math grades in Grade 2, controlling for Grade 1 math grades. Then, we ask if program quality in Grade 3 is related to children’s math grades in Grade 3, controlling for Grade 2 math grades.

Based on the proposition that settings which provide positive relationships, varied activities, and appropriate structure facilitate positive developmental outcomes (Eccles and Gootman 2002), as well as the ecological systems principle that children’s experiences in one setting are important to their functioning in other settings (Bronfenbrenner 1989), we expect that each of the three program quality features we examine will be uniquely and positively associated with children’s functioning in the school classroom. Because our previous research detected relations between program quality and child developmental outcomes for boys but not girls, we also expect that relations between program features and child outcomes will be moderated by child sex.

Method

After-School Programs

All after-school programs (N = 92) in and around a mid-size Midwestern city were approached and asked to provide general information about the programs and the enrolled students. Information was obtained from 90 programs (98% response rate) enrolling 781 Grade 1 students. Programs were selected for further study on the basis of program auspice and location, such that approximately equal numbers of proprietary and nonprofit programs, and of school- and community-based programs, were represented. Selection also took into consideration the enrollment of students in Grade 1 (at least three who attended regularly). These selection criteria resulted in an initial sample of 47 programs.

Program directors were asked to distribute a letter introducing the study to the parents of all Grade 1 students in their programs (N = 529). Parents returned a brief survey directly to the project office, indicating the child’s sex and ethnicity, the parents’ marital status and educational attainment, and the number of days each week that the child attended the program. Seven programs enrolling a total of 47 students in Grade 1 did not distribute the letters, and parents at three programs enrolling a total of 28 students in Grade 1 failed to return the parent survey, leaving 37 participating programs enrolling 454 Grade 1 students. Seventeen of the participating programs (46%) were based at the children’s schools; 19 (51%) were nonprofit. There were no significant differences between participating and all nonparticipating programs in terms of the sex and minority status of enrolled Grade 1 students.

Following the enrollment of children and families in the study (see below), one child moved to a newly started nonprofit community-based program that had not been contacted initially, resulting in a sample of 38 programs in the first year of the study. When the study children were in Grade 2 and Grade 3, the number of participating programs changed as children enrolled in additional programs or left the programs. Children were enrolled in 46 programs (52% school-based, 59% nonprofit) in Grade 2, and 37 programs (62% school-based, 65% nonprofit) in Grade 3.

Participants

The brief family survey distributed by the programs was returned by 275 families (57% response rate). Children who attended after-school programs at least 3 days per week were selected for the study using a conditional random sampling strategy so that approximately half were boys. All minority-ethnicity children and all children living in single-parent homes were selected in order to ensure adequate representation of these demographic characteristics in the sample. Other children (nonminority and those living in two-parent homes) were selected randomly.

Telephone contacts with potential participants were conducted until the target sample size of 150 was achieved. We contacted 175 families (86% acceptance rate). The average age of the children at recruitment was 6.5 years (SD = 0.3). Other demographic characteristics of the recruited sample are shown in Table 1. There were no significant differences between families who agreed to participate and those who declined in terms of child minority status, family structure (one- vs. two-parent), and maternal education. Families of boys were more likely to refuse participation than families of girls, χ²(1, N = 175) = 3.92, p < .05. We also compared the recruited sample to the pool of families who returned the demographic survey to the project office but were not selected for the study. There were no differences in terms of child sex, family structure, and maternal education. Study participants were more likely to be of minority ethnicity, χ²(1, N = 274) = 3.90, p < .05.

Table 1 Demographic characteristics of the recruitment and program participant samples

Table 1 also shows the demographic characteristics of the children who continued to attend the programs in Grades 2 and 3. There were no significant differences between the recruited sample in Grade 1 and the children who continued to attend after-school programs in Grade 2 and Grade 3 in terms of child sex, child minority status, and family structure. Maternal education did not differ between the recruitment sample and the program participant sample at Grade 2, but at Grade 3, mothers in the program participant sample had more education, t(147) = 2.70, p < .01, than mothers in the recruitment sample. Finally, there were no significant differences between the male and female program participants in terms of child minority status, family structure, and maternal education in Grades 1, 2, and 3.

Program Enrollment and Attendance

Twice each school year, mothers reported children’s enrollment in programs and the days each week the children attended the programs. From these reports, we computed the average number of days the children attended the programs each year. As shown in Table 2, on average close to 4 study participants attended each of the after-school programs during Grade 1. In Grade 2, although fewer children attended programs (80% of the recruited sample), the number of programs they attended increased due to some children leaving their Grade 1 programs and enrolling in new programs. About 61% of the recruited sample continued to attend the programs in Grade 3, although the number of programs was reduced due to child attrition from the programs. The number of days each week that the program participants attended the programs was consistent, averaging well over four afternoons per week, in line with the average of nearly 8 h per week reported in national surveys (Carver and Iruka 2006).

Table 2 Program attendance across years

Measures of Program Quality

Observations were conducted in the after-school programs several times each year when the study children were in Grades 2 and 3 in order to assess program quality. Four observations were conducted in each program during Grade 2, and three observations during Grade 3. Each program observation was conducted for 90 min. At the end of each observation, a 4-point qualitative rating was made of each of three program features: positive staff–child relations, available activities, and programming flexibility. Table 3 provides descriptive statistics for the program quality measures.

Table 3 Descriptive statistics for measures of family characteristics, program features and child developmental outcomes

The observers were graduate student researchers who completed a two-step training process each year. In the first step, the observers attended a 4-h training meeting that included a review of observational methodologies and written materials related to the study’s measures of program features. In the second step, the observers were paired to conduct pilot program observations until each observer attained a minimum 80% agreement for each program feature across five observations. Interobserver reliability was determined by pairing observers for 22% of the program observations in Grade 2 and 32% of the observations in Grade 3 and calculating Cohen’s linear-weighted kappa from the observers’ individual ratings, as reported below.
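The reliability statistic named above can be sketched in a few lines. The following Python function is an illustrative implementation of Cohen's kappa with linear disagreement weights for two observers using a 4-point rating; the function name, argument names, and example ratings are hypothetical and are not drawn from the study's data or software.

```python
from collections import Counter


def linear_weighted_kappa(ratings_a, ratings_b, categories=(1, 2, 3, 4)):
    """Cohen's kappa with linear disagreement weights for two raters."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Joint counts of (rater A, rater B) and each rater's marginal counts.
    joint = Counter((index[a], index[b]) for a, b in zip(ratings_a, ratings_b))
    marg_a = Counter(index[a] for a in ratings_a)
    marg_b = Counter(index[b] for b in ratings_b)

    observed = 0.0  # weighted observed disagreement
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at the extremes
            observed += weight * joint[(i, j)] / n
            expected += weight * (marg_a[i] / n) * (marg_b[j] / n)

    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - observed / expected


# Hypothetical paired ratings from two observers (not study data).
observer_1 = [1, 2, 3, 4]
observer_2 = [1, 2, 3, 3]
print(round(linear_weighted_kappa(observer_1, observer_2), 2))  # 0.78
```

With linear weights, a one-point disagreement on the 4-point scale is penalized one third as heavily as a disagreement between the two ends of the scale, which is why this statistic suits ordered ratings better than unweighted kappa.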

Positive Staff–Child Relations

Observers rated staff–child relations using a scale adapted from rating scales used in the Observational Record of the Caregiving Environment (NICHD ECCRN 1995). The rating assessed the degree to which staff evidenced enjoyment of children in the program and was made for each program staff member who was present during the observation, based on his or her behavior toward all children in the program. A rating of 1 was given to those staff who were detached, had flat affect, or were consistently negative with children. Most of their interactions consisted of verbal directions or instructions, with little time spent in informal or spontaneous conversation. A rating of 4 was given to those staff who appeared strongly positive toward children in the program by displaying acceptance and encouragement, as evidenced by a warm tone of voice when speaking, physical gestures to convey affection, smiling or laughing with the children, and enthusiasm. Interactions with children were reciprocal, as opposed to dominated by the caregiver. Annual program-level scores were computed as the mean of the ratings made for each staff member at each observation during the school year. Interobserver agreement (Cohen’s linear-weighted kappa) for the ratings of individual staff was .92 in both years. The annual mean scores in Grades 2 and 3 were correlated .89 and .82 (p < .0001), respectively, with an item assessing the nature and quality of staff–child interactions on the School-Age Care Environment Rating System (SACERS; Harms et al. 1996), also obtained during the program observations.

Available Activities

The available activities rating, adapted from Rosenthal and Vandell (1996), assessed both the variety and age appropriateness of the activities that were available during the observation. The rating was made at the end of the program observation based on what was observed in the program as a whole. A rating of 1 reflected a limited number of activities that focused on only one or two of several areas of development (physical, social, cognitive). A rating of 4 reflected the availability of multiple age-appropriate activities in all three areas of development. We observed children engaged in a range of activities at the programs, including large-motor play (e.g., soccer, using playground equipment), arts and crafts, fantasy play (e.g., playing “house” or with dolls or toy cars), unstructured fine-motor play (e.g., Legos, puzzles), board games and cards, watching movies, academic enrichment (e.g., science experiments, reading for pleasure), performing arts (e.g., drama, dance, music), and computers. Homework or tutoring was observed rarely, at only one program in Grade 2 and two programs in Grade 3. Annual scores were computed by averaging the ratings made at each observation during the school year. Interobserver reliability (Cohen’s linear-weighted kappa) for the available activities rating was .97 in Grade 2 and .96 in Grade 3. Validity was evidenced by concurrent correlations of .76 in Grade 2 and .58 in Grade 3 (p < .0001) with the SACERS Activities scale, which assesses variability in activities and access to materials to support them.

Programming Flexibility

The programming flexibility rating, adapted from Rosenthal and Vandell (1996), measured the degree to which program participants were afforded autonomy and choice at the program. The rating was made at the end of the program observation based on what was observed across all activities. A rating of 1 reflected a highly structured program with required participation in planned activities and staff-determined social groupings. Children were not allowed to choose either their activities or their playmates. A rating of 4 reflected flexible programming that featured individual choice and autonomous decision making. Children were allowed to choose the activities they participated in, create their own activities, and select their playmates. An annual score was computed for each year by averaging the ratings made at all observations during that year. Interobserver reliability (Cohen’s linear-weighted kappa) was 1.00 in Grade 2 and .91 in Grade 3. The programming flexibility rating was correlated .81 and .74 (p < .0001) in Grades 2 and 3, respectively, with an item measuring child autonomy in selecting activities on the concurrently rated SACERS.

Measures of Family Characteristics

Family Demographics

Mothers provided current information about family characteristics during a visit to the home in the fall of each school year, including family structure (one- or two-parent home), maternal educational attainment, and family income. Mothers reported their educational attainment using a 5-point scale (1 = less than high school diploma or GED, 5 = graduate degree). Reports of income were preceded by a checklist of potential income sources, to ensure that the mothers considered all income their families received. Descriptive statistics for maternal education and family income can be seen in Table 3.

Parenting Practices

In the fall of each school year, mothers completed a 30-item measure of parenting practices, the Raising Children Checklist (Shumow et al. 1998). Items were rated on a 4-point scale (1 = definitely no, 4 = definitely yes). Principal axis factor analysis with Varimax rotation yielded three factors: Firm/Responsive Parenting, Permissive (Lax) Parenting, and Harsh Parenting. We selected the six-item firm/responsive parenting scale for use in analyses due to its documented importance for child development (NICHD ECCRN 2008). Sample items include “Do you praise your child when he/she does something you like?” and “Do you give your child a chance to explain his/her side before punishing him/her?” Table 3 provides descriptive statistics for the firm parenting scale. The scale’s internal consistency was adequate (α = .74 in Grade 2, α = .69 in Grade 3), and its validity has been demonstrated in other research where it was associated negatively with children’s behavior problems (Shumow et al. 1998), in accord with Baumrind’s (1989) findings for firm/responsive parenting practices.

Measures of Child Developmental Outcomes

Near the end of each school year in Grades 1–3, children’s classroom teachers at school completed measures of child academic and social adjustment. The measures were mailed to the teachers and returned to the project office by mail. Completion rates were high for the program samples: 98% in Grade 1, 88% in Grade 2, and 82% in Grade 3. Table 3 provides descriptive statistics for the outcome measures.

Academic Grades

Classroom teachers reported children’s grades in reading, mathematics, oral language, written language, science, and social studies using the Mock Report Card (Pierce et al. 1999), developed so that standardized information could be obtained across schools. Grades were reported on a 5-point scale ranging from (1) failing to (5) excellent. Reading and math grades were chosen for analysis based on their documented associations with participation in after-school programs in other research. Scores on the Mock Report Card were correlated in the .60s with standardized achievement test scores in the current sample, attesting to the measure’s validity.

Work Habits

Children’s work habits were rated by teachers using six items on the Mock Report Card (Pierce et al. 1999). The items, rated on a 5-point scale ranging from (1) very poor to (5) very good, are “Follows classroom procedures,” “Works well independently,” “Works neatly and carefully,” “Uses time wisely,” “Completes work promptly,” and “Keeps materials organized.” Item scores were averaged to create a single work habits score (α = .93–.94). Validity of the work habits scale was demonstrated by positive correlations with work habits scores from a maternal-report measure of children’s adjustment in the current sample.
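The scale scoring and internal-consistency statistic described above can be illustrated with a short Python sketch. It averages six item ratings into one score per child and computes Cronbach's alpha from item and total-score variances; the function names and the example ratings are hypothetical, and the sketch is not the software used in the study.

```python
from statistics import mean, variance


def scale_score(item_ratings):
    """Average one child's item ratings into a single scale score."""
    return mean(item_ratings)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-child lists of item ratings."""
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need at least two children and two items")
    k = len(rows[0])
    if any(len(row) != k for row in rows):
        raise ValueError("every child needs a rating on every item")
    # Sample variance of each item across children, and of the summed score.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical teacher ratings on six 5-point items for four children.
ratings = [
    [4, 5, 4, 4, 5, 4],
    [2, 2, 3, 2, 2, 3],
    [5, 5, 5, 4, 5, 5],
    [3, 3, 2, 3, 3, 3],
]
scores = [scale_score(row) for row in ratings]
alpha = cronbach_alpha(ratings)
```

Alpha approaches 1 when children who are rated high on one item tend to be rated high on the others, which is the pattern implied by the values of .93–.94 reported for the work habits items.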

Social Skills with Peers

Classroom teachers completed one subscale of the Teacher Checklist of Peer Relations (Coie and Dodge 1988). This subscale contains seven items pertaining to children’s social skills with peers, rated on a 5-point scale ranging from (1) very poor to (5) very good. Sample items include “Is socially aware of what is happening in a situation” and “Generates good-quality solutions to interpersonal problems.” Item scores were averaged to create a single social skills score (α = .94–.95). Coie and Dodge (1988) found evidence for validity of the measure in its positive associations with peer ratings of their classmates’ prosocial behavior.

Results

Relations Among Program Quality Indicators

Prior to conducting substantive analyses, we examined associations among the program quality indicators. The positive staff–child relations rating was not significantly correlated with available activities, r(46) = .19, ns in Grade 2, r(37) = .23, ns in Grade 3, or programming flexibility, r(46) = .19, ns in Grade 2, r(37) = .08, ns in Grade 3. More diverse and age-appropriate activities were associated with greater programming flexibility in Grade 2, r(46) = .62, p < .001, and in Grade 3, r(37) = .48, p < .01. We elected not to combine activities and flexibility into a single composite because of our previous findings that these program features were differentially related to child functioning in Grade 1.

Program Quality Features and Children’s Developmental Outcomes

Due to the participation of multiple students at most of the after-school programs, our substantive analyses examining associations between program quality characteristics (positive staff–child relations, available activities, programming flexibility) and child developmental outcomes involved hierarchical linear modeling (HLM). Grade 2 and Grade 3 were analyzed separately because a substantial number of children (n = 29) dropped out of the programs by Grade 3, and the purpose of our analyses was to examine relations between features of program quality and child outcomes at different ages.

For each outcome in each of the two grades we examined, a two-level model was fit in which children (Level 1) were nested within programs (Level 2). This allowed us to control for dependence due to the sampling of children from the same programs, and also to test program quality characteristics using the appropriate unit of analysis (program, as opposed to student). We examined main effects of the program quality indicators as well as their interaction with child sex, given our earlier findings of differential effects of program quality on boys and girls in Grade 1.

We entered child and family selection controls in each model, including child sex and ethnic minority status, and concurrent household structure (single parent vs. two parents), maternal education, family income, and firm/responsive parenting practices. We also entered prior-year adjustment for each outcome, such that in analyses of Grade 2 outcomes, we controlled for Grade 1 adjustment, and in analyses of Grade 3 outcomes, we controlled for Grade 2 adjustment. This allowed us to examine program effects in relation to the residual change that occurred in child adjustment during a given school year.

The HLM model applied was a random slope and intercept model, where all control variables (with the exception of child sex) had fixed effects. By introducing a random effect for child sex, we allowed the effect of child sex (i.e., difference in change between boys and girls) to vary across programs. The observed program quality indicators were entered as predictors of both the random intercepts (accounting for main effects related to program quality) and the program-dependent child sex effect (accounting for program quality × sex interactions). All analyses were conducted using the HLM6 program (Raudenbush et al. 2004).
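One way to write the model described above is in conventional two-level notation. The symbols here are illustrative assumptions rather than a specification taken from the HLM6 runs, and any centering of predictors is left unspecified because the text does not report it.

```latex
\begin{align}
Y_{ij} &= \beta_{0j} + \beta_{1j}\,\mathrm{Sex}_{ij}
          + \sum_{q=2}^{Q} \beta_{q}\,X_{qij} + r_{ij},
          \qquad r_{ij} \sim N(0, \sigma^{2}) \\
\beta_{0j} &= \gamma_{00} + \gamma_{01}\,\mathrm{Staff}_{j}
          + \gamma_{02}\,\mathrm{Activities}_{j}
          + \gamma_{03}\,\mathrm{Flexibility}_{j} + u_{0j} \\
\beta_{1j} &= \gamma_{10} + \gamma_{11}\,\mathrm{Staff}_{j}
          + \gamma_{12}\,\mathrm{Activities}_{j}
          + \gamma_{13}\,\mathrm{Flexibility}_{j} + u_{1j}
\end{align}
```

Here Y_{ij} is the outcome for child i in program j, and the X_{qij} are the fixed-effect controls (minority status, household structure, maternal education, family income, firm/responsive parenting, and the prior-year score on the outcome). The coefficients γ01 to γ03 correspond to the program quality main effects, and γ11 to γ13 correspond to the program quality × sex interactions. The variances of u_{0j}, u_{1j}, and r_{ij} correspond to the program residual, sex effect residual, and student residual variances, respectively.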

Results of the HLM analyses of the Grade 2 and Grade 3 outcomes are shown in Tables 4 and 5, respectively. The tables provide regression coefficients, standard errors, and t statistics for each control and predictor variable. The coefficients can be interpreted with reference to the metrics of the relevant Level 2 predictors and the outcomes, as there is no natural effect size measure in HLM. For each unit (rating point) increase in the predictor, the coefficient indicates the unit (scale point) change in the outcome. A coefficient also can be expressed relative to the distribution of the outcome, in SD units, by dividing the coefficient by the SD of the outcome measure.
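The conversion from scale points to SD units is simple division, sketched here in Python. The function name is hypothetical, and the outcome SD of 1.14 is an assumed value chosen for illustration; it is not read from Table 3.

```python
def coefficient_in_sd_units(coefficient: float, outcome_sd: float) -> float:
    """Express an unstandardized coefficient in SD units of the outcome."""
    if outcome_sd <= 0:
        raise ValueError("outcome_sd must be positive")
    return coefficient / outcome_sd


# Illustrative inputs: a 0.49-point coefficient and an assumed outcome SD of 1.14.
print(round(coefficient_in_sd_units(0.49, 1.14), 2))  # 0.43
```

Under these assumed inputs, a 1-point increase in the predictor corresponds to a change of a little over four tenths of a standard deviation in the outcome.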

Table 4 Hierarchical linear models of program quality effects on Grade 2 outcomes
Table 5 Hierarchical linear models of program quality effects on Grade 3 outcomes

Tables 4 and 5 also show three variance estimates: (a) program residual variance, which indicates the amount of variance left unexplained in program main effects when accounting for program-level variables; (b) sex effect residual variance, or the amount of variance left in child sex effects across programs when accounting for program-level variables; and (c) student residual variance, which indicates the amount of variance left unexplained in the outcome when controlling for all student- and program-level variables. Tests of statistical significance (χ², as performed in HLM6) were conducted for the program main effect and sex effect residual variances in each analysis. HLM6 does not test the student residual variance because any value over 0 implies that perfect prediction of the outcome variable did not occur (Raudenbush and Bryk 2002).

Grade 2 Program Quality and Child Outcomes

We observed three associations between positive staff–child relations in the after-school programs and child functioning in Grade 2 classrooms. Children who participated in after-school programs where staff–child relations were more positive displayed relative gains in both reading and math grades in Grade 2 in comparison to children who attended programs where staff–child relations were less positive. For each 1-point increase in the staff–child positive relations rating, there was an average increase of 0.49 scale points in the reading grade (associated with a 0.43 SD change in reading) and 0.58 scale points in the math grade (associated with a 0.66 SD change in math). In addition, a significant interaction between positive staff–child relations and child sex was detected for children’s social skills with peers, where a 1-point increase in the staff–child relations rating implies an average increase of 0.64 units in the sex effect, suggesting boys gain more than girls as the ratings of staff–child relations increase. Available activities and programming flexibility were not associated with child outcomes in Grade 2.

Grade 3 Program Quality and Child Outcomes

In Grade 3, as in Grade 2, positive staff–child relations were associated with child functioning at school. Children who participated in programs in which staff–child relations were more positive experienced gains in their reading grades in Grade 3 relative to children who attended programs in which these relations were less positive. For each 1-point increase in the rating of staff–child relations, there was an average increase of 0.36 scale points in the reading grade (associated with a 0.34 SD change in reading). Greater availability of diverse, age-appropriate activities was associated with higher math grades and better work habits in the classroom in Grade 3. For each 1-point increase in the activities rating, there was an average increase of 0.48 scale points in math grades (associated with a 0.50 SD change in math) and 0.44 scale points in work habits ratings (associated with a 0.47 SD change in work habits). Programming flexibility was not associated with child outcomes in Grade 3. There were no significant interactions between features of program quality and child sex in Grade 3.

Discussion

The aim of this study was to examine associations of three specific features of after-school program quality—positive staff–child relations, the availability of diverse, age-appropriate activities, and programming flexibility—with children’s functioning in the school classroom at two ages, first in Grade 2 and then in Grade 3. We considered the program features simultaneously so that we could determine the unique influence of each feature on children’s outcomes, controlling for other features.

Positive staff–child relations, which we defined as program staff’s positive and supportive behavior with all children in the program, were related to children’s performance in their Grade 2 classrooms, over and above their performance in Grade 1. In particular, children who attended after-school programs in which the staff were more positive posted gains in their reading and math grades relative to children who attended after-school programs in which staff were less positive. Positive staff–child relations continued to be associated with positive changes in children’s reading grades during Grade 3. However, whereas in Grade 1 the associations were evident for boys only (Pierce et al. 1999), the later associations were evident for both boys and girls. It appears that in Grade 2 and Grade 3 alike, boys and girls are both sensitive to how positively after-school program staff behave toward the children in their programs. These results underscore the importance of supportive relations with nonparental adults for facilitating child adjustment, as noted in studies of mentoring (Jekielek et al. 2002) and structured activities (Mahoney et al. 2002). They also are in accord with reports that emotionally supportive elementary school classrooms (NICHD ECCRN 2003) and positive teacher–child relationships (Pianta and Stuhlman 2004) are associated with better child outcomes.

We also observed an association between positive staff–child relations and boys’ (but not girls’) social skills, as reported by their Grade 2 teachers. This interaction between a program quality feature and child sex is reminiscent of our earlier finding when the children were in Grade 1, when boys appeared to be more sensitive than girls to variations in staff–child relations, activities, and programming flexibility. It also extends findings of positive associations between child care quality and better social adjustment for preschool boys but not girls (Hagekull and Bohlin 1995; Peisner-Feinberg and Burchinal 1997). However, given that it was the only significant interaction we observed, we are not sure if the contrast to the Grade 1 findings is due to changes in boys’ needs to match those of girls or to the more conservative analytic procedures used in the current study.

A second feature indicative of program quality was associated with Grade 3 outcomes. Greater availability of diverse age-appropriate activities at the programs was associated with positive changes in children’s Grade 3 math grades and work habits, relative to their performance on these outcomes at the end of Grade 2. The availability of multiple activities was not associated with changes in child outcomes in Grade 2 (relative to functioning a year prior). These results stand in contrast to the negative associations of available activities with boys’ adjustment in Grade 1, when higher activities ratings were associated with poorer reading and math grades and poorer work habits. Our findings suggest a developmental change in the import of activities to child outcomes. In Grade 1, when most children are first experiencing the highly structured school context with prescribed activities, boys appear to benefit from a match between the school and after-school contexts. As children gain experience in the school setting and enter subsequent grades, both boys and girls may experience changing needs for opportunities to sample different activities that may lead to the development of skills and competencies that promote positive adjustment.

Given the increasing focus in schools on academic achievement in response to the federal No Child Left Behind Act, after-school programs may be uniquely positioned to provide children with an outlet for pursuing a variety of activities that support their development. Observations of elementary school classrooms in large-scale national research reveal that schooling is generally characterized by basic skills activities taught through whole-class instruction and individual seatwork, with a focus on rote learning (NICHD ECCRN 2005; Pianta et al. 2007). After-school programs, on the other hand, are able to offer interactive enrichment activities such as art, drama, sports, computer learning, music, and science projects, without the singular focus of typical extracurricular activities such as karate lessons or league soccer.

Programming flexibility, which we operationalized as children’s freedom to choose their activities at the programs, was not associated with child outcomes in either Grade 2 or Grade 3, suggesting that this type of choice in after-school programs is not related to academic and social development at these ages. In future research, it may be more important to examine support for autonomy in terms of staff behaviors such as giving few directives about how an activity should be conducted, listening to what children have to say about the activity, and asking children how they want to approach a task within the activity (Reeve et al. 1999). This type of support provides for autonomy within activities and is reported to be more important for student learning outcomes than choice of activities (Assor et al. 2002).

After-school policy makers generally advocate the three program features examined in this study—positive staff–child relations, a range of activities, and flexible programming—as best practices in programs serving school-age children (see Beckett et al. 2001). Our current results as well as those in our earlier work suggest that children’s positive relationships with program staff are beneficial at all the ages we studied. In other areas, best practices may vary with child age. For example, a wide variety of activities does not appear to be salient until Grade 3. Programming flexibility, defined as student choice of activities, does not appear to be important through the middle elementary years. We would expect this program feature to become more salient to children as they get older and press for greater autonomy in their out-of-school activities.

The program quality effects we obtained for positive staff–child relations and available activities corresponded to gains of 0.34 to 0.66 SD, suggesting that after-school programs can play a significant role in fostering academic and social outcomes when children attend frequently and regularly, as the participants in the current study did (averaging over 4 days per week in both Grade 2 and Grade 3). These results must be interpreted in light of our research design, which was not experimental and does not allow us to definitively rule out sample selectivity or omitted variables. Nonetheless, we did control for multiple family and child selection factors as well as children’s prior adjustment, making our design more rigorous than that of many studies of after-school programs. Furthermore, in contrast to much of the program research, we examined children’s experiences at a large number of programs, thereby increasing the generalizability of our findings. A logical next step for future research would be experimental studies in which features of program quality such as available activities and programming flexibility are systematically manipulated. Experimental manipulation of the tone of staff–child relationships is less feasible for ethical reasons.

Whereas many evaluations and studies of after-school programs have found effects only for at-risk populations (e.g., Marshall et al. 1997; Scott-Little et al. 2002), the current study found positive associations of program quality features and child outcomes in a more heterogeneous sample that was not particularly at risk for poor functioning in the school context. In concert with other findings that children who have high levels of social competence can benefit from programs aimed at improving social skills (Riggs 2006), the current findings suggest that after-school programs that offer positive staff–child relations and opportunities to participate in a diverse array of age-appropriate activities can confer substantial benefits on all children in the early and middle elementary school years. Future research should investigate whether these benefits are maintained for older school-age children and adolescents.

Other avenues for future research include examination of additional program features that may be associated with school-age children’s outcomes. Preliminary reports suggest that organized implementation of program activities and behavior management (setting reasonable ground rules, positive reinforcement for adherence to the rules, firm and effective response to misbehavior) are important features to consider (Gerstenblith et al. 2005; Grossman et al. 2007). In research with older youth, investigators might consider additional features posited to characterize high-quality programs for adolescents, such as structured opportunities for skill building and intentional learning experiences (Eccles and Gootman 2002).

In conclusion, this study documented differential associations between after-school program features or processes and program participants’ adjustment at school. Positive staff–child relationships in the programs were associated with children’s reading and math grades, and with boys’ social skills with peers in the classroom, in Grade 2, and with reading grades in Grade 3. Diverse and developmentally appropriate activities at the programs were associated with children’s math grades and work habits at school in Grade 3. Programming flexibility was not associated with the child outcomes. Further research is needed to determine whether additional program processes are associated with child outcomes, and which particular processes might be important for older children’s and adolescents’ functioning at school.