1 Introduction

Social inequalities in student achievement have been well documented across countries, domains, and age groups, with students from socioeconomically disadvantaged families typically performing below their classmates from more privileged families (Harwell et al., 2017; OECD, 2019; Sirin, 2005). Given this robust relation between socioeconomic background and academic achievement, reliable, valid, and, ideally, economical indicators of socioeconomic status (SES) are required for educational research. One such indicator that is frequently used in social research is the number of books at home (e.g., Eriksson et al., 2021; Jerrim & Micklewright, 2014; Mullis et al., 2012a, 2012b). This indicator has consistently been found to relate to students’ academic achievement, as measured, for instance, by their reading comprehension (e.g., Eriksson et al., 2021; McElvany et al., 2009; Mullis et al., 2012b; Park, 2008), academic language proficiency (Heppt & Stanat, 2020; Volodina et al., 2020), or years of schooling (Evans et al., 2010). It also shows substantial relations to other indicators of SES (e.g., Eriksson et al., 2021), thus pointing to its general appropriateness for capturing students’ socioeconomic background. Moreover, whereas other measures involve laborious coding (e.g., the coding of parents’ current occupation for assessing the International Socio-Economic Index [ISEI]; Ganzeboom et al., 1992), the books-at-home measure does not require any coding, making it a very economical and easy-to-use indicator that is particularly useful in large survey designs or large-scale assessments. However, despite its widespread use, only a few studies have aimed to investigate and possibly improve this measure’s quality (e.g., Engzell, 2018; Sieben & Lechner, 2019).
Thus, it is unclear whether the traditional books-at-home measure holds as a valid indicator of SES over other commonly used measures or whether its incremental validity can be increased by including important extensions, such as the number of children’s books and the number of ebooks. As the books-at-home measure is frequently administered to children, we further explore potential differences in its predictive power depending on whether parents or children share the information. This question is of high methodological and research-practical relevance because participation rates are typically higher for students than for parents in school-based assessments (cf. Engzell & Jonsson, 2015). The present study aimed to investigate these questions using academic language proficiency as an indicator of academic achievement.

2 SES and student achievement

Social inequalities in student achievement have been reported for a range of performance indicators in different domains, such as reading, mathematics, and science (e.g., Eriksson et al., 2021; OECD, 2019; Sirin, 2005). Theoretical explanations of these inequalities often draw on resource- and investment-oriented approaches (e.g., Bourdieu, 1986; Conger & Donnellan, 2007; Erikson & Jonsson, 1996). These approaches rest on the assumption that families from different socioeconomic backgrounds differ in the amounts and kinds of resources they can invest in their children and that these varied investments, in turn, result in social inequalities in student achievement. The literature typically distinguishes between three types of resources: (1) economic resources, such as household income or wealth (Hällsten & Thaning, 2021), (2) social resources, including social networks that provide access to support and valuable information and convey (educational) norms and values (Carbonaro, 1998; Coleman, 1988), and (3) cultural resources or cultural capital. According to Bourdieu (1986), cultural capital can be subdivided into incorporated, institutionalized, and objectified cultural capital. Whereas incorporated cultural capital refers to a person’s long-lasting dispositions, such as attitudes and preferences, institutionalized cultural capital denotes the certificates and academic titles that are obtained through formal education. Finally, objectified cultural capital includes physical cultural goods, such as books or works of art (Bourdieu, 1986; Sieben & Lechner, 2019).
While some researchers highlight the role of cultural resources as opposed to economic resources, thus differentiating conceptually between SES and cultural capital (Evans et al., 2010; Park, 2008), others propose broader conceptualizations of SES that comprise cultural resources, such as the number of books at home (see Engzell, 2018, for an overview; Eriksson et al., 2021; Hanushek & Woessmann, 2011; Harwell et al., 2017). In line with this latter conceptualization, we use SES as an umbrella term, encompassing various economic, educational, and cultural resources that are often used for explaining social inequalities in student achievement.

Regarding their relations to students’ academic achievement, it must be considered that the different resources are closely intertwined and can–at least to some extent–be converted into one another. Institutionalized cultural capital, for example, as reflected in a person’s educational qualifications, can be transferred into economic resources. These enable parents to support the academic achievement of their children by providing them with better learning conditions (e.g., workplace) and learning materials in their homes (see Conger & Donnellan, 2007, for an overview). However, despite this overlap, the different resources refer to different aspects of a family’s SES. They are therefore likely to be differentially related to academic achievement and, thus, cannot be used interchangeably (e.g., Bukodi & Goldthorpe, 2013; Eriksson et al., 2021; Hällsten & Thaning, 2021). In line with these considerations, studies incorporating different family resources typically report the respective indicators of SES to be substantially correlated while still highlighting that each indicator captures unique aspects of a family’s socioeconomic background (Bukodi & Goldthorpe, 2013; Erola et al., 2016). Current research further highlights that various aspects of SES differ in the degree to which they are transferrable from one generation to the next. Unlike education and occupation, which are closely tied to individuals, wealth is only loosely related to an individual’s effort and labor market success and, thus, can be passed on relatively easily and directly from parents to their children (Hällsten & Thaning, 2021). Yet, wealth’s independent effect on student achievement seems negligible, at least in the major economies, whereas particularly pronounced effects have been reported for parents’ occupational status and the number of books at home (Eriksson et al., 2021).

In regard to the number of books at home, resource-oriented approaches, such as the family investment model (Conger & Donnellan, 2007), classify them as part of a family’s cultural resources and stress their importance as potential learning stimulation at home (cf. Mullis et al., 2012a). As such, they may foster students’ reading motivation and their interest in written culture (cf. McElvany et al., 2009) and form an important basis for stimulating learning interactions among parents and children. Specifically, they may serve as drivers for engaging in joint book reading activities, storytelling, talking about reading experiences, and further home literacy activities (Burgess et al., 2002; Eriksson et al., 2021; Grolig et al., 2019; Gustafsson et al., 2011). The number of books at home can thus be conceived of as a proximal process-oriented feature of SES that should be more closely linked to student learning outcomes than more distal structural features of SES, such as parental occupation status, parental education, or family income. Confirming this relational pattern, previous research found that the number of books (McElvany et al., 2009) and, more broadly, learning materials and learning stimulation available at home (Baydar & Akcinar, 2015) mediated the association between other features of SES (i.e., economic well-being and education) and children’s language development. McMullin et al. (2020) found that the mediating role of the number of children’s books between different operationalizations of a family’s SES (e.g., mothers’ education, family income) and children’s language development (i.e., expressive vocabulary) was even more pronounced than that of home learning activities, such as reading to the child or helping the child learn songs, poems, or nursery rhymes (for similar findings, see Martin & Mullis, 2013). 
The number of books at home therefore seems to capture unique aspects of a family’s SES that are key in explaining students’ academic achievement in general and their language development in particular. Moreover, it has been found to mediate the relationship between more distal structural SES features, such as parents’ occupational status and education, and student achievement.

That said, it is important to bear in mind that prior research points to reciprocal relations between literacy activities and language-related learning outcomes, such as reading comprehension (e.g., Harlaar et al., 2007). In a meta-analysis of 99 studies, for instance, Mol and Bus (2011) identified “an upward spiral of causality” (p. 267) of print exposure and language proficiency (i.e., oral language skills such as vocabulary, basic reading and spelling skills, and reading comprehension). Thus, children with a more stimulating reading environment in their homes tend to experience larger growth in oral language and reading comprehension. In turn, the better their reading comprehension, the more they engage in literacy-related activities (Mol & Bus, 2011). Such reciprocal effects and, consequently, reverse causality (cf. Leszczensky & Wolbring, 2019), may also come into play when investigating the relationships between the number of books at home and students’ academic achievement.

3 SES and students’ academic language proficiency

In recent years, a growing number of studies have pointed to social inequalities in students’ academic language proficiency (Uccelli et al., 2019; Volodina et al., 2020, 2021a). Academic language is typically conceived of as the language register of schooling that students must master to participate in classroom discourse, understand textbooks, and accomplish assignments (e.g., Bailey, 2007; Snow, 2010). No unequivocal boundary exists between everyday language, which is used in daily routines and interactions, and the school-based register of academic language; different disciplines also involve different variations of the academic register (e.g., use of subject-specific vocabulary; Snow, 2010). Nevertheless, researchers generally agree that there are certain common language characteristics that are much more prevalent in school-based discourse than in everyday settings and that result in concise, factual, and information-dense texts in both oral and written forms (e.g., Bailey, 2007; Snow, 2010). Among these common characteristics are lexical features such as an often abstract and ambiguous academic vocabulary (e.g., to determine, assumption), an increased use of nominalizations, or the occurrence of complex connectives with very specific meanings (e.g., although, subsequently). These connectives, in turn, contribute to the construction of syntactically complex sentences (e.g., containing embedded subordinate clauses), which form typical grammatical features of academic language (cf. Schleppegrell, 2004; Volodina et al., 2021b).

School textbooks are replete with various features of the academic register. For instance, information in school textbooks is often presented in an abstract way (Achugar & Schleppegrell, 2005; Berendes et al., 2018). As early as elementary school, textbooks in mathematics, science, and social studies expose children to relatively high amounts of academic vocabulary (Fitzgerald et al., 2020, 2022). Moreover, coherence in textbooks is often achieved through the use of connectives (e.g., perhaps, consequently, even; Rodgers, 1974) while extensive explanations are circumvented through the use of short, albeit complex, noun phrases (Berendes et al., 2018).

Over the past few years, several studies have shown that students’ general academic language proficiency is more closely related to academic achievement than more basic language skills and that academic language predicts student performance, even when controlling for general vocabulary or sentence comprehension (e.g., Schuth et al., 2017; Volodina et al., 2021b). These relational patterns have been shown across countries and for different age groups and domains (e.g., Meneses et al., 2018; Volodina et al., 2021b). The findings thus confirm academic language proficiency as a crucial precondition for school success and therefore emphasize its importance as an outcome variable when examining social inequalities in student achievement.

In investigating the relations between students’ family background and their academic language proficiency, previous research drew on a variety of SES indicators, including the number of books at home (e.g., Heppt & Stanat, 2020; Volodina et al., 2020). While most of these studies identified the number of books at home as a significant predictor of students’ academic language comprehension, even when considering other measures such as parents’ education and/or parents’ occupational status, they typically relied only on the number of printed books. That is, the role of important extensions of the traditional books-at-home measure, which might contribute to its incremental validity, has not been investigated systematically.

4 Possible extensions of the traditional books-at-home measure

Studies that draw on the books-at-home measure as an indicator of SES usually use a single item that asks for the number of books at home (e.g., Evans et al., 2010; PISA; OECD, 2014). As this item is frequently accompanied by verbal descriptions and/or illustrations specifying how many books would fit on a bookshelf, it can be assumed that it primarily refers to printed books. While newspapers, magazines, and textbooks are sometimes explicitly excluded from the book count, this is usually not the case for children’s books. Given this ambiguity in item formulation, assessments would benefit from clearer specifications of which sorts of books should be included in the estimate. Moreover, specific kinds of books that might be particularly important for the validity of the books-at-home measure should be assessed separately. Two such indicators are the number of children’s books and the number of ebooks.

4.1 Number of children’s books

The rationale for additionally assessing the number of children’s books is twofold. First, considering the role of the books at home as potential learning stimulation, it can be assumed that for school-aged children, learning interactions are more likely to revolve around children’s books than around parents’ books. Second, it might also be easier for children to estimate the number of their own books than the number of their parents’ books; therefore, this indicator may be more convenient for inclusion in student questionnaires than the traditional books-at-home measure (cf. Pagel, 2016). While previous research reported low agreement between parents’ and children’s information on the number of (parents’) books (e.g., Engzell, 2018; Jerrim & Micklewright, 2014), information on the agreement is not available for the number of children’s books since this measure is typically not assessed in student questionnaires.

A few studies have assessed both the number of parents’ books and the number of children’s books, but the independent effects of both indicators on academic achievement have typically not been singled out. That is, information on the number of parents’ books and the number of children’s books has usually been collapsed into a single variable, which was then used as a predictor of student achievement. While these studies confirm the mediating role of the number of books between parental education and performance in reading, science, and mathematics (Gustafsson et al., 2011; McElvany et al., 2009), they do not allow for an evaluation of the incremental validity of the number of children’s books over the number of parents’ books. As an exception, an unpublished master’s thesis conducted in Germany revealed that parents’ information on the number of children’s books contributed significantly to the explanation of students’ reading comprehension, arithmetic skills, and academic language comprehension over and above the number of parents’ books at home (Pagel, 2016). This finding supports the aforementioned assumption that the number and use of children’s books, even more so than the number of parents’ books, are highly relevant as learning stimulation during childhood. We therefore assume that the number of children’s books predicts students’ academic language comprehension over and above the number of parents’ books.

4.2 Number of ebooks

Considering the ongoing process of digitalization, which is accompanied by increased access to ebooks and digital devices, the traditional books-at-home measure may no longer be an accurate indicator of a family’s cultural resources (Schwippert, 2019). It therefore seems reasonable to additionally ask for the number of ebooks to increase the incremental validity of the books-at-home measure (Pagel, 2016).

Although this argument seems plausible at first sight, empirical findings indicate that, at least in Germany, printed books are still used much more frequently than ebooks. While the percentage of people who sometimes read ebooks increased from 21% in 2013 to 30% in 2020, the share of people who sometimes read printed books grew from 74% to 81% during the same time span (Statista, 2021). This gap is even more pronounced for children: whereas approximately 70% of all children aged 4–13 use printed books at least once a week, the share of children who regularly use ebooks is below 10% at all ages (Kinder-Medien-Studie, 2018; for corresponding findings from the US, see Rideout, 2014). In addition to this rather low prevalence of ebooks, research suggests that parents and children tend to use ebooks and printed books differently. Specifically, they spend less time on content-related talk when engaging in joint ebook reading compared to joint reading of printed books, resulting in limited text comprehension in children (Krcmar & Cingel, 2014; Ross et al., 2016).

Taken together, both the relatively low availability of ebooks and the possibly less effective interaction patterns associated with joint ebook reading may limit the predictive validity of the number of ebooks for students’ academic performance. In line with these considerations, using data from 2014, the previously mentioned master’s thesis found no widespread use of ebooks and no incremental validity of the number of ebooks in the prediction of academic achievement (Pagel, 2016). However, to the best of our knowledge, virtually no published studies have investigated the predictive validity of the number of ebooks as an indicator of SES. The present study addresses this gap by examining whether the number of ebooks contributes to the prediction of students’ academic language comprehension over and above the number of printed books. Given the relatively low prevalence and use of ebooks (Rideout, 2014; Statista, 2021), we expect the effects to be smaller than those for children’s books.

5 Research questions and hypotheses

The current study aims to investigate the incremental validity of the books-at-home measure and selected extensions in predicting students’ academic achievement. Considering that the number of books at home is frequently included in student questionnaires in national and international studies, we additionally explore the role of the information source, that is, whether parents or students provide the information. Specifically, we address the following research questions:

(1) Does the number of books at home contribute to students’ academic language comprehension when controlling for other common indicators of family SES (incremental validity)? Based on theoretical accounts and previous findings that highlighted the distinctness of different SES measures (Bukodi & Goldthorpe, 2013) and the proximity of the number of books at home to other process-oriented aspects of the home literacy environment (Gustafsson et al., 2011; McElvany et al., 2009), we expect the number of books at home to be significantly related to students’ academic language comprehension even when considering important control variables (i.e., students’ grade level and language background) and other indicators of family SES (i.e., parental occupation status and parental education).

(2) Is the incremental validity of the number of books at home increased by extensions of the traditional books-at-home measure (i.e., number of children’s books and number of ebooks)? Assuming that children’s books may be a more relevant resource for student learning and stimulating interactions than parents’ books (cf. Pagel, 2016), we hypothesize that the number of children’s books predicts students’ academic language comprehension over and above the number of parents’ books. Comparably smaller effects are expected for the number of ebooks, given their relatively low proliferation and use (Rideout, 2014; Statista, 2021).

(3) Are the effects of parental occupation status and parental education on students’ academic language comprehension mediated by the number of books and children’s books at home? Whereas parental occupation status and education have previously been described as distal structural features of SES, the number of books at home can be conceived of as a proximal process-oriented feature of SES that should be more closely related to students’ learning outcomes in general and to language development in particular (McElvany et al., 2009). Replicating prior research that identified the number of books as a mediator of the relation between more distal structural features of SES and students’ academic achievement (Gustafsson et al., 2011; McElvany et al., 2009; McMullin et al., 2020; Myrberg & Rosén, 2009), we expect both the number of books at home and the number of children’s books to mediate the relationship between distal structural features of SES (i.e., parental occupation status and parental education) and students’ academic language comprehension.

(4) Does the predictive value of the number of books at home differ by information source (parents vs. children)? As previous research pointed to elementary school children’s limited ability to estimate amounts (cf. Harel et al., 2007) and found rather low consistency between parents’ and children’s information on the number of (parents’) books (e.g., Engzell, 2018; Jerrim & Micklewright, 2014), we assume that parents’ estimates of the number of books predict students’ academic language comprehension more accurately than children’s estimates. Yet, this difference might be attenuated for the number of children’s books, which children may be able to estimate more easily than their parents.

6 Materials and methods

6.1 Study information and sample

Data were drawn from the project “BiSpra-Transfer: Development and validation of a standardized test instrument” (German: BiSpra-Aufgaben: Weiterentwicklung zu einem diagnostisch nutzbaren Testinstrument und Prüfung der Sensitivität für Fördereffekte), which aimed to standardize three measures for assessing the academic language comprehension of primary school children (Heppt et al., 2020; Weinert et al., 2020). The study implemented a cross-sectional design and sampled students attending Grade 2, 3, or 4. Data collection took place in six German federal states in November and December 2017. We randomly sampled full classrooms for participation in the study and the overall sample consisted of 3778 primary school children from three student cohorts (for a more detailed description of the sampling procedure, see Heppt et al., 2020). By design, each student completed two of the three academic language measures. In each school, a coordinating teacher completed a student tracking list and provided information including students’ grade level and gender. We additionally used a student questionnaire and a parent questionnaire to collect information on students’ family background (i.e., language background, parental education and occupation) and the number of books at home. Schools, students, and parents participated in the study voluntarily and we obtained parents’ written consent for their children’s participation.

The analyses presented below are based on a subsample of 2353 students from Grades 2 (n = 828 students from 58 classrooms), 3 (n = 780 students from 56 classrooms), and 4 (n = 745 students from 50 classrooms) who were administered the measure on listening comprehension of academic language at the text level (BiSpra-Text) and for whom information on their language background (German monolingual, bilingual, German as a second language) was available. Students were 8.45 years old on average (SD = 1.03), and half of them were male. As the BiSpra-Transfer project aimed to provide testing norms for both German monolingual students and dual language learners (DLLs; for more detailed descriptions, see Heppt et al., 2020), 31% of the students in the sample simultaneously acquired German and another language (bilinguals), and 23% of the students started learning German when they were 3 years old or older (students with German as a second language).

6.2 Measures

6.2.1 Parents’ information on the number of books at home

Parents indicated (1) the number of printed books at home (excluding ebooks, magazines, newspapers, and children’s books), (2) the number of children’s books (excluding magazines or newspapers), and (3) the number of ebooks. All three items were answered on a 5-point scale. For the number of printed books and the number of children’s books, the five categories were “0–10” (1), “11–25” (2), “26–100” (3), “101–200” (4), and “more than 200” (5). To facilitate estimation, each of these categories was visualized by a bookshelf that included the respective number of books (cf. Richter et al., 2014). The item on the number of ebooks included the categories “0–10” (1), “11–25” (2), “26–50” (3), “51–100” (4), and “more than 100” (5). No visual aid was provided for this item.
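To make the two response scales concrete, the following minimal sketch maps a raw book count onto the 5-point categories described above. It is purely illustrative: respondents in the study chose a category directly, and the function name `book_category` and the constant names are hypothetical.

```python
from bisect import bisect_left

# Upper bounds of categories 1-4, taken from the response scales described above.
# Any count above the last bound falls into category 5.
PRINTED_BOUNDS = (10, 25, 100, 200)  # printed books and children's books
EBOOK_BOUNDS = (10, 25, 50, 100)     # ebooks


def book_category(count, bounds=PRINTED_BOUNDS):
    """Map a raw book count onto the 5-point response scale (1-5).

    Hypothetical helper for illustration only; the questionnaires asked
    for the category itself, not for a raw count.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    # bisect_left returns the index of the first upper bound >= count.
    return bisect_left(bounds, count) + 1
```

Note that the same count can fall into different categories on the two scales: 60 books correspond to category 3 for printed books but to category 4 for ebooks, so the raw category values of the two items are not directly comparable.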

6.2.2 Children’s information on the number of books at home

Children indicated (1) the number of printed books at home (excluding magazines, newspapers, and their own books) and (2) the number of children’s books (excluding magazines, newspapers, and schoolbooks). The items were answered on the same 5-point scale as the items in the parent questionnaire, and the five categories were illustrated by the same pictures of bookshelves. The descriptions of the five categories were slightly more detailed than in the parent questionnaire to make the connection between the pictures and categories clearer. Thus, the five categories were as follows: “none or only very few (0–10 books)” (1), “enough to fill one board in a shelf (11–25 books)” (2), “enough to fill a shelf (26–100 books)” (3), “enough to fill two shelves (101–200 books)” (4), and “enough to fill three or more shelves (more than 200 books)” (5). Children could read the descriptions in their questionnaires, and test administrators also read them out loud to avoid any comprehension difficulties.

6.2.3 HISEI

Furthermore, we asked the parents to report their current occupation. This information was coded according to the International Standard Classification of Occupations (ISCO-08; International Labour Office, 2012) and subsequently transformed into the ISEI (Ganzeboom et al., 1992). The ISEI is a standardized SES measure incorporating the income and necessary education level of a given occupation. It ranges from 10 to 90, with low values indicating occupations associated with low SES (e.g., unskilled workers) and high values indicating occupations associated with high SES (e.g., professors). The HISEI, that is, the higher ISEI of the two parents, was used for the current analyses. When information was available for only one parent, this information served as the HISEI.
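The selection rule for the HISEI can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure actually used for data preparation: the function name `highest_of_parents` is hypothetical, and missing information is assumed to be represented by `None`.

```python
from typing import Optional


def highest_of_parents(a: Optional[float], b: Optional[float]) -> Optional[float]:
    """Return the higher of two parental values.

    If only one value is available, that value is returned; if both are
    missing (None), the result stays missing.
    """
    available = [value for value in (a, b) if value is not None]
    return max(available) if available else None
```

For example, `highest_of_parents(45, 70)` returns 70, and `highest_of_parents(None, 33)` returns 33. The same rule is described for parental education in Sect. 6.2.4.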

6.2.4 Parental education

Parents indicated their highest level of schooling and vocational qualifications; we used this information to assess the families’ educational background. Parents’ information was first coded in line with the International Standard Classification of Education (ISCED-97; OECD, 1999), ranging from 0 (preprimary level of education) to 6 (second stage of tertiary education), and then transformed into the number of years of education (OECD, 2009). The resulting variable, PARED, varies between 0 and 18 years. We entered the higher PARED of the two parents into our analyses. Again, when information was available for only one parent, we used this information.

6.2.5 Academic language comprehension

The test instrument for assessing elementary school children’s academic language comprehension in German (BiSpra 2–4) comprises three subtests, focusing on (1) listening comprehension of academic language at the text level (BiSpra-Text), (2) comprehension of connectives (BiSpra-Sentence), and (3) comprehension of cross-subject academic vocabulary (BiSpra-Word; Heppt et al., 2020). The following sections refer to BiSpra-Text, which broadly covers various academic language demands and can therefore be conceived of as a measure of general academic language comprehension, while the other two subtests assess specific language components (see Footnote 1).

BiSpra-Text has previously been described in detail (e.g., Heppt & Stanat, 2020; Heppt et al., 2020); therefore, we provide only a short description here. The measure assesses primary school students’ ability to understand information and draw inferences from short listening comprehension texts that include a variety of lexical (e.g., cross-subject academic vocabulary, connectives) and grammatical (e.g., constructions in passive voice, syntactically complex sentences) features of the academic language register (cf. Bailey, 2007). The texts are entirely fictional and include made-up words that can be understood from the context, thus limiting undesirable effects of differences in prior content knowledge on students’ test performance. BiSpra-Text includes three slightly different test versions for Grades 2, 3, and 4, each consisting of eight listening comprehension texts with four to seven linguistically simple yes/no questions. While some texts and items occur in only one test version, others appear across two or three test versions. Texts and items are presented auditorily via CD (for detailed test descriptions and sample items, see Heppt et al., 2020; Heppt & Stanat, 2020). The reliability of each of the three test versions was good in the present sample (Grade 2: n_items = 38, α = 0.83; Grade 3: n_items = 42, α = 0.82; Grade 4: n_items = 44, α = 0.83).
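For readers less familiar with the reliability coefficient reported above, Cronbach’s α for k items can be written as α = k/(k − 1) · (1 − Σ item variances / variance of the sum score). The following minimal sketch computes it for dichotomously scored items; the function name `cronbach_alpha` and the toy data are hypothetical and unrelated to the BiSpra data.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])  # number of items
    # Variance of each item across respondents.
    item_vars = [pvariance([person[i] for person in responses]) for i in range(k)]
    # Variance of the sum score across respondents.
    total_var = pvariance([sum(person) for person in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy example: four respondents, three yes/no items scored 1 (correct) or 0.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
alpha = cronbach_alpha(toy)  # 0.75 for this toy data set
```

Population variances are used for both the items and the sum score; using sample variances throughout would yield the same coefficient because the correction factor cancels in the ratio.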

6.3 Analytic procedure

Given the slightly different item sets for Grades 2, 3, and 4 for students’ academic language comprehension, sum scores would not result in meaningful estimates of students’ achievement across grades. We therefore jointly scaled students’ item responses across all three grades using the R package TAM (Robitzsch et al., 2021). In the following analyses, we used weighted maximum likelihood estimates (Warm, 1989) as ability scores for BiSpra-Text.

The amount of missing data varied considerably among the study variables, ranging from 0% for students’ academic language comprehension to 38% for the number of ebooks (Fig. S1 in the Supplemental Material). These differences can be partly explained by the different response rates for test booklets (100%), student questionnaires (98%), and tracking lists (100%) on the one hand and parent questionnaires (77%) on the other hand. To handle these missing data, we applied multiple imputation using the R package mice (van Buuren & Groothuis-Oudshoorn, 2011), which generated 35 complete datasets. In addition to the study variables, the imputation model included a number of auxiliary variables that were substantially correlated with the study variables (e.g., students’ grades in different subjects, perceived need for language support as indicated by the teachers) to make the missing-at-random assumption more plausible (cf. Collins et al., 2001).
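Analyses of multiply imputed data are commonly combined across datasets with Rubin’s rules: the pooled estimate is the mean of the per-dataset estimates, and its variance adds the average sampling variance to the between-dataset variance, inflated by 1 + 1/m. The sketch below illustrates this general principle only; it is not the authors’ analysis code, and the function name `pool_rubin` and the example values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    Returns the pooled point estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # variance across datasets
    total = within + (1 + 1 / m) * between
    return pooled_estimate, sqrt(total)


# Hypothetical coefficients and standard errors from three imputed datasets.
estimate, std_error = pool_rubin([0.2, 0.3, 0.4], [0.1, 0.1, 0.1])
```

The pooled standard error is larger than any single-dataset standard error whenever the estimates differ across datasets, which reflects the additional uncertainty introduced by the missing values.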

To answer our research questions, we subsequently specified a number of multiple linear regression models in Mplus Version 8.4 (Muthén & Muthén, 1998–2019). The mediating role of the number of books and the number of children’s books was investigated by including them as additional dependent variables and by using the “model indirect” command. We accounted for the data’s nested structure (students in classes) with the option “type = complex”, thus yielding robust standard errors. To integrate the results across the 35 datasets, we used the option “type = imputation” (Muthén & Muthén, 1998–2019; see Footnote 2). Given the interrelatedness among the predictor variables, particularly among the SES indicators (see Sect. 7.1), we additionally checked for multicollinearity by determining the variance inflation factors (VIF). Pooled VIFs across all 35 datasets were obtained by specifying the respective regression models in Stata 17 and using mivif (Klein, 2011). While VIFs above 5 are typically considered as indicating substantial multicollinearity (Chatterjee & Simonoff, 2013; Hair et al., 2010; James et al., 2013), some authors propose lower thresholds of VIF ≥ 2.5 (Johnston et al., 2018). The VIFs varied from 1.10 to 2.61 across models and predictors (see Footnote 3). Because multicollinearity inflates standard errors, it is particularly problematic in small samples (e.g., Hair et al., 2010). Given the predominantly low VIFs and the large sample, compromising effects due to multicollinearity are unlikely to occur in the present analyses. The Mplus and Stata code files for the analyses can be found on OSF: https://osf.io/4vumy/?view_only=912d9aafc03d4d3080b4aff359195c54
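To illustrate what a variance inflation factor expresses, the following Python sketch computes VIF_j = 1 / (1 − R_j²) for each predictor in a single dataset, where R_j² stems from regressing predictor j on all remaining predictors. It is a simplified illustration under stated assumptions: it does not reproduce the pooling across imputed datasets, and the function name `vif` and the toy data are hypothetical.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of a predictor matrix X."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        y = X[:, j]                                    # predictor j as outcome
        others = np.delete(X, j, axis=1)               # remaining predictors
        design = np.column_stack([np.ones(n), others]) # add an intercept
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        r2 = 1 - resid.var() / y.var()                 # R^2 of predictor j on the others
        factors.append(1 / (1 - r2))
    return np.array(factors)

# Hypothetical data: two predictors that correlate at r = .80
X_toy = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]])
print(vif(X_toy).round(2))  # both values are about 2.78
```

With two predictors correlating at .80, the VIF of about 2.78 would exceed the stricter threshold of 2.5 but remain below the more common threshold of 5.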

7 Results

7.1 Descriptive results

Descriptive statistics and correlations of the study variables are displayed in Table 1. While the different SES indicators were all substantially and positively correlated, we found particularly high correlations for the number of books and children’s books when the same person (i.e., parents or children) assessed them. Relatively small correlations emerged between the number of ebooks and all other indicators of family SES. Moreover, all indicators were positively correlated with academic language comprehension. In line with previous findings (e.g., Heppt & Stanat, 2020; Heppt et al., 2020; Volodina et al., 2020), we found a positive association between students’ grade level and academic language comprehension. Also in line with previous findings (e.g., Heppt et al., 2020; Volodina et al., 2020), students’ language background was negatively related to academic language comprehension, indicating that DLLs performed below their monolingual peers. Language background was also negatively related to all indicators of SES. We therefore controlled for grade level and language background in the regression analyses. Gender was uncorrelated with academic language comprehension.

Table 1 Descriptive Statistics and Manifest Correlations of the Study Variables

Using nonimputed data, we determined parent–child agreement for the number of parents’ books and children’s books and examined the distributions of the different books-at-home measures by source. Parent–child agreement was rather low (for an evaluation of kappa coefficients, see Cicchetti, 1994) within grades and overall for the number of both parents’ books (Grades 2 to 4: 0.13 ≤ κ ≤ 0.24; overall: κ = 0.20) and children’s books (Grades 2 to 4: 0.11 ≤ κ ≤ 0.20; overall: κ = 0.14). Distributions showed a substantially higher share of missing values for books-at-home measures included in the parent questionnaire than for those included in the student questionnaire (Fig. S1 in the Supplemental Material). Remarkably, the number of missing values was particularly large for the number of ebooks (38%), and almost 47% of the parents indicated that they had 0–10 ebooks compared to 11% who chose this category for the number of printed books and children’s books.
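As an illustration of how a kappa coefficient corrects raw agreement for chance, the following Python sketch computes unweighted Cohen's kappa for two sets of categorical ratings. Whether the coefficients reported above were weighted is not stated here, so the unweighted variant is an assumption; the function name `cohen_kappa` and the toy ratings are hypothetical.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    n = len(ratings_a)
    # Proportion of cases in which both sources chose the same category
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, based on each source's marginal distribution
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical books-at-home categories (1-4) chosen by 8 parent-child pairs
parent = [1, 1, 2, 2, 3, 3, 4, 4]
child = [1, 2, 2, 3, 3, 4, 4, 1]
kappa = cohen_kappa(parent, child)
print(round(kappa, 2))  # 0.33
```

In this toy example, parents and children agree in half of the cases, but because a quarter of the cases would agree by chance alone, kappa is only about .33.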

7.2 Multiple linear regressions for explaining academic language comprehension

We performed a series of multiple linear regressions to explore the importance of the books-at-home measure and its extensions (i.e., number of children’s books, number of ebooks) as well as the role of the information source (parents vs. children) for explaining students’ academic language comprehension (Table 2). We additionally tested the indirect effects of parental education and HISEI on students’ academic language comprehension by adding the number of books at home and the number of children’s books at home as mediating variables (Table 3).

Table 2 Multiple Linear Regression Models Predicting Students’ Academic Language Comprehension (N = 2353)
Table 3 Estimates of Specific Indirect Effects of HISEI and Parental Education via the Number of Books on Students’ Academic Language Comprehension (N = 2353)

Model 1 served as a baseline model and included only the control variables grade and language background, i.e., whether students were bilingual or learned German as a second language. As expected, higher grade levels were associated with better test performance. Both bilingual students and students with German as a second language performed below their monolingual German-speaking peers. The control variables accounted for 20.6% of the variance in the outcome variable.

In Models 2 to 7, the various SES indicators were added step by step. Thus, Model 2 additionally included the families’ HISEI and parental education. Both measures contributed to the explanation of students’ academic language comprehension and resulted in a significant increase in the amount of explained variance (R² = .314, ΔR² = .108). With the next three models, we aimed to investigate the importance of the number of books, children’s books, and ebooks as indicated by the parents. We found positive effects for the number of books, resulting in a 2.1% increase in the amount of explained variance (Model 3). The number of children’s books was a significant predictor above and beyond HISEI, parental education, and the number of books and led to a small but significant increase in explained variance (Model 4; R² = .340, ΔR² = .005; see Footnote 4). However, the number of ebooks, which we additionally entered in Model 5, did not contribute to the explanation of students’ academic language comprehension. Further analyses revealed that parental HISEI and education were both significant predictors of the number of books (βHISEI = .39, p < .05, SE = .03; βparental education = .24, p < .05, SE = .03) and the number of children’s books (βHISEI = .33, p < .05, SE = .03; βparental education = .24, p < .05, SE = .03). Moreover, their relation to students’ academic language comprehension was mediated by the number of books and children’s books, as indicated by the significant Sobel test (Sobel, 1982; Model 4a in Table 3).
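The logic of the Sobel test can be sketched as follows: the indirect effect is the product of path a (SES indicator to mediator) and path b (mediator to outcome), divided by its approximate standard error. In the Python example below, path a uses the coefficient reported above for HISEI predicting the parent-reported number of books (β = .39, SE = .03), whereas the values for path b are hypothetical placeholders because Table 3 is not reproduced here. The function name `sobel_z` is likewise an assumption.

```python
import math

def sobel_z(a, se_a, b, se_b):
    """Sobel z statistic for the indirect effect a * b."""
    indirect = a * b
    se_indirect = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    return indirect / se_indirect

a, se_a = 0.39, 0.03   # reported above: HISEI -> number of books (parent report)
b, se_b = 0.15, 0.03   # hypothetical: number of books -> academic language comprehension

z = sobel_z(a, se_a, b, se_b)
print(round(z, 2))     # values of |z| above 1.96 indicate significance at the 5% level
```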

To investigate the impact of the number of books and number of children’s books as assessed by the children, we removed parents’ information on the number of books at home and added information gathered from the students instead (Model 6). Both variables significantly contributed to the explanation of the outcome variable. Compared to Model 2, which included only the control variables as well as HISEI and parental education, additionally considering students’ information on their own and their parents’ books significantly increased the amount of explained variance by 1.1%, from 31.4 to 32.5%. Similar to parents’ information on the number of books at home, children’s information on the number of books (βHISEI = .26, p < .05, SE = .03; βparental education = .14, p < .05, SE = .03) and children’s books at home (βHISEI = .18, p < .05, SE = .03; βparental education = .11, p < .05, SE = .03) was predicted by parental HISEI and education. Both variables mediated the relation between the two distal structural features of SES and students’ academic language comprehension (Model 6a in Table 3).

Ultimately, Model 7 included all variables that were significant predictors of students’ academic language comprehension in the previous models, thus investigating whether the predictive power of the number of books and the number of children’s books at home differed by information source (parents vs. children). In this final model, we found that the number of books and children’s books as indicated by the parents were both significant predictors of students’ academic language comprehension, even when considering the controls, HISEI, and parental education. However, no effects emerged for the number of books and children’s books when information was collected from the children. The amount of explained variance was 34.3%, which was almost the same as in Model 4. The results thus indicate the importance of parents’ information as opposed to children’s information when using the number of books at home as a predictor of academic achievement.

8 Discussion

The current study investigated the incremental validity of the books-at-home measure beyond other commonly used SES indicators (i.e., parents’ occupational status and education) in explaining academic language comprehension. In particular, we examined whether selected extensions of the traditional books-at-home measure, namely, the number of children’s books and the number of ebooks, increase the validity of the books-at-home measure and whether the predictive value of the number of books and the number of children’s books differs by information source (i.e., whether parents or children answered the question). Additional analyses examined whether the number of books and children’s books as indicated by both parents and children mediated the relation between more distal structural SES features and students’ academic language comprehension.

8.1 Predictive validity of the books-at-home measure and its extensions

We found that parents’ information on the number of books and children’s books at home significantly increased the amount of explained variance in students’ academic language comprehension, even when considering parental education and occupational status (HISEI). The number of ebooks, however, did not contribute to the explanation of students’ academic language comprehension. A similar pattern of results emerged when we used children’s information instead of parents’ information: Children’s estimates, which were assessed only for the number of books and the number of their own books but not for the number of ebooks, contributed significantly to the explanation of their academic language comprehension. Moreover, parents’ books and children’s books, both when assessed by parents and by children, mediated the relationship between parents’ occupational status and education, on the one hand, and students’ academic language comprehension, on the other hand. When simultaneously considering parents’ and children’s estimates of the number of books at home, however, only parents’ information on the number of books and the number of children’s books remained significant. The results thus show that children’s information on the number of books at home is of limited predictive value compared to parents’ information for explaining student achievement (see Table S5 in the Supplemental Material for additional findings in support of this interpretation). Overall, the findings indicate that when used in the parent questionnaire, the validity of the traditional books-at-home measure is not compromised by the greater availability of ebooks and can be slightly increased by additionally considering the number of children’s books at home.

The present study’s results corroborate previous findings that confirm the interrelatedness of different SES indicators while underlining their distinctiveness (e.g., Bukodi & Goldthorpe, 2013). Thus, although the different measures for assessing a family’s SES showed substantial amounts of shared variance, parental occupation status (HISEI), parental education, number of books, and number of children’s books all contributed independently to the explanation of student achievement. This suggests that they capture slightly different aspects of SES and cannot be used interchangeably. Simultaneously, the present study adds to the literature by examining the importance of the number of children’s books and the number of ebooks as well as the role of the respondent (parents vs. children) in increasing the predictive value of the traditional books-at-home measure. Furthermore, mediation analyses’ results confirmed and extended prior research that pointed to the mediating role of combined indices of the number of books and children’s books for the relation between different SES measures and students’ academic achievement (e.g., McElvany et al., 2009; McMullin et al., 2020; Myrberg & Rosén, 2009). The present findings thus further support the assumption that parents’ occupational status and education should be conceived of as distal structural features of SES whose impact on students’ learning outcomes can at least partly be explained by more proximal process-oriented features such as the learning stimulation tied to books and children’s books available at home (cf. Gustafsson et al., 2011; McElvany et al., 2009).

In terms of underlying mechanisms, which help to explain the relation between the number of books at home and student academic achievement, theoretical considerations suggest that the number of children’s books may be a better proxy for learning stimulation and joint reading activities than the number of parents’ books and, thus, may be an even more valid indicator of students’ cultural capital and learning resources. However, remarkably, this measure is not typically considered in large-scale assessments (with the exception of TIMSS and PIRLS), and its individual effects in predicting student achievement (i.e., net of the number of parents’ books) are usually not investigated. Hence, whereas prior research mostly used combined measures of the number of parents’ and children’s books and found that they were positively related to, for instance, students’ reading comprehension (Gustafsson et al., 2011; McElvany et al., 2009), we established the number of children’s books as an independent predictor of students’ academic language comprehension.

For the number of ebooks, the present findings challenge the assumption that the process of digitalization and the greater availability of digital devices threaten the validity of the traditional books-at-home measure (cf. Schwippert, 2019). In line with prior studies on media use (Kinder-Medien-Studie, 2018; Statista, 2021) that did not report widespread use of ebooks, the vast majority of parents in our study indicated owning 0 to 10 ebooks or chose not to answer the question. We suspect that this very uneven distribution and resulting variance restriction, which differed sharply from all the other books-at-home indicators (Fig. S1 in the Supplemental Material), is the driving force for explaining the null effects in the present study (for convergent findings, see Pagel, 2016). Potential differences in the use of ebooks and printed books in joint interactions among parents and children (cf. Krcmar & Cingel, 2014; Ross et al., 2016) might additionally have come into play.

The present study is among the very few to examine parents’ and children’s ratings of the number of books at home (e.g., Engzell, 2018; Jerrim & Micklewright, 2014) and, to the best of our knowledge, is the first to compare the predictive value of the number of books and the number of children’s books for these different information sources. In line with the studies by Engzell (2018) and Jerrim and Micklewright (2014), for instance, we found exceptionally low agreement between parents’ and children’s information on parents’ books and children’s books. Although it is not possible to determine which ratings are more accurate, we have reason to assume that elementary school children are the less reliable information source given their limited capacity for estimating amounts (Harel et al., 2007). Additionally, considering the stronger relations between parents’ estimates and students’ academic achievement compared to students’ estimates, our findings support the inclusion of the books-at-home measure in the parent questionnaire rather than in the student questionnaire (see Hovestadt & Schneider, 2021, for convergent findings regarding parental education).

8.2 Limitations, future directions, and conclusion

Several limitations exist in the current study. First, no information was available on the processes that occur within families, which might help to explain the effects of the number of books and the number of children’s books on students’ learning outcomes. While it can reasonably be assumed that books at home form an important basis for home literacy activities, such as joint reading activities or talking about reading experiences (e.g., Martin & Mullis, 2013; McElvany et al., 2009), future studies should deliberately assess students’ home literacy activities to inform our understanding of the role of parents’ books compared to children’s books for student outcomes.

Second, the present analyses are based on cross-sectional data and, thus, do not allow for drawing causal inferences. We assumed and modeled the various SES indicators as predictors of students’ academic language comprehension and can reasonably exclude reverse causality for most relations. Parents’ number of books and, in particular, their occupational status and education are most likely unaffected by their children’s academic language comprehension. However, reciprocal relations may occur between academic language comprehension and children’s own books, as children with greater mastery of the academic register may demand and be supplied with more books than students who are less proficient in academic language comprehension (cf. Mol & Bus, 2011).

Third, students’ academic language comprehension was the only outcome measure involved in our analyses. Academic language proficiency has been shown to be substantially related to competencies in a variety of domains, such as reading comprehension, mathematics, and science (e.g., Schuth et al., 2017; Volodina et al., 2021b), thus confirming it as a meaningful variable for investigating social inequalities in student achievement. However, the predictive value of the books-at-home measure and its extensions might be smaller for less language-bound measures.

Fourth, although we controlled for parents’ occupational status and education, which are both substantially related to the various books-at-home measures and students’ academic language comprehension, further measures that might help capture an even more nuanced picture of the relation between the number of books at home and student achievement were not considered. Specifically, information on parents’ home ownership, living space, and recent or upcoming moves, all of which may relate to the number of books at home, was not included in the dataset. In particular, frequent relocations may go along with a diminished personal book stock independent of a person’s occupational status and education.

Despite these limitations and open questions, which are subject to future research, the present study’s results may serve as an important basis for selecting and assessing SES indicators in social research. In particular, they increase our knowledge of the validity of the books-at-home measure, which is ubiquitously used in surveys and large-scale assessments but the quality of which has only rarely been scrutinized.