Abstract
Using panel data on school-class networks of 11–13-year-old students, this study investigates the effects of schoolwork collaboration networks on grades and school-related wellbeing. It suggests propensity score weighting regression as a method of causal inference for data collected in social contexts and for studies analyzing node attributes as outcomes of interest. It will be argued that this alternative approach is useful when stochastic actor-based models (SAOMs) show convergence problems in sparse networks. Three methods of causal analysis that deal with the problems of endogeneity bias and interference between observations will be discussed: first, SAOMs for the coevolution of networks and behavior/attitudes will be estimated, but this results in a systematic loss of data. Second, propensity score matching compares treated cases with untreated nearest neighbors. However, the stable unit treatment value assumption (SUTVA) requires that the analysis control for network embeddedness in the final analysis. This is possible by using propensity score weighting regression, the third method, which is a flexible approach to capture treatment diffusion via multiplex networks.
Introduction
Human communities depend on the diffusion of information through networks, social influence (Henrich 2016) and social exchange (Windzio 2018). While most social network analyses in the 1960s and 1970s were rather descriptive (Prell 2012), recent studies focus on network diffusion (Valente 1995) and the coevolution of network ties and behavior (Snijders et al. 2010). New methods for causal inference sensitized researchers to the pitfalls of deriving causal conclusions from cross-sectional data, but also from longitudinal regression models (Morgan and Winship 2007; Brüderl and Ludwig 2014, 331p; VanderWeele and An 2014). Causal inference is important in research fields where social influence, contagion or diffusion regularly occur. Nevertheless, the development of longitudinal methods for the analysis of networks took place more or less separately from innovations in methods for causal inference. Surprisingly, appropriate methods of causal inference for network data are rather new (Robins 2015, 216 pp), and only a few network researchers seem to be familiar with the literature on causal inference (VanderWeele and An 2014; An 2018; Aral and Nicolaides 2017).
Studies on peer influence face the challenge of disentangling selection and influence (Ragan et al. 2019): cross-sectional correlations between ego’s and alter’s characteristics do not reveal whether we select our peers with respect to certain characteristics, or whether we assimilate towards our peers’ characteristics (Shalizi and Thomas 2011). Undoubtedly, stochastic actor-oriented models (SAOMs) for the coevolution of networks and behavior (Snijders et al. 2010) are a breakthrough for empirical research. In some cases, however, SAOMs are prone to convergence problems if networks are either sparse or do not show an appropriate ratio of stability and change. Excluding non-converging networks from the analysis can lead to a considerable loss of data, which is why ‘conventional’ methods of causal analysis such as propensity score matching (PSM) are worthy of consideration, although network data violate the assumption of independent observations. In a recent contribution, Ragan et al. (2019) showed that conventional methods of panel data analysis, namely random effects, hybrid fixed effects and lagged hybrid fixed effects, do not tend to overestimate peer influence compared with SAOMs. Nevertheless, in most network data the stable unit treatment value assumption (SUTVA) is violated, and causal inference from methods that do not explicitly account for the embeddedness of dyads in the surrounding network structure should be interpreted with caution. According to the SUTVA, causal inference, e.g. by using propensity score matching, is only reliable when there is no diffusion of relevant information between cases in the treatment group and cases in the control group.
The present study investigates the challenge of estimating causal effects of ties in dyadic collaboration networks. In a first step, SAOMs for the coevolution of networks and behavior will estimate the effect of schoolwork collaboration on grades and wellbeing in school. Non-convergence of SAOMs (VanderWeele and An 2014: 368) indeed results in a considerable loss of data, for instance due to the sparseness of ties in the schoolwork network. SAOMs are also vulnerable to omitted variable bias (Shalizi and Thomas 2011, 218; Robins 2015, 220; Ragan et al. 2019, 25), e.g. when an explanatory variable x is correlated with the error term e (corr(x,e) ≠ 0) and is contaminated with (unobserved) information related to x and the dependent variable y. Secondly, propensity score matching will be used to estimate the causal effect of ties in the schoolwork network on grades and on wellbeing. Since this approach suffers from the violation of the SUTVA and does not account for inherent spillover from neighboring dyads, propensity score weighting regression (Morgan and Winship 2007; Guo and Fraser 2010) will be suggested as an alternative third approach. In line with the model suggested by An (2018), propensity score weighting regression can account for multiplex networks, in this case for the embeddedness of collaborative dyads in friendship networks. Friendships between treated and non-treated students can be controlled in order to account for potential ‘diffusion’ of information across groups. The method captures at least partially the effect of contact among observations and is more in line with the SUTVA. Further developing propensity score weighting regression and related methods for the analysis of causal effects in network data can be a fruitful alternative to SAOMs in situations of limited data or sparse networks.
Analyzing outcomes of network ties
Communication among adolescents in networks is considered increasingly important, e.g. for political mobilization (Saud 2018; Ida et al. 2020) and cooperation. Over the last three decades, schoolchildren’s group work has become a growing research field in education science (Howe and Tolmie 2003). The outcome of interest is usually the development of academic performance (e.g. grades) (Webb 1989; Crosnoe 2000; Lubbers 2004). In addition, social benefits and pupils’ wellbeing are desired results of group work, since they affect the learning environment in the classroom (Howe and Tolmie 2003). Providing causal evidence on the effects of group-work networks is far from trivial. When two pupils become involved in networks of schoolwork collaboration, they might also have similar attitudes towards academic issues. This selectivity can induce endogeneity bias (Morgan and Winship 2007: 77p) in the estimation of a causal effect.
Gremmen et al. (2017) analyzed the coevolution of adolescents’ friendship and academic achievement. According to their six-wave network analysis in the first and second year of secondary school, students select friends on the basis of alters’ grades. Subsequently, their own grades develop in the same direction as their alters’ grades (“First selection, then influence”) (Gremmen et al. 2017). To date, systematic analyses of networks of schoolwork collaboration are rare. Schoolwork networks have been analyzed (Windzio 2013; Ivaniushina et al. 2016), but not the impact of these networks on outcomes.
Studies on the diffusion of knowledge, behaviour or attitudes are well established (Rogers 2003), even though the underlying social structure through which diffusion proceeds, namely the network, was not systematically considered before T. Valente’s work (Valente 1995). In the early standard models of network diffusion, the hazard rate of adoption at time t depends on ties to e.g. infectors or opinion leaders at t − 1. However, there was no systematic treatment of selection processes due to the characteristics of interest, e.g. when non-infected persons get into contact with infected persons in order to care for them, and thereby contract the disease. Criticism of Christakis and Fowler’s (2007) study on the diffusion of obesity through networks sensitized researchers to the problem of latent homophily when making causal inferences in network diffusion studies. Shalizi and Thomas (2011) demonstrated how difficult it is to statistically disentangle selection and influence, even in longitudinal settings, and how vulnerable diffusion models in general are to omitted variable bias.
Analysing the diffusion of a mobile service application in a large global instant messaging network, Aral et al. (2009) used propensity score matching (PSM) methods. The treatment was the presence of adopters in the subjects’ local messaging network, and was predicted by a vector of behavioral and demographic covariates of the respective individual at time t (Aral et al. 2009). The authors conclude that failing to account for selection, which PSM makes possible, would lead to a 700% overestimation of the treatment effect.
Arpino et al. (2017) combined social network data with PSM and analyzed the effects of countries’ GATT membership (General Agreement on Tariffs and Trade) on global economic interdependence. The units of analysis were 1319 country dyads in the year 1954. The outcome of interest was the log of trade flows in the respective dyad at t + 1. Their study is an important substantive contribution to the field, but also to the methodological problem of applying propensity score methods to network data. When predicting the propensity score, Arpino et al. (2017) controlled for a set of network statistics such as a node’s degree centrality and local and global clustering. The authors assume that the statistical non-independence of dyads is controlled when the treatment is predicted, because the selection model accounts for the network characteristics.
Propensity score matching methods of causal inference for networks imply the ‘no interference’ assumption: subjects are independent of each other and there is no diffusion of information from treated to control cases (VanderWeele and An 2014). More precisely, the stable unit treatment value assumption (SUTVA) states “… that the value of Y for unit i when exposed to the treatment w will be the same no matter what mechanism is used to assign treatment w to unit i and no matter what treatments the other units receive” (Guo and Fraser 2010: 35). In other words, it is assumed “… that there is no contamination, no information shared, between treated and untreated matched samples …” (Barringer et al. 2014, p. 18). Ruling out contamination and diffusion is obviously difficult in social network studies, but the problem also affects studies on school classes where networks remain unobserved, yet the social processes that networks could represent nevertheless exist.
A revealing example of how information spreads through networks and thereby affects non-treated subjects comes from a smoking-prevention intervention study. The study showed how friendship ties significantly increased the log odds of receiving information from an intervention brochure (An and VanderWeele 2019). Whilst interference is a nuisance in many research settings when it violates the SUTVA, it is here the outcome of interest. An (2018) explicitly measured information diffusion in the survey, but admits that “… treatment diffusion data may not be readily available” in most studies (An 2018: 172).
Figure 1 shows a school-class network where black lines represent ties of schoolwork collaboration and grey lines children’s friendships. Having network ties in two different dimensions (friendship and schoolwork collaboration) is called “multiplexity” in social network terminology. The issue of contamination and shared information becomes obvious in Fig. 1: dyad 1 is just a friendship; there is no “treatment” by joint schoolwork collaboration. If both students in dyad 2 benefit from collaboration (see below), then dyad 1 will also show an average increase in competence, because the upper-left actor of dyad 2 is also involved in the untreated dyad 1. There is contamination between these two dyads. Furthermore, information from a treatment could also pass through the network over several steps. Violation of the SUTVA might be an issue in the studies of Aral et al. (2009) and Arpino et al. (2017), where this problem has not been systematically discussed.
Units of the following PSM analysis are dyads. The specific characteristic of collaboration networks is the (potential) mutual benefit of both actors in a collaborative dyad. Figure 2 shows four possible dyadic constellations A–D of a collaboration effect in schoolwork networks.
In scenario A of symmetric collaboration, the dark grey small circles indicate a gain in competence for ego and alter. This is the ideal case of collective good generation. In scenario B there is an asymmetric, unidirectional transfer of competence from the left to the right actor. Scenario C is asymmetric as well, but due to the collaboration the left actor also further increases his or her competence, since explaining academic issues usually consolidates skills and knowledge on the sender’s side as well. Finally, the sender in D benefits, but does not succeed in transferring competence to the potential receiver.
There is a specific problem of causal inference of dyadic collaboration effects in a network N. With respect to the average benefit in a dyad, the empty dyad {d, i} ∈ N in Fig. 2 benefits from the collaboration in {c, d} ∈ N, even though the actors in {d, i} ∈ N do not collaborate. If, however, collaboration generates a collective good, the outcome is dyadic. While this is not a problem for the SAOM (see below) because it is actor-oriented, it has consequences for propensity score matching based on dyads. Propensity score weighting regression, in contrast, allows addressing this problem by controlling for the relevant covariates.
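The spillover into the empty dyad can be illustrated with a small worked example. The scores, the size of the gain and the assumption that both collaborators benefit equally (scenario A) are hypothetical and chosen only for illustration; the actor labels c, d and i follow Fig. 2.

```python
# Hypothetical competence scores for three actors before collaboration.
before = {"c": 3.0, "d": 3.0, "i": 3.0}

# Assumption: collaboration in the treated dyad {c, d} raises both members by 1 point.
gain = 1.0
after = dict(before)
for actor in ("c", "d"):
    after[actor] += gain


def dyad_mean(scores, dyad):
    """Average score of the two actors in a dyad."""
    a, b = dyad
    return (scores[a] + scores[b]) / 2


treated, untreated = ("c", "d"), ("d", "i")
change_treated = dyad_mean(after, treated) - dyad_mean(before, treated)
change_untreated = dyad_mean(after, untreated) - dyad_mean(before, untreated)

print(change_treated, change_untreated)
```

Although {d, i} receives no treatment, its dyadic mean rises by half of the gain because actor d belongs to both dyads. A comparison of treated and untreated dyads would therefore understate the effect in this constructed case.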
Modeling approach
Figure 3 summarizes the analytical approach graphically. The first step is the SAOM. In step 2, a probit p* model will be estimated to predict the selectivity of ties in schoolwork networks, also by parental contact and children’s friendship (Table 1). Since the causal order between ties in these network dimensions could also be reversed, observed values of the respective independent variable (friendship, parental contact) have been replaced by an instrumented variable, where the instruments are derived from the models in Table 6 in the “Appendix”. Ties in the parental contact and friendship networks have been predicted by network self-organization (Lusher et al. 2013; Windzio 2015), namely by the network structural effects of mutuality, 2-in-stars and 2-out-stars, transitive and cyclic triads, and same sex (the latter in the parental contact network only). The reason for estimating the models in Table 6 (“Appendix”) is that they predict the propensity of a tie in a friendship or parental contact network basically by ‘network self-organization’. The results are instrumented variables, which will be used as explanatory variables in the probit model in Table 1 instead of the observed network ties (Windzio 2015, and see below).
These models allow a good prediction of the propensity to form a tie in the schoolwork collaboration network used in the matching analysis. Propensity scores are predicted separately for different subgroups in order to obtain the conditional average treatment effect on the treated (CATT) in step 2 (columns (2) and (3) in Table 1). The subgroups of the CATT are dyads with at least one migrant, and dyads in whole-day schools.
In the first approach, stochastic actor-based models of the coevolution of networks and behaviour are applied to disentangle the effects of selection and influence in longitudinal network data (Snijders et al. 2010). Losing a large amount of data in the SAOM may induce bias, which is why, secondly, propensity score matching is used (step 2). This method is based on the ‘stable unit treatment value assumption’ (SUTVA): there should not be any interaction between units from the treatment and control groups (Morgan and Winship 2007; Gangl 2010). Moreover, having social network information in the data allows at least controlling for the embeddedness of dyads in the surrounding social network, e.g. in transitive and cyclic triads, 2-in-stars and 2-out-stars, and patterns of mutuality. This will be the third approach, the propensity score weighting regression model (Guo and Fraser 2010) (step 3).
What are the advantages of propensity score weighting regression?

1. Predicting the propensity score by a p* model for ties in networks takes the statistical non-independence of dyads into account. However, the final estimation of the average treatment effect on the treated (ATT) is also affected by the statistical non-independence of dyads. By using propensity score weighting regression, researchers can control for a large set of independent variables in the second step, and can thereby take the embeddedness of a dyad in the surrounding subnetwork into account, e.g. the diffusion of information on the treatment ‘schoolwork collaboration’ via friendship networks. Arpino et al. (2017: 545) mention propensity score weighting regression as an option, but regret at the same time: “Unfortunately, there is little guidance on how to select between propensity score methods”. Their study might also have benefited from the advantage of including further control variables in a regression model.

2. The propensity score weighting regression model can account for the multiplexity of social networks (VanderWeele and An 2014: 370) by including these networks in the final estimation. Depending on the research question, contact and interference (VanderWeele and An 2014: 353) in different network dimensions and in various forms, also between treated and untreated, can be included in the final model.

3. In a longitudinal setting, the final estimation of the ATT can account for a lagged dependent variable in the propensity score weighting regression and thereby allows at least for a test of the causal effect (VanderWeele and An 2014: 366). Furthermore, a longitudinal analysis can use a differences-in-differences (DiD) estimator, which is a further advantage in the identification of a causal effect.

4. Controlling for network embeddedness in propensity score weighting regression is particularly important if the bias reduction due to propensity score matching is insufficient. Appropriate matching usually reduces bias for (observed) variables that induce the selectivity of the treatment. In some cases, however, the matching procedure does not sufficiently reduce the bias for indicators of network embeddedness, which might be due to a strong dependence of observations in networks. Instead of simply accepting insufficient bias reduction for network indicators and just proceeding with propensity score matching, or abstaining from causal analysis, propensity score weighting regression allows controlling for these network indicators in the final prediction of the outcome of interest.
Data and measurement
Our school survey collected three-wave panel data on 1676 students between 2010 and 2012 in grades 5–7 in two cities in northern Germany (Windzio 2018). Respondents were 10–12-year-old pupils. The population consisted of 149 fifth-grade school classes, out of which 94 classes in 55 registered schools participated in the first wave (time 1 in 2010). The response varied between these three waves; 1087 children in 58 school classes completed the questionnaire in wave 2, and 1561 children from 65 classes in wave 3. The majority of school principals were willing to cooperate, but teachers could decide on participation. Non-response occurred predominantly at the class level. At the children’s level, response rates varied between 75.4% (wave 1 in 2010), 80.4% (wave 2 in 2011) and 80.4% (wave 3 in 2012). Only classes where either N = 17 or 75% of pupils were present during the survey were included in the network analysis. Moreover, since the propensity score analysis uses longitudinal information for the differences-in-differences estimation (see below), only classes that participated in the first two waves could be used. As a result of this selection rule and of item non-response in the pupils’ cases, N = 382 pupils and 6170 dyads in 26 classrooms were potentially available (see “Appendix” Table 5). Moreover, for the SAOM the data were limited to classes where information on all three waves was available. Due to convergence issues of the SAOM, the sample has been limited to 6–12 classes. The mean number of students in a class is 23.7 (grade) and 24.4 (wellbeing), with a range between 20 and 30, and the total number of students varies between N = 148 and N = 281.
During the class interview, pupils filled out the questionnaires under the interviewers’ guidance. To guarantee the anonymity of the information, clearly visible ID numbers were placed on the pupils’ desks during the survey. Each pupil entered his or her own ID number into the questionnaire, and recorded network contacts with classmates by entering their ID numbers.
Methods
Processes of network self-organization motivate the inclusion of structural parameters in models predicting ties in networks. One important self-organizing process in networks is closure, especially in terms of transitive triads (Lusher et al. 2013; Windzio 2015). Such processes are considered “…’purely structural’ effects because they do not involve actor attributes or other exogenous factors. … the network patterns arise solely from the internal processes of the system of network ties” (Lusher et al. 2013: 23). We take advantage of the ‘purely structural’ internal processes by constructing an instrument in step 2 (Fig. 3). This variable is the predicted propensity of a tie, e.g. by triadic closure (see Table 6, “Appendix”, for the models). Only the particular component of a change in x (e.g. parental contact) which results from a change in z (e.g. transitive closure) will be used to predict y (e.g. a tie in the friendship network) (Morgan and Winship 2007, 190). The instrumental variable estimator β_{IV} is thus:
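For a single instrument z, the textbook form of this estimator is a ratio of two covariances. The expression below is a sketch in the symbols x, y and z used above; it assumes one instrument and a linear specification, which may differ from the exact expression intended here.

```latex
\hat{\beta}_{IV} = \frac{\operatorname{Cov}(y, z)}{\operatorname{Cov}(x, z)}
```

The numerator captures how y varies with the instrument, and the denominator how x varies with it, so that only the variation in x induced by z enters the estimate.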
Transitivity-based predictions in the friendship and parental contact networks will be used as explanatory variables in the p* model to predict ties in the schoolwork network (Table 6, “Appendix”). Subsequently, predictions from the p* models in Table 1 will be used to compute a propensity score (Guo and Fraser 2010) (step 2 in Fig. 3). The p* model is a pseudo-likelihood estimation of the probability of a tie in a network and is a variant of an exponential random graph model (ERGM) (Harris 2014: 23).
The first approach to analysing the causal effect of schoolwork collaboration on grades and wellbeing is the stochastic actor-oriented model (SAOM) (step 1 in Fig. 3), which disentangles the effects of selection into network ties and social influence (Snijders et al. 2010). The algorithm (SIENA) simulates changes of networks between discrete states by assuming continuous micro-steps of actors’ decisions between measurements. The resulting coefficients represent the log odds of observing a tie in the network and the effects on behavioural change.
For the second approach (step 2 in Fig. 3), propensity scores have been predicted from the subgroup-specific probit p* models in Table 1. Figure 4 compares the distribution of the propensity score between the treated and the non-treated in the overall sample, whereby treatment of a dyad is defined as having a tie in the schoolwork network. The overlap of the distributions indicates a good condition for propensity score matching analyses. The average treatment effect on the treated (ATT) will be estimated in two variants (Gangl 2010): First, nearest-neighbour matching with four observations of the non-treated as matches (NN+4) will be used to estimate effects on grades or wellbeing at time 2. Second, the same procedure will be applied to compute differences-in-differences (DiD). The DiD approach also includes a lagged dependent variable. Different variants will be estimated by using calipers with values of 0.2 and 0.01. Calipers are restrictions on the maximal distance between nearest neighbours. If the distance is exceeded, cases will not be matched even if they are nearest neighbours (Guo and Fraser 2010: 147).
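The matching logic described above can be sketched in a few lines. The function name, the toy data, and the choice of matching with replacement are assumptions made for illustration; they are not taken from the study, which relies on established matching software.

```python
def att_nearest_neighbour(treated, controls, k=4, caliper=None):
    """ATT from k-nearest-neighbour matching on the propensity score.

    treated, controls: lists of (propensity_score, outcome) tuples.
    caliper: maximal allowed score distance; None disables the restriction.
    Assumption: controls are matched with replacement.
    """
    effects = []
    for score_t, outcome_t in treated:
        # Rank all controls by their distance to the treated unit's score.
        ranked = sorted(controls, key=lambda c: abs(c[0] - score_t))
        matches = ranked[:k]
        if caliper is not None:
            # Drop neighbours that are nearest but still too far away.
            matches = [c for c in matches if abs(c[0] - score_t) <= caliper]
        if not matches:
            continue  # the treated unit stays unmatched
        counterfactual = sum(c[1] for c in matches) / len(matches)
        effects.append(outcome_t - counterfactual)
    if not effects:
        return None
    return sum(effects) / len(effects)


# Hypothetical dyads: (propensity score, outcome at time 2).
treated = [(0.60, 2.0), (0.30, 3.0)]
controls = [(0.58, 2.5), (0.62, 2.5), (0.31, 3.0), (0.90, 1.0)]

att_loose = att_nearest_neighbour(treated, controls, k=2)
att_tight = att_nearest_neighbour(treated, controls, k=2, caliper=0.05)
print(att_loose, att_tight)
```

In the toy data the caliper removes a distant second neighbour for one treated dyad, which changes the estimate. This is the mechanism by which tighter calipers trade sample size for match quality.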
This propensity score will also be used to create the probability weight for propensity score weighting (Guo and Fraser 2010: 173) (step 3 in Fig. 3).
The propensity score weight ω was constructed as the treatment indicator W (0 or 1), plus 1 minus the indicator W times the probability of the treatment ê, divided by 1 minus the probability of the treatment ê, that is, ω = W + (1 − W) · ê / (1 − ê) (Guo and Fraser 2010, p. 161).
If relational information on students’ ties is available, we can control for simple forms of embeddedness of dyads in the surrounding network. We control for mutuality, 2-in-stars, 2-out-stars, transitive triads, and cyclic triads in the friendship network. In addition, we can measure the number of each student’s friends who belong to the treatment group. Interacting this indicator with the treatment status captures, at least to some degree, the social embeddedness and thereby the possible diffusion of information between treated and untreated dyads.
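As a minimal sketch of how such a weight enters the estimation, the following code computes the weight for each dyad and a weighted difference in mean outcomes. This difference corresponds to a weighted regression of the outcome on the treatment indicator without any further covariates, which is a simplification of the models estimated here. Function names and data are hypothetical.

```python
def att_weight(w, e_hat):
    """Weight for ATT estimation: 1 for treated, e/(1 - e) for untreated."""
    return w + (1 - w) * e_hat / (1 - e_hat)


def weighted_att(rows):
    """Weighted mean difference in outcomes between treated and untreated.

    rows: list of (treatment indicator W, propensity score e_hat, outcome y).
    """
    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}  # per group: [sum of w*y, sum of w]
    for w, e_hat, y in rows:
        omega = att_weight(w, e_hat)
        sums[w][0] += omega * y
        sums[w][1] += omega
    mean_treated = sums[1][0] / sums[1][1]
    mean_control = sums[0][0] / sums[0][1]
    return mean_treated - mean_control


# Hypothetical dyads: (W, e_hat, outcome).
rows = [(1, 0.8, 2.0), (1, 0.5, 3.0), (0, 0.5, 3.0), (0, 0.2, 4.0)]
estimate = weighted_att(rows)
print(round(estimate, 3))
```

Untreated dyads with a high propensity score receive a large weight, so the weighted control group resembles the treated group in its covariate distribution.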
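The indicator for treated friends and its interaction with treatment status can be derived directly from the tie lists. The sketch below uses a hypothetical class of five students with undirected ties, and assumes that a student counts as treated when he or she belongs to at least one schoolwork dyad; the coding in the study itself may differ.

```python
# Hypothetical class: undirected friendship ties and schoolwork (treatment) ties.
friendships = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
schoolwork = [("a", "b")]

# Assumption: a student is treated when part of at least one schoolwork dyad.
treated_students = {s for dyad in schoolwork for s in dyad}


def treated_friend_counts(friendships, treated_students):
    """Number of treated friends per student."""
    counts = {}
    for s, t in friendships:
        counts.setdefault(s, 0)
        counts.setdefault(t, 0)
        if t in treated_students:
            counts[s] += 1
        if s in treated_students:
            counts[t] += 1
    return counts


counts = treated_friend_counts(friendships, treated_students)

# Interaction term: the count multiplied by the student's own treatment status.
interaction = {s: n * (s in treated_students) for s, n in counts.items()}
print(counts, interaction)
```

Student c is untreated but has one treated friend, so the main effect picks up c as a potential recipient of diffused treatment information, while the interaction term is zero for c.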
Results
Table 1 shows predictions of propensity scores by three probit p* models. We see that the instrumented variable ‘parental contact (log)(IV)’ (predicted by models in Table 6, “Appendix”) has a positive and highly significant effect on ties in the schoolwork network (all: 0.126***; at least one migrant: 0.132***), except for whole-day schools. Similarly, the effect on ties in the friendship network is significantly positive in all three models. Ties in the friendship network might capture a considerable part of homophily. Spatial proximity matters for ties in all three networks: if ego lives close to alter (walk within 5 min), the propensity of a schoolwork tie is considerably increased (e.g. at least one migrant: 0.613***). We also see that the effect of similarity in grade point average is significant (10% level) only in the subset of dyads where at least one node is a migrant (0.135^{+}). As known from other social network studies, there is a positive effect of ‘same sex’ on ties in all three network dimensions (all: 0.457**). Similarity in the (negative) learning self-concept and in the number of books at home are not significant at the 5% level. There is a strong tendency towards mutuality in all three network dimensions. The effect of 2-in-stars as an indicator of prestige is negative in all three networks, but insignificant in dyads with at least one migrant, whereas 2-out-stars are generally insignificant. We find the expected effects of transitive (positive) and cyclic triads (negative) for all three models (Robins 2015).
The first approach of the causal analysis is the SAOM (step 1 in Fig. 3). In Table 2 the behaviour change variable in Models 1–2 is the grade point average in mathematics, German and English. The outcome in Models 3–4 is the level of wellbeing in school, ranging from 1 to 10 (see Table 5, “Appendix”). There is a significantly positive effect of having or establishing a reciprocal tie in the schoolwork network. Similarly, transitive triplets show a positive effect, indicating the common tendency towards triadic closure. Aside from that, there are no other significant effects in the network parts of the models. In the behavioural change equations (grades and wellbeing in school) we find negative quadratic shape effects in Models 1–2. Accordingly, there seems to be a significant decrease in average academic performance over time.
The most important information is the absence of effects of networks on behaviour in the dimensions ‘grades’ and ‘wellbeing in school’. If there were noteworthy social influence via grades or wellbeing, we would have found positive effects of ‘total alter’ and/or ‘average alter’, which is not the case. What would positive effects indicate? In the case of ‘total alter’ it would mean that ego adapts to the sum of the respective behaviour values of those alters with whom ego is connected; ‘average alter’ is the average of these values.
The number of networks used in the meta-analysis varies between k = 6 (N = 148 students) and k = 12 (N = 281 students), depending on the number of non-converging schoolwork networks, which were all of comparatively low density. Overall, the survey data include 21 school classes with data for 3 waves, so the loss of data is considerable, and might not be at random.
Table 3 shows the results of the propensity score matching analysis (step 2 in Fig. 3). Differences-in-differences (DiD) test whether the outcome in a dyad has increased between t_{1} and t_{2} due to schoolwork collaboration. The propensity score has been estimated from the probit p* model for schoolwork networks at t_{1} (see Table 1). In the lower panel of Table 3, conditional average treatment effects on the treated (CATT) are shown for dyads where at least one node is of migrant background, or where students learn in whole-day schools (see Table 1, model ‘at least one migrant’, and model ‘whole day’).
Results in Table 3 are again pessimistic about effects of schoolwork collaboration on both grades and wellbeing. Balancing the sample by matching leads to an insignificant ATT on grades (−0.029, t = −0.56). The DiD is insignificant as well, and small in magnitude (0.011, t = 0.35). Moreover, there even tends to be a negative effect of the DiD estimator of schoolwork ties on wellbeing (−0.277, t = −1.76 +). This general pattern does not change when we introduce calipers of 0.2 or 0.01. The CATTs are not significantly positive for any outcome; there is no evidence of a benefit from being involved in schoolwork dyads. In contrast, the effect of schoolwork collaboration even turns negative in whole-day schools when we introduce a caliper of 0.01 (−0.270, t = −2.97**).
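The logic of the DiD comparison can be made concrete with a short sketch: the mean change in the outcome among treated dyads is compared with the mean change among their untreated matches. The data and function names are hypothetical, and the sketch leaves out the matching step and the lagged dependent variable used in the analysis.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def did(treated, control):
    """Difference-in-differences of mean outcome changes.

    treated, control: lists of (outcome at t1, outcome at t2) per dyad.
    """
    change_treated = mean([y2 - y1 for y1, y2 in treated])
    change_control = mean([y2 - y1 for y1, y2 in control])
    return change_treated - change_control


# Hypothetical dyad-level wellbeing scores at t1 and t2.
treated = [(7.0, 7.5), (6.0, 6.5)]
control = [(7.0, 7.25), (6.0, 6.25)]
effect = did(treated, control)
print(effect)
```

Because each dyad is compared with its own earlier value, time-constant differences between treated and untreated dyads cancel out of the estimate.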
Objections against this analysis could be that (a) the estimation violates the SUTVA and (b), also untreated dyads will gain in competence (averaged between ego and alter) if at least one node is connected to a (successfully) treated dyad (see Fig. 2). If network information is available, it can be used to mitigate the consequences of the SUTVA violation when estimating the ATT (An 2018). The propensity score weighting regression (Guo and Fraser 2010: 161) allows controlling for the embeddedness of dyads into the surrounding network, which can be particularly important if the amount of bias reduction due to matching is insufficient. For instance, according to a common rule of thumb, a remaining residual bias after matching of 5–8% is acceptable [but recent studies have shown that the balancing should be better (Gangl 2014: 261)]. But if the bias is above a certain threshold, should researchers then abstain from a causal analysis?
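The residual bias referred to in such rules of thumb is usually a standardized percentage difference in covariate means between treated and matched untreated cases. The sketch below uses one common definition with a pooled standard deviation; this is an assumption, since software implementations differ in which sample the variances are taken from, and the covariate values are hypothetical.

```python
from statistics import mean, median, variance


def standardized_bias(treated, control):
    """Percent standardized difference in means for one covariate."""
    pooled_sd = ((variance(treated) + variance(control)) / 2) ** 0.5
    return 100 * (mean(treated) - mean(control)) / pooled_sd


# Hypothetical covariate values in the matched sample: (treated, control).
covariates = {
    "same_sex": ([1, 1, 0, 1], [1, 0, 0, 1]),
    "transitive_triads": ([2, 3, 1, 2], [2, 3, 1, 2]),
}
bias = {name: standardized_bias(t, c) for name, (t, c) in covariates.items()}

# Summary statistic across covariates, comparable to a reported median bias.
median_abs_bias = median(abs(b) for b in bias.values())
print(bias, median_abs_bias)
```

A covariate with a large remaining value is a candidate for inclusion as a control in the final weighted regression.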
Propensity score weighting regression can control for variables that still show considerably bias after matching. Table 7 (“Appendix”) shows the remaining bias estimated in a postmatching test where variables indicating network embeddedness have been excluded from the probit model (not shown in Table 1). Table 8 shows the remaining bias when also the network variables are included. In the first case, the remaining median bias is 6.7, in the second case 8.0, so the remaining bias after matching is at the upper limit. Instead of abstaining from the matching analysis, propensity score weighting regression (step 3 in Fig. 3) offers the opportunity to control the biasinducing variables in the final model. In order to take the network embeddedness into account in the final prediction of the ATT, effects of a dyad’s embeddedness in the friendship network, namely mutuality, 2instars, 2outstars, transitive triads and cyclic triads have been controlled. A further control variable is the number of friendship ties each student has with treated students (‘no. of treated friends’). Similar to the embeddedness effects, in combination with an interaction effect with the treatment status (‘no. of treated friends X treated’), the main effect controls the number of ‘treated’ friendships for those who did not receive treatment—which indicates the opportunities for the ‘diffusion’ of treatment information to nontreated cases. In addition, the analysis controls for whole day schools, the interaction ‘treatment X whole day schools’, number of books at home of ego and alter, migration background, negative learning concept of ego and alter, spatial proximity (ego lives close to either), as well as ego for ‘maternal control of leisure time’ and ‘mother helps with schoolwork if needed’ (see Table 5, “Appendix” for the coding).
Table 4 shows three propensity score weighting regression models each for grades and school-related wellbeing. The first model of each set (Models 1 and 4) predicts the outcome at t_{2}. Each second model (Models 2 and 5) contains a lagged dependent variable, and each third model (Models 3 and 6) is based on a DiD estimator. The effects of the individual control variables are interesting in themselves but are not interpreted here. All models show insignificant effects of the treatment, and in 5 out of 6 models the insignificant effect is even negative. Accordingly, step 3 of estimating a causal effect of ties in schoolwork networks also does not show any significant effect, neither on grades nor on wellbeing.
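The three specifications can be written compactly. The notation is an assumption introduced here for illustration: Y_{ij,t} is the dyadic outcome (mean grade or wellbeing of ego i and alter j) at wave t, D_{ij} the treatment indicator, X_{ij} the vector of control variables, and \delta the treatment effect, with each equation estimated by weighted least squares using the propensity score weights. The change-score form of the third equation is one common two-wave formulation of a DiD estimator and may differ from the exact specification behind Table 4.

```latex
\begin{align}
\text{Models 1, 4:}\quad & Y_{ij,t_2} = \alpha + \delta D_{ij} + X_{ij}'\beta + \varepsilon_{ij} \\
\text{Models 2, 5:}\quad & Y_{ij,t_2} = \alpha + \delta D_{ij} + \rho\, Y_{ij,t_1} + X_{ij}'\beta + \varepsilon_{ij} \\
\text{Models 3, 6:}\quad & Y_{ij,t_2} - Y_{ij,t_1} = \alpha + \delta D_{ij} + X_{ij}'\beta + \varepsilon_{ij}
\end{align}
```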
Discussion
The research design in the present study differs from the studies discussed above: while e.g. Aral et al. (2009) use nodes (individuals) as units of observation, dyads have been analysed in the matching analyses of the present study. If actor A in a treated dyad improves his or her grade, this automatically affects untreated dyads that include A as well. In the present study, the dyadic characteristic of mean grades or wellbeing could be disaggregated into nodal characteristics, which is impossible for trade flows between countries (Arpino et al. 2017). In their study, edge-attributes (trade flows) are the outcome of interest, whereas here it is dyadic node-attributes, namely both actors' grades and wellbeing. Node-attributes of ego and alter are often of primary interest, e.g. in research on social capital, cooperation and the consequences of network embeddedness. Providing appropriate methods for such research was one motive for developing models for the coevolution of network and behaviour in the SAOM.
It is controversial whether the SAOM of coevolution actually is a causal model. Ragan et al. (2019) have shown that SAOM results are similar to those of conventional methods of panel analysis. In econometrics, panel analysis is an approach to causal analysis when only observational, non-experimental data are available (Brüderl and Ludwig 2014). In this respect, SAOMs are also an approach to causal analysis, even though causal interpretations will always remain open to easy criticism if the study design is non-experimental. Alternatively, PSM is useful in limited data situations where non-convergence of SAOMs would result in the loss of many networks, e.g. when networks are sparse. PSM rests on the SUTVA, whose violation becomes obvious when researchers collect information on social networks. Strictly speaking, the problem of the SUTVA is relevant to any kind of causal inference using school-class data or otherwise clustered data where networks exist but network information was ignored in the data collection. Researchers usually address statistical non-independence of observations by applying multilevel models, but these models do not appropriately account for the diffusion of information within clusters, e.g. in subnetworks.
Please note that the approach suggested in this paper is rather conservative because untreated dyads in a network consist of nodes that are also involved in other dyads, which can be treated (see Figs. 1, 2). Hence, if there were a treatment effect due to schoolwork collaboration, untreated dyads would also benefit, which reduces the estimated effect, even though propensity score weighting regression is able to control for many confounders.
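A toy example illustrates why the dyadic comparison is conservative. The sketch assumes undirected dyads, a fixed gain for every node that belongs to at least one treated dyad, and a dyadic outcome equal to the average gain of its two nodes; the class of four students and the function name `dyad_gap` are hypothetical simplifications, not the study's data or model.

```python
from itertools import combinations
from statistics import mean


def dyad_gap(nodes, treated_dyads, gain=1.0):
    """Mean outcome gap between treated and untreated dyads.

    A node gains `gain` if it belongs to at least one treated dyad; a dyad's
    outcome is the average gain of its two nodes.
    """
    treated = {frozenset(d) for d in treated_dyads}
    node_gain = {n: gain if any(n in d for d in treated) else 0.0 for n in nodes}
    outcome = {frozenset(p): (node_gain[p[0]] + node_gain[p[1]]) / 2
               for p in combinations(nodes, 2)}
    treated_mean = mean([v for d, v in outcome.items() if d in treated])
    untreated_mean = mean([v for d, v in outcome.items() if d not in treated])
    return treated_mean - untreated_mean


# Hypothetical class of four students with one collaborating dyad (A, B).
print(dyad_gap(["A", "B", "C", "D"], [("A", "B")]))
```

In this example four of the five untreated dyads contain a treated node and therefore carry half of the gain, so the observed gap between treated and untreated dyads is 0.6 rather than the full node-level gain of 1.0.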
Conclusion
The SAOM for the coevolution of schoolwork networks and grades/wellbeing did not show significant effects on the outcomes. An obvious limitation of the SAOM is the considerable loss of data, particularly when there are good reasons to assume that the selection is informative with regard to y and x.
Matching methods do not require strong assumptions, except for the SUTVA. Any school- or organization-related causal research should be aware of this assumption and its implications. Statistical non-independence becomes obvious in social network data when relational information is available: pupils interact with each other in friendships, during leisure time, in schoolwork and through other kinds of social ties. Since causal effects of dyadic collaboration embedded in networks pose a challenge for matching methods (Figs. 1, 2), propensity score weighting regression offers a way to control for potential spillover effects.
Overall, SAOMs, propensity score matching and weighting analysis did not show any substantial average treatment effect on the treated. A problem of the present study is the low density of the schoolwork collaboration networks. Effects might be insignificant simply because there is not enough information available in these networks. However, if schoolwork collaboration networks really had a considerable effect on the outcomes, it should become apparent in the statistical models based on around 300 treated dyads (Table 3).
The present study suggested applying propensity score weighting regression, which allows for the statistical control of dyadic embeddedness in the wider network structure, such as transitivity and the number of network ties to subjects from the treatment group, which indicates the opportunity structure for ‘diffusion’ of treatment information via friendship ties. Moreover, the model can account for lagged dependent variables and differences-in-differences in longitudinal studies.
The SUTVA problem affects most studies using school surveys. A practical implication of the present paper is that when researchers conduct surveys in (almost) complete school classes and are interested in causal inference, they should consider collecting network data in order to apply propensity score weighting regression. The present study may stimulate the collection of network data in future studies on organisational contexts, such as schools, even if networks are not of primary interest. Propensity score weighting regression provides a method of causal inference for observational data when the SUTVA is violated.
Future research should consider simulation studies that assess how strongly violations of the SUTVA affect results. Moreover, researchers should elaborate more systematically the consequences of different research settings for causal inference. Regarding the development of causal methods for network data, propensity score weighting regression can be a fruitful contribution, in particular when data limitations or sparse networks impede the convergence of SAOMs or induce severe bias in these models due to systematic loss of data (VanderWeele and An 2014: 368).
References
An, W.: Causal inference with networked treatment diffusion. Sociol. Methodol. 48(1), 152–181 (2018)
An, W., VanderWeele, T.J.: Opening the blackbox of treatment interference. Tracing treatment diffusion through network analysis. Sociol. Methods Res. 62, 004912411985238 (2019). https://doi.org/10.1177/0049124119852384
Aral, S., Nicolaides, Ch: Exercise contagion in a global social network. Nat. Commun. 8, 1–8 (2017). https://doi.org/10.1038/ncomms14753
Aral, S., Muchnik, L., Sundararajan, A.: Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks. Proc. Natl. Acad. Sci. U.S.A. 106(51), 21544–21549 (2009)
Arpino, B., Benedictis, L., Mattei, A.: Implementing propensity score matching with network data. The effect of the general agreement on tariffs and trade on bilateral trade. J. R. Stat. Soc. C 66, 537–554 (2017)
Barringer, S.N., Eliason, S.R., Leahey, E.: A history of causal analysis in the social sciences. In: Morgan, S.L. (ed.) Handbook of causal analysis for social research, pp. 9–26. Springer, Dordrecht (2014)
Brüderl, J., Ludwig, V.: Fixed-effects panel regression. In: Best, H., Wolf, C. (eds.) The SAGE Handbook of Regression Analysis and Causal Inference, pp. 327–358. Sage, London (2014)
Christakis, N.A., Fowler, J.H.: The spread of obesity in a large social network over 32 years. New Engl. J. Med. 357(4), 370–379 (2007)
Crosnoe, R.: Friendships in childhood and adolescence: the life course and new directions. Soc. Psychol. Q. 63(4), 377–391 (2000)
Gangl, M.: Causal inference in sociological research. Ann. Rev. Sociol. 36, 21–47 (2010)
Gangl, M.: Matching estimators for treatment effects. In: Best, H., Wolf, C. (eds.) The SAGE Handbook of Regression Analysis and Causal Inference, pp. 251–276. Sage, London (2014)
Gremmen, M.C., Dijkstra, J.K., Steglich, Ch, Veenstra, R.: First selection, then influence. Developmental differences in friendship dynamics regarding academic achievement. Dev. Psychol. 53(7), 1356–1370 (2017)
Guo, Sh, Fraser, M.W.: Propensity score analysis. Statistical methods and applications. Sage, Los Angeles (2010)
Harris, J.K.: An introduction to exponential random graph modeling. Sage, London (2014)
Henrich, J.P.: The secret of our success. How culture is driving human evolution, domesticating our species, and making us smarter. University Press, Princeton (2016)
Howe, Ch, Tolmie, A.: Group work in primary school science: discussion, consensus and guidance from experts. Int. J. Educ. Res. 39, 51–72 (2003)
Ida, R., Saud, M., Mashud, M.: An empirical analysis of social media usage, political learning and participation among youth: A comparative study of Indonesia and Pakistan. Qual. Quant. (2020). https://doi.org/10.1007/s11135-020-00985-9
Ivaniushina, V., Lushin, V., Alexandrov, D.: Academic help seeking among Russian minority and nonminority adolescents: A social capital outlook. Learn. Individ. Differ. 50, 283–290 (2016)
Lubbers, M.: The social fabric of the classroom. Peer relations in secondary education. Universal Press, Veenendaal (2004)
Lusher, D., Koskinen, J., Robins, G. (eds.): Exponential random graph models for social networks. Theories, methods, and applications. University Press, Cambridge (2013)
Morgan, S., Winship, Ch: Counterfactuals and causal inference. Methods and principles for social research. Univ. Press, Cambridge (2007)
Prell, C.: Social network analysis. History, theory & methodology. Sage, Los Angeles (2012)
Ragan, D.T., Osgood, D.W., Ramirez, N.G., Moody, J., Gest, S.D.: A comparison of peer influence estimates from SIENA stochastic actor–based models and from conventional regression approaches. Sociol. Methods Res. (2019). https://doi.org/10.1177/0049124119852369
Robins, G.: Doing social network research. Network-based research design for social scientists. Sage, Los Angeles (2015)
Rogers, E.M.: Diffusion of innovations. Free Press, New York (2003)
Saud, M.: Social networks and social ties: Changing trends of political participation among youth in PunjabPakistan. J. Adv. Human. Soc. Sci. 4(5), 214–221 (2018)
Shalizi, C.R., Thomas, A.C.: Homophily and contagion are generically confounded in observational social network studies. Sociol. Methods Res. 40(2), 211–239 (2011)
Snijders, T.A., van de Bunt, G., Steglich, C.: Introduction to stochastic actor-based models for network dynamics. Soc. Netw. 32, 44–60 (2010)
Valente, ThW: Network models of the diffusion of innovations. Hampton Press, Cresskill (1995)
VanderWeele, T.J., An, W.: Social networks and causal inference. In: Morgan, S.L. (ed.) Handbook of causal analysis for social research, pp. 353–374. Springer, Dordrecht (2014)
Webb, N.M.: Peer interaction and learning in small groups. Int. J. Educ. Res. 13, 21–39 (1989)
Windzio, M.: Immigrant children's access to social capital in schoolclass networks. In: Windzio, M. (ed.) Integration and Inequality in Educational Institutions, pp. 191–228. Springer, Dordrecht (2013)
Windzio, M.: Immigrant children and their parents: Is there an intergenerational interdependence of integration into social networks? Soc. Netw. 40(1), 197–206 (2015)
Windzio, M.: Social exchange and integration into visits-at-home networks: Effects of third-party intervention and residential segregation on boundary-crossing. Ration. Soc. 30(4), 491–513 (2018)
Acknowledgements
Open Access funding provided by Projekt DEAL. Thanks to the reviewers of this journal and participants of the Applied Stats Workshop (gov 3009) at the IQSS at Harvard University (Spring 2018).
Funding
Part of this research has been funded by the Deutsche Forschungsgemeinschaft (DFG), Grant No.: WI 3423/11, and Grant No.: 374666841–SFB 1342.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Windzio, M. Causal inference in collaboration networks using propensity score methods. Qual Quant 55, 295–313 (2021). https://doi.org/10.1007/s11135-020-01005-6
Keywords
 Social networks
 Causal inference
 Stochastic actor based model (SAOM)
 Propensity score matching
 Schoolwork collaboration