# Analyzing individual growth with clustered longitudinal data: A comparison between model-based and design-based multilevel approaches

## Abstract

To prevent biased estimates of intraindividual growth and interindividual variability when working with clustered longitudinal data (e.g., repeated measures nested within students; students nested within schools), individual dependency should be considered. A Monte Carlo study was conducted to examine to what extent two model-based approaches (multilevel latent growth curve model – MLGCM, and maximum model – MM) and one design-based approach (design-based latent growth curve model – D-LGCM) could produce unbiased and efficient parameter estimates of intraindividual growth and interindividual variability given clustered longitudinal data. The solutions of a single-level latent growth curve model (SLGCM) were also provided to demonstrate the consequences of ignoring individual dependency. Design factors considered in the present simulation study were as follows: number of clusters (NC = 10, 30, 50, 100, 150, 200, and 500) and cluster size (CS = 5, 10, and 20). According to our results, when intraindividual growth is of interest, researchers are free to implement MLGCM, MM, or D-LGCM. With regard to interindividual variability, MLGCM and MM were capable of producing accurate parameter estimates and *SE*s. However, when D-LGCM and SLGCM were applied, parameter estimates of interindividual variability were not comprised exclusively of the variability in individual (e.g., students) growth but instead were the combined variability of individual and cluster (e.g., school) growth, which cannot be interpreted. The take-home message is that D-LGCM does not qualify as an alternative approach to analyzing clustered longitudinal data if interindividual variability is of interest.

## Keywords

Clustered longitudinal data; Design-based approach; Model-based approach; Multilevel latent growth curve model

Longitudinal research designs allow for the opportunity to investigate individual development in education and psychology. Particularly, measuring the same individual repeatedly permits researchers to describe how and when attributes of the individual change over time (i.e., intraindividual growth) and whether different individuals change in different ways (i.e., interindividual variability; Curran, Obeidat, & Losardo, 2010; Duncan, Duncan, & Strycker, 2006; Grimm, Ram, & Estabrook, 2016). This description of intraindividual growth outlines the overall individual growth pattern (e.g., growth in mathematics achievement), which is the mean trajectory pooled across all the individuals within the sample (Curran et al., 2010). On the other hand, interindividual variability reveals the dissimilarity of the individual growth trajectories (e.g., students differ in their growth in mathematics achievement). Accurate estimation of interindividual variability is extremely important because it provides empirical support for the need to further study explanatory determinants (Duncan et al., 2006; Schaie, 1983; Singer & Willett, 2003) and elicits research questions regarding the individual-related factors accounting for such variability (e.g., what is the role of student-perceived mathematical self-efficacy in the growth of mathematics achievement; Bandura, 1993). Statistically speaking, obtaining accurate estimates of intraindividual growth and interindividual variability is essential to understanding development, and thus should be a priority topic for researchers in developmental psychology. Although several studies have been devoted to this topic by examining some influential factors such as sample size, number of time points, and measurement error (e.g., Diallo, Morin, & Parker, 2014; Hertzog, Oertzen, & Ghisletta, 2008; B. O. Muthén & Curran, 1997; Zhang & Wang, 2009), we found little attention has been paid to the methodological issue of how to obtain more accurate estimates of intraindividual growth and interindividual variability within the context of longitudinal data in a hierarchical setting, especially with large-scale studies. Yet there are certainly multiple large-scale data sets available using such a sampling structure. For example, the Longitudinal Study of American Youth (LSAY; Miller, Kimmel, Hoffer, & Nelson, 2000), a national panel study of mathematics and science education in US public schools, adopted a two-stage stratified probability sampling approach – representative schools were randomly selected and students within the selected school were then randomly sampled. This sampling approach produced *clustered longitudinal data* at three levels: cluster (e.g., school), individual (e.g., student), and repeated measures (e.g., number of time points).

Conceptually, individuals within the same cluster are likely to be more similar to each other (Hox, 2010; Snijders & Bosker, 2012), and, analytically, such individual dependency should be considered to prevent biased estimates of intraindividual growth and interindividual variability (B. O. Muthén, 1997). Ignoring individual dependency can lead to seriously biased parameter estimates and standard errors (*SE*s), a consequence that has been widely discussed within the context of various multilevel models including measurement models (e.g., Julian, 2001; B. O. Muthén & Satorra, 1995; Pornprasertmanit, Lee, & Preacher, 2014; Wu & Kwok, 2012), regression models (e.g., Lai & Kwok, 2015; Moerbeek, 2004; B. O. Muthén & Satorra, 1995), cross-classified models (e.g., Luo & Kwok, 2009; Meyers & Beretvas, 2006), growth mixture models (Chen, Kwok, Luo, & Willson, 2010), and latent growth curve models (e.g., B. O. Muthén, 1997; Wu, Kwok, & Willson, 2015). Although researchers have been well informed by the aforementioned studies, little is known about whether commonly used model-based and design-based analytical approaches, which can accommodate individual dependency in data, can produce unbiased parameter estimates of intraindividual growth and interindividual variability given clustered longitudinal data. The current study aimed to close this literature gap by conducting a Monte Carlo study.

## Model-based and design-based multilevel approaches

The *model-based approach* specifies level-specific models on the basis of the data structure (Heck & Thomas, 2009; B. O. Muthén & Asparouhov, 2011). Within the framework of multilevel structural equation modeling (MSEM), the time dimension can be converted into a multivariate vector and, therefore, the three-level clustered longitudinal data can be analyzed with a two-level model (i.e., Multilevel Latent Growth Curve Model – MLGCM; Bovaird, 2007; Heck & Thomas, 2009; B. O. Muthén & Asparouhov, 2009), where the intraindividual growth and interindividual variability are modeled in the *within-level model* and the cluster-level growth is modeled in the *between-level model* (see, for example, Duncan et al., 1997). If researchers are interested in only the individual-level growth trajectory (e.g., psychological virtues during adolescence; Ferragut, Blanca, & Ortiz-Tallo, 2014), then the parameter estimates in the within-level model of MLGCM (see Fig. 1) can theoretically fulfill researchers’ needs.

*SE*s in the within-level model (Pornprasertmanit et al., 2014; Wu & Kwok, 2012). Moreover, Wu and Kwok (2012) suggested that MM can be used as a substitute for MLGCM, especially when sample sizes are small (e.g., < 50). However, to the best of our knowledge, no studies have been conducted to investigate whether MM demonstrates promise in analyzing individual growth with clustered longitudinal data. In the present study, we hypothesized that MM could generate accurate parameter estimates and *SE*s of intraindividual growth and interindividual variability, similar to those derived by MLGCM.

The *design-based approach* takes into account clustering in longitudinal data by adjusting the *SE*s of parameter estimates (i.e., robust *SE*s) using a sandwich estimator based on the sampling design and, therefore, it produces accurate statistical inferences (Heeringa, West, & Berglund, 2010; L. K. Muthén & Muthén, 1998-2015; Stapleton, 2006, 2008). In M*plus*, a design-based latent growth curve model (D-LGCM) can be implemented by specifying a model exactly like the within-level model of MLGCM presented in Fig. 1 and using the command “TYPE = COMPLEX” to adjust *SE*s of parameter estimates in the model (L. K. Muthén & Muthén, 1998-2015). Similar to MM, D-LGCM might be favored if the primary interest is the individual-level growth trajectory.

The performance of D-LGCM has been examined by Wu et al. (2015) in terms of the accuracy of the regression coefficients of between-level and within-level predictors. More specifically, Wu and colleagues generated a simulated dataset based on a conditional MLGCM – one predictor at the between-level and within-level, respectively. Their simulation results suggested the D-LGCM, including predictors from both levels, could produce unbiased regression coefficients and *SE*s. However, whether a D-LGCM can produce accurate intraindividual growth and interindividual variability has not been well investigated.

In summary, researchers need to be aware that models ignoring individual dependency should not be used for analyzing clustered longitudinal data. Researchers who are primarily interested in the individual-level growth trajectory have two model-based approaches (MLGCM and MM) and one design-based approach (D-LGCM) as options to analyze clustered longitudinal data. Until now, the viability of MM as an alternative for analyzing clustered longitudinal data has not been well studied. Moreover, although D-LGCM is theoretically expected to provide robust *SE*s for statistical inferences in the individual-level model, the extent to which D-LGCM can produce unbiased parameter estimates and *SE*s of intraindividual growth and interindividual variability is still unknown.

The purpose of the present study was to help close this gap in the literature by examining the effectiveness of two model-based approaches (MLGCM and MM) and one design-based approach (D-LGCM), which have the potential to accommodate individual dependency in clustered longitudinal data. Simulated data (1,000 replications) were generated based on the findings of an empirical study of the mathematics literacy trajectory derived from clustered longitudinal data (Baumert, Nagy, & Lehmann, 2012). We analyzed the simulated data with four approaches – MLGCM, MM, D-LGCM, and a model failing to accommodate individual dependency (Single-level Latent Growth Curve Model – SLGCM). The model solutions of SLGCM presented the consequences of ignoring individual dependency when clustered longitudinal data were inappropriately analyzed. When generating the simulated data, we manipulated two design factors, namely, number of clusters and cluster size, to evaluate the impact of sample size at different levels on the performance of the four approaches. Parameter estimates and *SE*s produced by MLGCM were expected to be less biased. The effectiveness of MM, D-LGCM, and SLGCM was examined in terms of unbiasedness and efficiency of parameter estimates in the marginal mean structure (i.e., intraindividual growth) and covariance structure (i.e., interindividual variability).

## Method

A Monte Carlo study was conducted using M*plus* 7.31 (L. K. Muthén & Muthén, 1998-2015) to evaluate the performance of two model-based approaches (MLGCM and MM), a design-based approach (D-LGCM), and a model failing to accommodate individual dependency (SLGCM). The MLGCM presented in Fig. 1 was used to generate simulated clustered longitudinal data. To increase the generalizability of our findings, parameter settings were adopted from findings in an empirical study (Baumert et al., 2012), which investigated fourth graders’ growth trajectory of mathematics literacy scores. Two design factors were considered in our simulation: number of clusters (NC) and cluster size (CS). Simulated data were then analyzed by MLGCM, MM, D-LGCM, and SLGCM. M*plus* syntax for the present simulation study is provided in Table 5 of the Appendix.

## Population model

According to Kwok, West, and Green’s (2007) review of longitudinal studies published in the journal *Developmental Psychology* in 2002, the average number of measurement time points was 4. Therefore, a population MLGCM model with four measurement time points was used to create our simulated data. Figure 1 presents the population MLGCM model with parameter settings adopted from Baumert et al.’s (2012) study. The repeated measures were denoted as Y1 – Y4. The intercepts of the repeated measures were fixed at zero. The linear growth pattern was modeled in the within- and between-level models. The factor loadings of the intercept factors (IW and IB) were fixed at 1.0 and those of the linear slope factors (SW and SB) were set as 0, 1, 2, and 3 as part of the growth model parameterization. The parameter settings of the marginal mean structure and covariance structure in the within-level model are presented in the following matrices. In the marginal mean structure matrix (*α* _{ W }), the means of IW and SW are fixed to zero, and referred to as dummy zero means by Muthén (1997). Note the means of IW and SW are estimated in the between-level model (B. O. Muthén, 1997).

In the covariance structure matrix (*Φ* _{ W }), the diagonals are the variances of IW (113.30; *π* _{00}) and SW (8.77; *π* _{11}), where *π* _{00} denotes the interindividual variability of initial mathematics literacy scores and *π* _{11} signifies the interindividual variability of the linear trajectories. The off-diagonals in the *Φ* _{ W } are the covariance between IW and SW (*π* _{10} = 0.315; can be converted into a correlation equal to .01), which suggests that the relationship between individual initial scores and growth trajectories was trivial. The error variances of the outcome variables were constrained to be equal over time and set to 40.57. In other words, the population model showed students’ initial mathematics literacy scores (*π* _{00}) and linear growth in mathematics literacy (*π* _{11}) varied across individuals but students’ initial performance was only trivially related to later growth (small *π* _{10}).

\( {\alpha}_W=\left[\begin{array}{c}0\\ 0\end{array}\right],\quad {\varPhi}_W=\left[\begin{array}{ll}113.30\ \left({\pi}_{00}\right) & 0.315\\ 0.315\ \left({\pi}_{10}\right) & 8.77\ \left({\pi}_{11}\right)\end{array}\right] \)
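The conversion from the within-level covariance to the quoted correlation of about .01 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `cov_to_corr` and the constant names are assumptions introduced here, and the numbers are the population values stated in the text.

```python
import math


def cov_to_corr(cov, var_a, var_b):
    """Convert a covariance into a correlation given the two variances."""
    return cov / math.sqrt(var_a * var_b)


# Within-level population values quoted in the text.
PI_00 = 113.30  # variance of the within-level intercept factor (IW)
PI_11 = 8.77    # variance of the within-level slope factor (SW)
PI_10 = 0.315   # covariance between IW and SW

r_within = cov_to_corr(PI_10, PI_00, PI_11)
print(round(r_within, 2))  # 0.01
```

The same function applies to the between-level matrix once its variances and covariance are substituted.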

The structure of the between-level model was identical to that of the within-level model in our simulation. The parameter settings of the marginal mean structure and covariance structure in between-level models are presented in the matrices *α* _{ B } and *Φ* _{ B }, respectively. The means of IB (*α* _{1}) and SB (*α* _{2}) are set to 96.28 and 9.94, respectively, where *α* _{1} denotes average initial mathematics literacy score and *α* _{2} signifies the average linear trajectory of mathematics literacy scores.

In the covariance structure matrix (*Φ* _{ B }), the diagonals are the variances of IB (28.86; *γ* _{00}) and SB (2.80; *γ* _{11}), where *γ* _{00} denotes the variability of initial mathematics literacy scores across cluster units (e.g., school) and *γ* _{11} signifies the variability of the linear trajectory across cluster units. The off-diagonals in the *Φ* _{ B } are the covariance between IB and SB (*γ* _{10} = 2.607; correlation = .29), which suggests that the cluster-level initial mathematics literacy scores were positively related to cluster-level linear trajectories. The error variances of the outcome variables were set to 0.

\( {\alpha}_B=\left[\begin{array}{c}96.28\ \left({\alpha}_1\right)\\ 9.94\ \left({\alpha}_2\right)\end{array}\right],\quad {\varPhi}_B=\left[\begin{array}{ll}28.86\ \left({\gamma}_{00}\right) & 2.607\\ 2.607\ \left({\gamma}_{10}\right) & 2.80\ \left({\gamma}_{11}\right)\end{array}\right] \)
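To make the mean structure concrete, the sketch below computes the model-implied average score at each of the four time points from the intercept mean, the slope mean, and the fixed slope loadings (0, 1, 2, 3). It is a minimal illustration under the linear growth parameterization described above. The function and constant names are assumptions introduced here and are not part of the original M*plus* setup.

```python
ALPHA_1 = 96.28  # mean of the between-level intercept factor (IB)
ALPHA_2 = 9.94   # mean of the between-level slope factor (SB)
SLOPE_LOADINGS = [0, 1, 2, 3]  # fixed loadings of the linear slope factor


def implied_means(intercept_mean, slope_mean, loadings):
    """Model-implied mean of each repeated measure under linear growth."""
    return [intercept_mean + slope_mean * t for t in loadings]


means = implied_means(ALPHA_1, ALPHA_2, SLOPE_LOADINGS)
print([round(m, 2) for m in means])  # [96.28, 106.22, 116.16, 126.1]
```

Because the intercept loadings are all 1.0 and the first slope loading is 0, the implied mean of Y1 equals the intercept factor mean.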

Note the current study intended to evaluate to what extent three multilevel approaches were able to produce unbiased estimates of intraindividual growth (*α* _{1} and *α* _{2}) and interindividual variability (*π* _{00}, *π* _{11}, and *π* _{10}) and the corresponding *SE*s. Note *π* _{00} denotes the interindividual variability of individual initial scores, *π* _{11} signifies the interindividual variability of the linear trajectories, and *π* _{10} is the relationship between individual initial scores and growth trajectories. The parameter settings of *γ* _{00}, *γ* _{11}, and *γ* _{10} were required for simulated data generation but were not the focus of the study. Given the parameter settings listed above, the intraclass correlations for Y1, Y2, Y3, and Y4 ranged from 0.231 to 0.299, which indicated non-ignorable individual dependency (Hox, 2010; B. O. Muthén & Satorra, 1995) in the simulated data.
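The data-generating structure described above can be sketched in plain Python. This is an illustrative approximation, not the authors' M*plus* syntax (which is given in their Appendix): it assumes multivariate normal growth factors and normal residuals, and the helper names `draw_bivariate_normal` and `simulate` are hypothetical. Each cluster draws its own intercept and slope around (*α* _{1}, *α* _{2}), each individual adds a within-level deviation with mean zero, and residuals with variance 40.57 are added at the within level only.

```python
import math
import random

LOADINGS = [0, 1, 2, 3]            # slope loadings for Y1..Y4
ALPHA_B = (96.28, 9.94)            # means of IB and SB
PHI_B = (28.86, 2.607, 2.80)       # var(IB), cov(IB, SB), var(SB)
PHI_W = (113.30, 0.315, 8.77)      # var(IW), cov(IW, SW), var(SW)
ERROR_VAR = 40.57                  # within-level residual variance


def draw_bivariate_normal(rng, mean, phi):
    """Draw (intercept, slope) using a hand-written 2x2 Cholesky factor."""
    var_i, cov_is, var_s = phi
    l11 = math.sqrt(var_i)
    l21 = cov_is / l11
    l22 = math.sqrt(var_s - l21 ** 2)
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    return mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2


def simulate(n_clusters, cluster_size, seed=0):
    """Return a list of (cluster_id, [Y1, Y2, Y3, Y4]) records."""
    rng = random.Random(seed)
    error_sd = math.sqrt(ERROR_VAR)
    rows = []
    for cluster in range(n_clusters):
        ib, sb = draw_bivariate_normal(rng, ALPHA_B, PHI_B)
        for _ in range(cluster_size):
            iw, sw = draw_bivariate_normal(rng, (0.0, 0.0), PHI_W)
            y = [ib + iw + (sb + sw) * t + rng.gauss(0, error_sd)
                 for t in LOADINGS]
            rows.append((cluster, y))
    return rows


data = simulate(n_clusters=30, cluster_size=5)
print(len(data), len(data[0][1]))  # 150 4
```

In this sketch the individual-level intercept is the sum of a cluster component and an individual component, so an analysis that ignores the cluster level would see the two sources of variability mixed together.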

## Design factors

Design factors considered in the present simulation study were as follows: number of clusters (NC = 10, 30, 50, 100, 150, 200, and 500) and cluster size (CS = 5, 10, and 20). An extremely small NC (10) was specifically considered to evaluate the boundaries of performance of MLGCM, MM, and D-LGCM. On the other hand, we also considered a large NC condition (500) because some current large-scale longitudinal studies have collected data from nationally representative samples nested within a very large number of clusters. For example, in the base year of the Educational Longitudinal Study of 2002 (ELS: 2002), 15,362 students nested within 752 schools were interviewed in 2002 and followed up over the following 10 years (Ingels, Pratt, Rogers, Siegel, & Stutts, 2005). Therefore, including a large NC condition in our simulation design can increase the generalizability of our findings to large-scale clustered longitudinal data analyses. Moreover, because CS was less likely to create biased estimates and *SE*s in MSEM (Hox & Maas, 2001), we adopted small, medium, and large values of CS, which were commonly found in substantive research.

There were a total of 7 (NC) × 3 (CS) = 21 conditions. For each condition, replications with convergence problems or improper solutions (e.g., negative unique variances) were excluded until at least 1,000 replications were generated. To compute the empirical power of each key parameter value in the population model, a Monte Carlo approach (L. K. Muthén & Muthén, 2002) was applied (see Table 6 of the Appendix). The covariance between IW and SW at the within-level (*π* _{10}) exhibited minimal power (≤ .106) across all simulation conditions, whereas the covariance between IB and SB at the between-level (*γ* _{10}) exhibited power larger than .80 when (a) NC = 200 and CS = 20 or (b) NC = 500, regardless of CS. Other parameters had reasonable power across all simulation conditions, except for the NC = 30 and CS = 5 condition. Each set of replicates was analyzed with four approaches – MLGCM, MM, D-LGCM, and SLGCM.
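The fully crossed design can be enumerated directly, which also makes the total sample size of each condition explicit. The sketch below only restates the design factors listed above. The dictionary keys and variable names are illustrative assumptions.

```python
from itertools import product

NC_LEVELS = [10, 30, 50, 100, 150, 200, 500]  # number of clusters
CS_LEVELS = [5, 10, 20]                       # individuals per cluster

# One record per design cell, with the implied total sample size N.
conditions = [
    {"NC": nc, "CS": cs, "N": nc * cs}
    for nc, cs in product(NC_LEVELS, CS_LEVELS)
]

print(len(conditions))                     # 21
print(min(c["N"] for c in conditions))     # 50
print(max(c["N"] for c in conditions))     # 10000
```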

## Relative biases

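Because the parameter estimates and *SE*s in the current study were not in standardized form, relative biases (Hoogland & Boomsma, 1998) were computed to evaluate the accuracy of model solutions derived through MLGCM, MM, D-LGCM, and SLGCM. The formulas for computing relative biases for parameter estimates (*θ*) and estimated *SE*s are presented as follows:

\( \mathrm{Relative\ bias}\left(\theta \right)=\frac{{\overline{\theta}}_{est}-{\theta}_{true}}{\theta_{true}},\quad \mathrm{Relative\ bias}\left(SE\right)=\frac{{\overline{SE}}_{\theta_{est}}-{\sigma}_{\theta_{est}}}{\sigma_{\theta_{est}}} \)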

where \( {\overline{\theta}}_{est} \) is the average of parameter estimates in each condition, *θ* _{ true } is the population value of the parameter, \( {\overline{SE}}_{\theta_{est}} \) is the average *SE* of a parameter estimate, and \( {\sigma}_{\theta_{est}} \) is the standard deviation of the parameter estimate across successful replications. Note that relative bias is an average measure of accuracy for each parameter estimate (*θ*) and estimated *SE*. According to Hoogland and Boomsma (1998), relative bias with an absolute value larger than 0.05 is considered unacceptable. In the current study, an estimate with an absolute value of relative bias less than 0.01 was deemed fairly accurate. If needed, factorial ANOVAs were conducted to determine the contributions of the design factors to estimates of parameters and *SEs*. The total sum of squares provided the variability of the parameter and *SE* estimates across all successful replications while eta-squared (*η* ^{2}) indicated the proportion of the variance accounted for by a particular design factor or the interaction effect term. Notably, *η* ^{2} was obtained by dividing the Type III sum of squares of a particular predictor or the interaction effect by the corrected total sum of squares.
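The two relative bias measures defined above can be sketched as follows. The replicate values in the example are made up for illustration and are not results from the study. The function names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) for the empirical spread of the estimates is an assumption, since the text does not state which denominator was used.

```python
import statistics


def relative_bias_estimate(estimates, true_value):
    """(mean estimate - population value) / population value."""
    return (statistics.mean(estimates) - true_value) / true_value


def relative_bias_se(standard_errors, estimates):
    """(mean SE - empirical SD of the estimates) / empirical SD."""
    empirical_sd = statistics.stdev(estimates)  # assumption: n - 1 denominator
    return (statistics.mean(standard_errors) - empirical_sd) / empirical_sd


def is_unacceptable(relative_bias, cutoff=0.05):
    """Apply the 0.05 cutoff attributed to Hoogland and Boomsma (1998)."""
    return abs(relative_bias) > cutoff


# Hypothetical replicate-level output for a slope mean with true value 9.94.
slope_estimates = [9.90, 9.98, 9.94, 9.92, 9.96]
slope_ses = [0.030, 0.031, 0.032, 0.030, 0.032]

rb_est = relative_bias_estimate(slope_estimates, 9.94)
rb_se = relative_bias_se(slope_ses, slope_estimates)
print(is_unacceptable(rb_est), is_unacceptable(rb_se))  # False False
```

A negative value from `relative_bias_se` means the average estimated *SE* is smaller than the empirical variability of the estimates, which is the downward bias pattern that arises when clustering is ignored.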

## Results

### Convergence rate/improper solution of simulations

Results suggested that when sample size at the between-level was too small (NC = 10), applying MLGCM or MM could lead to serious convergence problems (convergence rate close to 0) and improper solutions (e.g., negative unique variances), regardless of varying CS conditions. On the other hand, satisfactory convergence rates and proper solutions were found in other sample size settings (NC = 30, 50, 100, 150, 200, 500; CS = 5, 10, 20) when MLGCM and MM were applied to analyze simulated data (above 98.61%). D-LGCM and SLGCM performed similarly to one another and both had convergence rates equal to 100% and proper solutions across all sample size settings. Next we present the results of parameter estimates and *SE*s under the various simulation conditions, excluding NC = 10.

### Parameter estimates and *SEs* in the mean structure

The following table presents the parameter estimates and *SE*s for two parameters (*α* _{1} and *α* _{2}) as well as their corresponding relative biases in the mean structure derived from four approaches, where *α* _{1} denotes average initial mathematics literacy score and *α* _{2} signifies the average linear trajectory of mathematics literacy scores. Note that MLGCM estimated parameters in the mean structure at the between-level, while the other three approaches estimated parameters at the within-level. We found MLGCM was capable of generating fairly accurate estimates of the intercept factor mean (*α* _{1}; ranging from 96.20 to 96.30; |relative biases| = 0.00) and slope factor mean (*α* _{2}; ranging from 9.93 to 9.96; |relative biases| = 0.00) across all sample size scenarios. Moreover, all *SE*s derived from MLGCM were not biased except for a few trivially biased ones (|relative biases| ranged from 0.00 to 0.06) across all sample size scenarios.

Parameter settings in the mean structure of the population model and parameter estimates using different approaches

| NC | CS | *α* _{1} Est. | *α* _{1} Est. Bias | *α* _{1} *SE* | *α* _{1} *SE* Bias | *α* _{2} Est. | *α* _{2} Est. Bias | *α* _{2} *SE* | *α* _{2} *SE* Bias |
|---|---|---|---|---|---|---|---|---|---|
| MLGCM | | | | | | | | | |
| 30 | 5 | 96.20 | 0.00 | 1.34 | -0.03 | 9.94 | 0.00 | 0.44 | -0.02 |
| 30 | 10 | 96.26 | 0.00 | 1.17 | -0.03 | 9.94 | 0.00 | 0.38 | -0.03 |
| 30 | 20 | 96.27 | 0.00 | 1.07 | -0.02 | 9.96 | 0.00 | 0.34 | -0.03 |
| 50 | 5 | 96.23 | 0.00 | 1.05 | 0.02 | 9.95 | 0.00 | 0.35 | 0.00 |
| 50 | 10 | 96.26 | 0.00 | 0.92 | 0.01 | 9.94 | 0.00 | 0.30 | 0.00 |
| 50 | 20 | 96.30 | 0.00 | 0.84 | -0.01 | 9.96 | 0.00 | 0.27 | -0.04 |
| 100 | 5 | 96.28 | 0.00 | 0.75 | 0.00 | 9.94 | 0.00 | 0.25 | 0.00 |
| 100 | 10 | 96.27 | 0.00 | 0.65 | 0.03 | 9.93 | 0.00 | 0.21 | 0.00 |
| 100 | 20 | 96.29 | 0.00 | 0.60 | 0.02 | 9.95 | 0.00 | 0.19 | 0.00 |
| 150 | 5 | 96.28 | 0.00 | 0.61 | 0.00 | 9.94 | 0.00 | 0.20 | 0.00 |
| 150 | 10 | 96.27 | 0.00 | 0.53 | | 9.93 | 0.00 | 0.17 | 0.00 |
| 150 | 20 | 96.30 | 0.00 | 0.49 | 0.00 | 9.95 | 0.00 | 0.15 | |
| 200 | 5 | 96.27 | 0.00 | 0.53 | -0.02 | 9.94 | 0.00 | 0.18 | 0.00 |
| 200 | 10 | 96.27 | 0.00 | 0.46 | 0.02 | 9.93 | 0.00 | 0.15 | 0.00 |
| 200 | 20 | 96.29 | 0.00 | 0.42 | | 9.95 | 0.00 | 0.13 | 0.00 |
| 500 | 5 | 96.27 | 0.00 | 0.34 | 0.03 | 9.94 | 0.00 | 0.11 | 0.00 |
| 500 | 10 | 96.28 | 0.00 | 0.29 | 0.04 | 9.94 | 0.00 | 0.09 | 0.00 |
| 500 | 20 | 96.30 | 0.00 | 0.27 | 0.00 | 9.94 | 0.00 | 0.08 | 0.00 |
| MM | | | | | | | | | |
| 30 | 5 | 96.21 | 0.00 | 1.46 | | 9.95 | 0.00 | 0.48 | |
| 30 | 10 | 96.27 | 0.00 | 1.24 | 0.03 | 9.93 | 0.00 | 0.40 | 0.03 |
| 30 | 20 | 96.31 | 0.00 | 1.13 | 0.04 | 9.97 | 0.00 | 0.38 | |
| 50 | 5 | 96.24 | 0.00 | 1.09 | | 9.95 | 0.00 | 0.35 | 0.00 |
| 50 | 10 | 96.26 | 0.00 | 0.92 | 0.01 | 9.93 | 0.00 | 0.30 | 0.00 |
| 50 | 20 | 96.29 | 0.00 | 0.84 | -0.01 | 9.96 | 0.00 | 0.27 | -0.04 |
| 100 | 5 | 96.28 | 0.00 | 0.75 | 0.00 | 9.94 | 0.00 | 0.25 | 0.00 |
| 100 | 10 | 96.27 | 0.00 | 0.65 | 0.03 | 9.93 | 0.00 | 0.21 | 0.00 |
| 100 | 20 | 96.29 | 0.00 | 0.59 | 0.00 | 9.95 | 0.00 | 0.19 | 0.00 |
| 150 | 5 | 96.28 | 0.00 | 0.61 | 0.00 | 9.94 | 0.00 | 0.20 | 0.00 |
| 150 | 10 | 96.27 | 0.00 | 0.53 | | 9.93 | 0.00 | 0.17 | 0.00 |
| 150 | 20 | 96.30 | 0.00 | 0.49 | 0.00 | 9.95 | 0.00 | 0.15 | |
| 200 | 5 | 96.29 | 0.00 | 0.53 | -0.02 | 9.94 | 0.00 | 0.18 | 0.00 |
| 200 | 10 | 96.27 | 0.00 | 0.46 | 0.02 | 9.93 | 0.00 | 0.15 | 0.00 |
| 200 | 20 | 96.29 | 0.00 | 0.42 | | 9.95 | 0.00 | 0.13 | 0.00 |
| 500 | 5 | 96.27 | 0.00 | 0.34 | 0.03 | 9.94 | 0.00 | 0.11 | 0.00 |
| 500 | 10 | 96.29 | 0.00 | 0.29 | 0.04 | 9.94 | 0.00 | 0.09 | 0.00 |
| 500 | 20 | 96.30 | 0.00 | 0.27 | 0.00 | 9.94 | 0.00 | 0.08 | 0.00 |
| D-LGCM | | | | | | | | | |
| 30 | 5 | 96.21 | 0.00 | 1.37 | -0.01 | 9.94 | 0.00 | 0.45 | 0.00 |
| 30 | 10 | 96.26 | 0.00 | 1.19 | -0.01 | 9.94 | 0.00 | 0.38 | -0.03 |
| 30 | 20 | 96.29 | 0.00 | 1.09 | 0.00 | 9.95 | 0.00 | 0.35 | 0.00 |
| 50 | 5 | 96.23 | 0.00 | 1.06 | 0.03 | 9.95 | 0.00 | 0.35 | 0.00 |
| 50 | 10 | 96.26 | 0.00 | 0.93 | 0.02 | 9.94 | 0.00 | 0.30 | 0.00 |
| 50 | 20 | 96.30 | 0.00 | 0.84 | -0.01 | 9.96 | 0.00 | 0.27 | -0.04 |
| 100 | 5 | 96.28 | 0.00 | 0.75 | 0.00 | 9.94 | 0.00 | 0.25 | 0.00 |
| 100 | 10 | 96.27 | 0.00 | 0.66 | | 9.93 | 0.00 | 0.21 | 0.00 |
| 100 | 20 | 96.29 | 0.00 | 0.60 | 0.02 | 9.95 | 0.00 | 0.19 | 0.00 |
| 150 | 5 | 96.27 | 0.00 | 0.62 | 0.02 | 9.94 | 0.00 | 0.20 | 0.00 |
| 150 | 10 | 96.27 | 0.00 | 0.53 | | 9.93 | 0.00 | 0.17 | 0.00 |
| 150 | 20 | 96.30 | 0.00 | 0.49 | 0.00 | 9.95 | 0.00 | 0.16 | 0.00 |
| 200 | 5 | 96.27 | 0.00 | 0.53 | -0.02 | 9.94 | 0.00 | 0.18 | 0.00 |
| 200 | 10 | 96.27 | 0.00 | 0.46 | 0.02 | 9.93 | 0.00 | 0.15 | 0.00 |
| 200 | 20 | 96.29 | 0.00 | 0.42 | | 9.95 | 0.00 | 0.13 | 0.00 |
| 500 | 5 | 96.27 | 0.00 | 0.34 | 0.03 | 9.94 | 0.00 | 0.11 | 0.00 |
| 500 | 10 | 96.28 | 0.00 | 0.29 | 0.04 | 9.94 | 0.00 | 0.09 | 0.00 |
| 500 | 20 | 96.30 | 0.00 | 0.27 | 0.00 | 9.94 | 0.00 | 0.08 | 0.00 |
| SLGCM | | | | | | | | | |
| 30 | 5 | 96.21 | 0.00 | 1.06 | 0.03 | 9.94 | 0.00 | 0.36 | 0.03 |
| 30 | 10 | 96.28 | 0.00 | 0.75 | | 9.94 | 0.00 | 0.25 | |
| 30 | 20 | 96.30 | 0.00 | 0.53 | | 9.97 | 0.00 | 0.18 | |
| 50 | 5 | 96.23 | 0.00 | 0.82 | | 9.95 | 0.00 | 0.28 | |
| 50 | 10 | 96.26 | 0.00 | 0.59 | | 9.94 | 0.00 | 0.20 | |
| 50 | 20 | 96.30 | 0.00 | 0.41 | | 9.96 | 0.00 | 0.14 | |
| 100 | 5 | 96.28 | 0.00 | 0.58 | | 9.94 | 0.00 | 0.20 | |
| 100 | 10 | 96.27 | 0.00 | 0.41 | | 9.93 | 0.00 | 0.14 | |
| 100 | 20 | 96.29 | 0.00 | 0.29 | | 9.95 | 0.00 | 0.10 | |
| 150 | 5 | 96.27 | 0.00 | 0.48 | | 9.94 | 0.00 | 0.16 | |
| 150 | 10 | 96.27 | 0.00 | 0.34 | | 9.93 | 0.00 | 0.11 | |
| 150 | 20 | 96.30 | 0.00 | 0.24 | | 9.95 | 0.00 | 0.08 | |
| 200 | 5 | 96.27 | 0.00 | 0.41 | | 9.94 | 0.00 | 0.14 | |
| 200 | 10 | 96.27 | 0.00 | 0.29 | | 9.93 | 0.00 | 0.10 | |
| 200 | 20 | 96.29 | 0.00 | 0.21 | | 9.95 | 0.00 | 0.07 | |
| 500 | 5 | 96.27 | 0.00 | 0.26 | | 9.94 | 0.00 | 0.09 | |
| 500 | 10 | 96.28 | 0.00 | 0.18 | | 9.94 | 0.00 | 0.06 | |
| 500 | 20 | 96.30 | 0.00 | 0.13 | | 9.94 | 0.00 | 0.04 | |

Likewise, we also found MM and D-LGCM generated unbiased parameter estimates of *α* _{1} and *α* _{2} (|relative biases| = 0.00) and unbiased corresponding *SE*s except for a few trivially biased ones (|relative biases| ranged from 0.00 to 0.09) across all sample size scenarios. That is, these two approaches performed in the same way as MLGCM did. When SLGCM was applied, the parameter estimates of *α* _{1} and *α* _{2} were unbiased (|relative biases| = 0.00), but their *SE*s were substantially downwardly biased (|relative biases| ranged from 0.03 to 0.52).

### Parameter estimates and *SEs* in the covariance structure of the within-level model

The following table presents the parameter estimates and *SE*s for three parameters in the within-level covariance structure (*π* _{00}, *π* _{11}, and *π* _{10}) derived from four approaches, where *π* _{00} denotes interindividual variability of initial mathematics literacy scores, *π* _{11} signifies interindividual variability of the linear trajectories, and *π* _{10} represents the relationship between initial scores and trajectories. Additionally, the corresponding relative biases for each parameter estimate and *SE* are also presented. When MLGCM was applied, intercept factor variance (*π* _{00}; ranging from 112.41 to 113.37) as well as slope factor variance (*π* _{11}; ranging from 8.71 to 8.89) were accurately estimated (i.e., |relative bias| ranged from 0 to 0.01) across all sample size scenarios. The estimates for the covariance between intercept and slope (*π* _{10}; ranging from 0.30 to 0.77) tended to be positively biased (|relative bias| ranged from 0.03 to 1.41). Our further exploration with ANOVA suggested the estimates of *π* _{10} were trivially explained by the main effects of NC and CS and the interaction effect (*η* ^{2} close to 0). In addition, we found all |relative biases| of *SE*s were less than 0.05 except for a few trivially biased *SE*s (ranging from 0.05 to 0.09). Generally speaking, the results suggested reasonable accuracy of *SE*s.

Parameter settings in the covariance structure of the population within-level model and parameter estimates and *SE* *s* using different approaches

NC | CS | | | | |||||||||

MLGCM | |||||||||||||

Est. | Est. Bias | | | Est. | Est. Bias | | | Est. | Est. Bias | | | ||

30 | 5 | 112.41 | -0.01 | 17.94 | | 8.89 | 0.01 | 2.49 | | 0.77 | | 4.90 | -0.04 |

30 | 10 | 112.46 | -0.01 | 12.21 | -0.03 | 8.73 | 0.00 | 1.71 | -0.03 | 0.42 | | 3.35 | |

30 | 20 | 112.81 | 0.00 | 8.40 | -0.03 | 8.77 | 0.00 | 1.19 | | 0.42 | | 2.32 | |

50 | 5 | 113.17 | 0.00 | 14.20 | -0.02 | 8.71 | -0.01 | 1.96 | -0.02 | 0.42 | | 3.87 | -0.02 |

50 | 10 | 112.78 | 0.00 | 9.60 | -0.02 | 8.77 | 0.00 | 1.34 | -0.02 | 0.30 | | 2.63 | -0.01 |

50 | 20 | 113.32 | 0.00 | 6.63 | -0.03 | 8.77 | 0.00 | 0.93 | | 0.36 | | 1.82 | |

100 | 5 | 113.17 | 0.00 | 10.22 | 0.00 | 8.72 | -0.01 | 1.41 | -0.02 | 0.36 | | 2.77 | -0.02 |

100 | 10 | 113.05 | 0.00 | 6.88 | -0.01 | 8.75 | 0.00 | 0.96 | -0.03 | 0.35 | | 1.87 | -0.02 |

100 | 20 | 113.17 | 0.00 | 4.72 | 0.00 | 8.76 | 0.00 | 0.66 | -0.03 | 0.39 | | 1.29 | |

150 | 5 | 113.24 | 0.00 | 8.39 | -0.01 | 8.73 | 0.00 | 1.16 | -0.02 | 0.40 | | 2.27 | -0.03 |

150 | 10 | 113.14 | 0.00 | 5.62 | 0.02 | 8.75 | 0.00 | 0.78 | -0.03 | 0.33 | 0.03 | 1.53 | -0.04 |

150 | 20 | 113.11 | 0.00 | 3.87 | 0.00 | 8.76 | 0.00 | 0.54 | -0.04 | 0.35 | | 1.06 | |

200 | 5 | 113.37 | 0.00 | 7.29 | 0.00 | 8.75 | 0.00 | 1.00 | 0.01 | 0.40 | | 1.98 | -0.02 |

200 | 10 | 113.16 | 0.00 | 4.87 | 0.02 | 8.77 | 0.00 | 0.68 | 0.00 | 0.36 | | 1.33 | -0.04 |

200 | 20 | 113.16 | 0.00 | 3.36 | 0.01 | 8.76 | 0.00 | 0.47 | -0.04 | 0.36 | | 0.92 | |

500 | 5 | 113.12 | 0.00 | 4.61 | -0.02 | 8.76 | 0.00 | 0.64 | 0.02 | 0.35 | | 1.25 | |

500 | 10 | 113.17 | 0.00 | 3.09 | -0.02 | 8.76 | 0.00 | 0.43 | 0.00 | 0.34 | | 0.84 | |

500 | 20 | 113.25 | 0.00 | 2.13 | 0.03 | 8.77 | 0.00 | 0.30 | -0.03 | 0.33 | 0.03 | 0.58 | -0.03 |

MM

NC | CS | *π* _{00} Est. | *π* _{00} Est. Bias | *π* _{00} *SE* | *π* _{00} *SE* Bias | *π* _{11} Est. | *π* _{11} Est. Bias | *π* _{11} *SE* | *π* _{11} *SE* Bias | *π* _{10} Est. | *π* _{10} Est. Bias | *π* _{10} *SE* | *π* _{10} *SE* Bias |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
30 | 5 | 112.51 | -0.01 | 17.72 | | 8.85 | 0.01 | 2.50 | | 0.35 | | 4.86 | -0.04 |
30 | 10 | 113.21 | 0.00 | 12.14 | -0.04 | 8.83 | 0.01 | 1.70 | -0.04 | 0.19 | | 3.35 | |
30 | 20 | 113.22 | 0.00 | 8.44 | -0.02 | 8.84 | 0.01 | 1.19 | | 0.29 | | 2.31 | |
50 | 5 | 113.57 | 0.00 | 14.13 | -0.03 | 8.88 | 0.01 | 1.97 | -0.02 | 0.19 | | 3.87 | -0.02 |
50 | 10 | 113.13 | 0.00 | 9.59 | -0.02 | 8.88 | 0.01 | 1.35 | -0.01 | 0.13 | | 2.63 | -0.01 |
50 | 20 | 113.52 | 0.00 | 6.63 | -0.03 | 8.83 | 0.01 | 0.93 | -0.05 | 0.27 | | 1.82 | |
100 | 5 | 113.62 | 0.00 | 10.22 | 0.00 | 8.88 | 0.01 | 1.43 | -0.01 | 0.14 | | 2.79 | -0.02 |
100 | 10 | 113.34 | 0.00 | 6.88 | -0.01 | 8.83 | 0.01 | 0.97 | -0.02 | 0.22 | | 1.88 | -0.02 |
100 | 20 | 113.31 | 0.00 | 4.72 | 0.00 | 8.80 | 0.00 | 0.67 | -0.01 | 0.33 | 0.03 | 1.29 | |
150 | 5 | 113.67 | 0.00 | 8.40 | -0.01 | 8.86 | 0.01 | 1.17 | -0.01 | 0.20 | | 2.29 | -0.02 |
150 | 10 | 113.40 | 0.00 | 5.62 | 0.02 | 8.82 | 0.01 | 0.79 | -0.01 | 0.22 | | 1.54 | -0.04 |
150 | 20 | 113.23 | 0.00 | 3.87 | 0.00 | 8.80 | 0.00 | 0.55 | -0.02 | 0.30 | | 1.06 | |
200 | 5 | 113.77 | 0.00 | 7.33 | 0.01 | 8.87 | 0.01 | 1.02 | 0.03 | 0.22 | | 1.99 | -0.01 |
200 | 10 | 113.37 | 0.00 | 4.88 | 0.02 | 8.83 | 0.01 | 0.69 | 0.01 | 0.27 | | 1.34 | -0.03 |
200 | 20 | 113.26 | 0.00 | 3.36 | 0.01 | 8.79 | 0.00 | 0.47 | -0.04 | 0.31 | -0.03 | 0.92 | |
500 | 5 | 113.43 | 0.00 | 4.62 | -0.02 | 8.85 | 0.01 | 0.65 | 0.03 | 0.22 | | 1.27 | -0.03 |
500 | 10 | 113.32 | 0.00 | 3.09 | -0.02 | 8.80 | 0.00 | 0.44 | 0.02 | 0.28 | | 0.89 | -0.01 |
500 | 20 | 113.32 | 0.00 | 2.13 | 0.03 | 8.79 | 0.00 | 0.30 | -0.03 | 0.30 | | 0.58 | -0.03 |

D-LGCM

NC | CS | *π* _{00} Est. | *π* _{00} Est. Bias | *π* _{00} *SE* | *π* _{00} *SE* Bias | *π* _{11} Est. | *π* _{11} Est. Bias | *π* _{11} *SE* | *π* _{11} *SE* Bias | *π* _{10} Est. | *π* _{10} Est. Bias | *π* _{10} *SE* | *π* _{10} *SE* Bias |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
30 | 5 | 140.42 | | 20.57 | | 11.44 | | 2.68 | 0.02 | 3.15 | | 5.42 | |
30 | 10 | 140.49 | | 15.45 | | 11.37 | | 1.97 | | 2.86 | | 3.99 | |
30 | 20 | 141.00 | | 12.00 | | 11.47 | | 1.49 | | 2.81 | | 3.01 | |
50 | 5 | 140.98 | | 16.10 | | 11.48 | | 2.10 | 0.04 | 3.02 | | 4.22 | |
50 | 10 | 141.13 | | 12.13 | | 11.49 | | 1.54 | | 2.90 | | 3.13 | |
50 | 20 | 141.58 | | 9.45 | | 11.50 | | 1.16 | | 2.87 | | 2.37 | |
100 | 5 | 141.13 | | 11.52 | | 11.50 | | 1.50 | 0.04 | 3.01 | | 3.00 | |
100 | 10 | 141.74 | | 8.70 | | 11.54 | | 1.10 | | 2.97 | | 2.23 | |
100 | 20 | 141.76 | | 6.74 | | 11.51 | | 0.82 | | 2.93 | | 1.69 | |
150 | 5 | 141.56 | | 9.46 | | 11.52 | | 1.23 | 0.04 | 3.09 | | 2.46 | |
150 | 10 | 141.77 | | 7.11 | | 11.52 | | 0.90 | | 2.94 | | 1.82 | |
150 | 20 | 141.72 | | 5.52 | | 11.52 | | 0.67 | | 2.90 | | 1.38 | |
200 | 5 | 141.69 | | 8.21 | | 11.53 | | 1.07 | | 3.05 | | 2.13 | |
200 | 10 | 141.90 | | 6.17 | | 11.54 | | 0.78 | | 2.97 | | 1.58 | |
200 | 20 | 141.77 | | 4.79 | | 11.53 | | 0.58 | | 2.91 | | 1.20 | |
500 | 5 | 141.83 | | 5.21 | | 11.54 | | 0.68 | | 2.97 | | 1.35 | 0.03 |
500 | 10 | 141.95 | | 3.90 | | 11.56 | | 0.50 | | 2.93 | | 1.00 | |
500 | 20 | 142.01 | | 3.05 | | 11.55 | | 0.37 | | 2.92 | | 0.76 | |

SLGCM

NC | CS | *π* _{00} Est. | *π* _{00} Est. Bias | *π* _{00} *SE* | *π* _{00} *SE* Bias | *π* _{11} Est. | *π* _{11} Est. Bias | *π* _{11} *SE* | *π* _{11} *SE* Bias | *π* _{10} Est. | *π* _{10} Est. Bias | *π* _{10} *SE* | *π* _{10} *SE* Bias |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
30 | 5 | 140.33 | | 20.08 | | 11.32 | | 2.68 | 0.02 | 3.12 | | 5.26 | 0.04 |
30 | 10 | 141.08 | | 14.25 | | 11.41 | | 1.90 | | 2.93 | | 3.73 | |
30 | 20 | 140.84 | | 10.06 | | 11.41 | | 1.34 | | 2.84 | | 2.64 | |
50 | 5 | 140.98 | | 15.59 | | 11.48 | | 2.08 | 0.03 | 3.02 | | 4.08 | 0.04 |
50 | 10 | 141.13 | | 11.04 | | 11.49 | | 1.47 | | 2.90 | | 2.89 | |
50 | 20 | 141.58 | | 7.83 | | 11.50 | | 1.04 | | 2.87 | | 2.05 | 0.04 |
100 | 5 | 141.13 | | 11.04 | | 11.50 | | 1.48 | 0.03 | 3.01 | | 2.90 | 0.02 |
100 | 10 | 141.74 | | 7.83 | | 11.54 | | 1.04 | | 2.97 | | 2.05 | |
100 | 20 | 141.76 | | 5.54 | | 11.51 | | 0.74 | | 2.93 | | 1.45 | |
150 | 5 | 141.56 | | 9.04 | | 11.52 | | 1.21 | 0.03 | 3.09 | | 2.37 | 0.02 |
150 | 10 | 141.77 | | 6.40 | | 11.52 | | 0.85 | | 2.94 | | 1.67 | 0.04 |
150 | 20 | 141.72 | | 4.52 | | 11.52 | | 0.60 | | 2.90 | | 1.18 | 0.02 |
200 | 5 | 141.69 | | 7.83 | | 11.53 | | 1.04 | | 3.05 | | 2.05 | 0.01 |
200 | 10 | 141.90 | | 5.54 | | 11.54 | | 0.74 | | 2.97 | | 1.45 | |
200 | 20 | 141.77 | | 3.92 | | 11.53 | | 0.52 | | 2.91 | | 1.03 | 0.03 |
500 | 5 | 141.83 | | 4.96 | | 11.54 | | 0.66 | | 2.97 | | 1.30 | -0.01 |
500 | 10 | 141.95 | | 3.51 | | 11.56 | | 0.47 | | 2.93 | | 0.92 | 0.02 |
500 | 20 | 142.01 | | 2.48 | | 11.55 | | 0.33 | | 2.92 | | 0.65 | |

MM performed similarly to MLGCM except for the parameter estimates of *π* _{10}. Results showed reasonably accurate estimates of *π* _{00} (ranging from 112.51 to 113.77) and *π* _{11} (ranging from 8.79 to 8.88), as well as reasonably accurate *SE*s (|relative biases| ≤ 0.06). However, *π* _{10} tended to be underestimated (ranging from 0.13 to 0.35) across all sample size scenarios, with |relative biases| ranging from 0.03 to 0.59. Further ANOVA explorations suggested that the variation in the estimated *π* _{10} was not explained by the main effects of NC and CS or by their interaction (*η* ^{2} close to 0).

Both D-LGCM and SLGCM approaches produced similar and severely biased parameter estimates of *π* _{00}, *π* _{11}, and *π* _{10}. For example, the parameter estimates of within-level intercept factor variance (*π* _{00}) derived from the two approaches ranged from 140.42 to 142.01 and were upwardly biased (|relative biases| ranged from 0.24 to 0.25). It should be noted that the estimated within-level intercept factor variance was approximately equal to the sum of the population intercept factor variance within level (113.30) and between level (28.86). Apparently, intercept factor variance in the between-level model was redistributed to the within-level model. Similarly, the within-level slope factor variance (*π* _{11} ranged from 11.32 to 11.56) estimated by both approaches was also upwardly biased (|relative biases| ranged from 0.29 to 0.32) and close to the sum of the population slope factor variance in the within-level (8.77) and between-level (2.80) models. In the same manner, the within-level covariance between intercept and slope (*π* _{10}) estimated by both approaches was also upwardly biased and the between-level component was also redistributed to the within-level. Additionally, *SE*s obtained by SLGCM and D-LGCM were also considerably biased and the latter approach tended to generate more biased *SE*s.
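The redistribution pattern described above can be checked with simple arithmetic. The following Python sketch is illustrative only and is not part of the original analysis: it assumes that relative bias is defined as (estimate − population) / population, and the names `POPULATION`, `ESTIMATE_RANGE`, `relative_bias`, and `summarize` are hypothetical. It uses only the population values and the estimate ranges quoted in the preceding paragraph.

```python
# Population values quoted in the text: within-level (pi) and between-level (gamma).
POPULATION = {
    "intercept variance": {"within": 113.30, "between": 28.86},
    "slope variance": {"within": 8.77, "between": 2.80},
}

# Smallest and largest within-level estimates quoted for D-LGCM and SLGCM.
ESTIMATE_RANGE = {
    "intercept variance": (140.42, 142.01),
    "slope variance": (11.32, 11.56),
}


def relative_bias(estimate: float, population: float) -> float:
    """Assumed definition: (estimate - population) / population."""
    return (estimate - population) / population


def summarize() -> dict:
    """Pooled population variance and relative bias at both ends of each range."""
    summary = {}
    for name, pop in POPULATION.items():
        low, high = ESTIMATE_RANGE[name]
        summary[name] = {
            "pooled": round(pop["within"] + pop["between"], 2),
            "bias_low": round(relative_bias(low, pop["within"]), 2),
            "bias_high": round(relative_bias(high, pop["within"]), 2),
        }
    return summary


if __name__ == "__main__":
    for name, row in summarize().items():
        print(name, row)
```

Under this assumed definition, the pooled population variances (142.16 and 11.57) lie just above the upper ends of the quoted estimate ranges, and the computed relative biases agree with the reported 0.24 to 0.25 and 0.29 to 0.32.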

### Parameter estimates and *SE*s in the between-level covariance structure

The parameters in the between-level covariance structure (*γ* _{00}, *γ* _{11}, and *γ* _{10}) could be estimated only by the MLGCM, and hence these three parameters were not the focus of the study. We present the parameter estimates and *SE*s of these parameters, as well as the corresponding relative biases, in Table 3. These results can inform readers about the performance of MLGCM under the different sample size conditions. In our study, *γ* _{00} denotes the variability of initial mathematics literacy scores across cluster units (e.g., schools), *γ* _{11} signifies the variability of linear trajectories across cluster units, and *γ* _{10} represents the relationship between the cluster-level initial mathematics literacy score and the cluster-level linear trajectory. The estimates for the intercept factor variance (*γ* _{00}; ranging from 27.64 to 28.79), slope factor variance (*γ* _{11}; ranging from 2.69 to 2.80), and covariance between intercept and slope (*γ* _{10}; ranging from 2.49 to 2.69) were fairly accurate across all sample size scenarios. Similarly, we found the *SE*s to be reasonably accurate across all sample size scenarios.

Parameter settings in the covariance structure of the population between-level model and parameter estimates using MLGCM

NC | CS | *γ* _{00} Est. | *γ* _{00} Est. Bias | *γ* _{00} *SE* | *γ* _{00} *SE* Bias | *γ* _{11} Est. | *γ* _{11} Est. Bias | *γ* _{11} *SE* | *γ* _{11} *SE* Bias | *γ* _{10} Est. | *γ* _{10} Est. Bias | *γ* _{10} *SE* | *γ* _{10} *SE* Bias |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
30 | 5 | 27.64 | -0.04 | 14.17 | 0.00 | 2.76 | -0.01 | 1.56 | | 2.49 | | 3.43 | |
30 | 10 | 28.13 | -0.03 | 10.28 | | 2.69 | -0.04 | 1.07 | -0.04 | 2.52 | -0.03 | 2.42 | |
30 | 20 | 27.95 | -0.03 | 8.51 | | 2.71 | -0.03 | 0.86 | | 2.51 | -0.04 | 1.97 | |
50 | 5 | 27.85 | -0.03 | 11.08 | -0.01 | 2.77 | -0.01 | 1.21 | -0.03 | 2.59 | -0.01 | 2.65 | 0.04 |
50 | 10 | 28.40 | -0.02 | 8.22 | -0.03 | 2.73 | -0.02 | 0.86 | -0.04 | 2.59 | -0.01 | 1.91 | |
50 | 20 | 28.28 | -0.02 | 6.81 | | 2.73 | -0.02 | 0.69 | 0.00 | 2.50 | -0.04 | 1.57 | |
100 | 5 | 27.99 | -0.03 | 8.03 | -0.01 | 2.78 | -0.01 | 0.88 | -0.02 | 2.65 | 0.02 | 1.90 | 0.03 |
100 | 10 | 28.73 | 0.00 | 6.00 | -0.03 | 2.80 | 0.00 | 0.63 | 0.02 | 2.61 | 0.00 | 1.38 | -0.01 |
100 | 20 | 28.60 | -0.01 | 4.95 | | 2.75 | -0.02 | 0.50 | 0.00 | 2.54 | -0.03 | 1.13 | |
150 | 5 | 28.33 | -0.02 | 6.64 | 0.00 | 2.79 | 0.00 | 0.73 | 0.04 | 2.69 | 0.03 | 1.57 | 0.02 |
150 | 10 | 28.65 | -0.01 | 4.93 | 0.02 | 2.77 | -0.01 | 0.51 | 0.02 | 2.60 | 0.00 | 1.13 | -0.01 |
150 | 20 | 28.61 | -0.01 | 4.06 | -0.02 | 2.76 | -0.01 | 0.41 | 0.02 | 2.54 | -0.03 | 0.93 | -0.04 |
200 | 5 | 28.32 | -0.02 | 5.78 | 0.01 | 2.78 | -0.01 | 0.63 | 0.03 | 2.66 | 0.02 | 1.36 | -0.01 |
200 | 10 | 28.75 | 0.00 | 4.29 | 0.00 | 2.77 | -0.01 | 0.44 | 0.00 | 2.60 | 0.00 | 0.98 | -0.02 |
200 | 20 | 28.61 | -0.01 | 3.53 | -0.03 | 2.77 | -0.01 | 0.36 | 0.00 | 2.55 | -0.02 | 0.81 | -0.04 |
500 | 5 | 28.75 | 0.00 | 3.70 | 0.01 | 2.79 | 0.00 | 0.40 | 0.00 | 2.61 | 0.00 | 0.87 | -0.02 |
500 | 10 | 28.79 | 0.00 | 2.73 | 0.02 | 2.80 | 0.00 | 0.28 | 0.04 | 2.59 | -0.01 | 0.63 | 0.02 |
500 | 20 | 28.75 | 0.00 | 2.26 | -0.03 | 2.78 | -0.01 | 0.23 | 0.00 | 2.59 | -0.01 | 0.52 | 0.00 |

## Discussion and conclusion

The current study investigated the effectiveness of two model-based approaches (MLGCM and MM) and one design-based approach (D-LGCM) that had the potential to accommodate individual dependency in longitudinal data. The accuracy of parameter estimates and *SE*s related to intraindividual growth (*α* _{1} and *α* _{2}) and interindividual variability (*π* _{00}, *π* _{11}, and *π* _{10}) was evaluated through their corresponding relative bias. Moreover, the model solutions of SLGCM were also presented to acknowledge the consequences of ignoring individual dependency.

### Intraindividual growth

In the current study, intraindividual growth is indicated by two parameters in the mean structure, namely *α* _{1} and *α* _{2}. As previously mentioned, parameters in the mean structure were estimated in the between-level model by MLGCM and in the within-level model by MM and D-LGCM. The results showed that MLGCM can produce accurate estimates of *α* _{1} and *α* _{2}, as well as accurate *SE*s, in all simulation scenarios. This finding is in line with Wu et al. (2015) and Muthén (1997), who also suggested that MLGCM was able to estimate intraindividual growth parameters accurately. In terms of the sample size required for accurate estimates, our study suggested that MLGCM can perform satisfactorily even with a small NC (30) and CS (5). Similarly, the estimates of *α* _{1} and *α* _{2} and the corresponding *SE*s derived from MM and D-LGCM can also be considered trustworthy. Thus, when intraindividual growth is of interest, researchers are free to implement MLGCM, MM, or D-LGCM on clustered longitudinal data.

In contrast, we found that although SLGCM provided unbiased estimates of *α* _{1} and *α* _{2}, the *SE*s were seriously underestimated. Our findings on SLGCM are consistent with those reported by Muthén (1997). Note that a downwardly biased *SE* inflates the Type I error rate of inferential tests. As a result, when the average initial score and the average linear trajectory are of primary interest, we recommend that researchers consider multilevel approaches, either model-based or design-based, that take into account the individual dependency in the longitudinal data.
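To make the link between underestimated *SE*s and Type I error concrete, the sketch below computes the actual rejection rate of a nominal two-sided *z* test when the reported *SE* is a fixed fraction of the true sampling standard deviation. This is an illustrative calculation under a normal approximation, not a result from the simulation; the function name `actual_type1_rate` and the ratio passed to it are hypothetical.

```python
from statistics import NormalDist


def actual_type1_rate(se_ratio: float, nominal_alpha: float = 0.05) -> float:
    """Rejection rate under a true null when reported SE = se_ratio * true SE.

    Assumes the estimate is normally distributed around the null value and
    that a two-sided z test is run at the nominal alpha level.
    """
    if se_ratio <= 0:
        raise ValueError("se_ratio must be positive")
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - nominal_alpha / 2)
    # The test rejects when |estimate| / reported_SE > critical, which is the
    # same as |Z| > critical * se_ratio for a standard normal Z.
    return 2 * (1 - std_normal.cdf(critical * se_ratio))


if __name__ == "__main__":
    for ratio in (1.0, 0.75, 0.5):
        print(f"reported/true SE = {ratio:.2f}: "
              f"Type I error = {actual_type1_rate(ratio):.3f}")
```

Under these assumptions, an unbiased *SE* (ratio of 1) recovers the nominal 5% rate, whereas a reported *SE* half the size of the true one pushes the rejection rate to roughly one in three.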

### Interindividual variability

Parameter estimates of *π* _{00}, *π* _{11}, and *π* _{10} and their *SE*s in the individual-level covariance structure were used to determine interindividual variability. We found that MLGCM provided accurate parameter estimates and *SE*s for *π* _{00} and *π* _{11} regardless of the design factors (NC and CS). Therefore, we suggest that MLGCM can be used to analyze interindividual variability in initial scores and in linear trajectories. Furthermore, our results showed that MLGCM tended to overestimate *π* _{10} (the relationship between initial score and trajectory), but its *SE*s were not biased. Note that *π* _{10} was set to be near zero (0.32) in the population model (i.e., it converts to a correlation of .01, a negligible effect size). None of the estimates of *π* _{10} was statistically significant at the *α* = .05 level in any simulated replication, and the ANOVA results did not indicate that the estimates of *π* _{10} were associated with NC or CS, suggesting no systematic pattern. We recommend that the upwardly biased estimates of *π* _{10} be re-examined in future studies.
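The conversion of the population covariance into a correlation can be reproduced in a few lines. The sketch below is illustrative: it divides the covariance by the product of the two standard deviations, using the within-level population variances quoted in the Results (113.30 for the intercept factor and 8.77 for the slope factor). The function name `covariance_to_correlation` is hypothetical.

```python
import math


def covariance_to_correlation(cov: float, var_a: float, var_b: float) -> float:
    """Correlation implied by a covariance and the two corresponding variances."""
    if var_a <= 0 or var_b <= 0:
        raise ValueError("variances must be positive")
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    # Within-level population values quoted in the text.
    r = covariance_to_correlation(cov=0.32, var_a=113.30, var_b=8.77)
    print(round(r, 2))  # 0.01
```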

MM performed much like MLGCM except for the parameter estimates of *π* _{10}. In general, this approach provided unbiased parameter estimates and *SE*s of *π* _{00} and *π* _{11}. Unlike MLGCM, MM was more likely to underestimate *π* _{10}, but the corresponding *SE*s were still unbiased. As previously mentioned, the effect size of *π* _{10} in the population model was extremely small, and we did not find any statistically significant estimates of *π* _{10} in our simulated replications. Moreover, the ANOVA results showed that the estimates of *π* _{10} were not accounted for by NC and CS. Future studies can further examine the downwardly biased estimates of *π* _{10} in MM. Generally speaking, if researchers are interested in studying interindividual variability in initial scores and in linear trajectories, both MLGCM and MM are capable of producing accurate parameter estimates and *SE*s and thus can be adopted for such analyses.

It must be noted that when D-LGCM and SLGCM were used to study interindividual variability, the parameter estimates of *π* _{00}, *π* _{11}, and *π* _{10} did not consist exclusively of the variability in individual (e.g., student) growth but instead comprised the combined variability of individual and cluster (e.g., school) growth, which cannot be interpreted. In line with Muthén (1997), we found that when individual dependency was ignored, the factor variance or covariance belonging to the between level was redistributed to the corresponding within-level parameter (*π* _{00}, *π* _{11}, or *π* _{10}). Therefore, unless the variability across cluster units in the between-level model (i.e., *γ* _{00}, *γ* _{11}, and *γ* _{10}) is close or equal to zero in the population, SLGCM and D-LGCM can reduce the accuracy of the parameter estimates of *π* _{00}, *π* _{11}, and *π* _{10} to some degree. More specifically, because the intercept variances (*π* _{00} and *γ* _{00}) and slope variances (*π* _{11} and *γ* _{11}) at both levels cannot be negative, the within-level *π* _{00} (and *π* _{11}) estimated by D-LGCM and SLGCM will be upwardly biased unless the ignored *γ* _{00} (and *γ* _{11}) at the between level of the population model is close or equal to zero.

On the other hand, *π* _{10} at the within level can be either upwardly or downwardly biased because the ignored *γ* _{10} at the between level of the population model can be either positive or negative. Biased estimates of *π* _{10} can lead to a misunderstanding of the relationship between individual initial scores and growth trajectories, which can be harmful to substantive research. For example, the idea of the Matthew effect in reading has gained considerable attention in the past few decades (Pfost, Hattie, Dörfler, & Artelt, 2014). The Matthew effect in reading is a positive correlation between initial competence level and individual development, which can be statistically estimated by the parameter *π* _{10}. In other words, an accurate estimate of *π* _{10} is essential to confirm the presence of the Matthew effect. Pfost et al. (2014) reviewed 28 studies focusing on the Matthew effect in reading in a meta-analysis. We reinvestigated the same studies and found that some substantive researchers were not aware that SLGCM and D-LGCM could produce seriously biased estimates of *π* _{10} when clustered longitudinal data were analyzed. More specifically, we found that clustered longitudinal data were used to assess the Matthew effect in 22 studies. However, 16 of the 22 studies failed to take into account the clustering effect (e.g., by using SLGCM), and two of the 22 applied D-LGCM to analyze the data. Only four studies took into account the individual dependency in the data (three used MLGCM; one applied a hierarchical linear modeling approach, which was beyond the scope of our study). Based on our simulation finding that *π* _{10} can be either upwardly or downwardly biased, we raise our concern that those 18 studies might have incorrectly determined the presence or absence of the Matthew effect in reading. Future substantive studies that attempt to understand the relationship between individual initial scores and growth trajectories using clustered longitudinal data but fail to account for the structure appropriately need to be aware of the potential bias introduced.
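The dependence of the bias direction on the sign of *γ* _{10} can be illustrated numerically. The sketch below is a rough approximation, not the authors' procedure: it assumes that an analysis ignoring clustering recovers the sum of the within- and between-level (co)variances, as the simulation results suggest. The within-level values and the between-level variances are those quoted in the Results; the between-level covariances of 2.60, 0, and −2.60 are hypothetical values chosen for illustration (2.60 is merely close to the MLGCM estimates in Table 3), and all names are hypothetical.

```python
import math

# Within-level population values quoted in the text.
WITHIN = {"var_intercept": 113.30, "var_slope": 8.77, "cov": 0.32}


def pooled_correlation(within: dict, between: dict) -> float:
    """Intercept-slope correlation when both levels are summed.

    Assumption: a model that ignores clustering yields within-level
    (co)variances approximately equal to within + between.
    """
    var_intercept = within["var_intercept"] + between["var_intercept"]
    var_slope = within["var_slope"] + between["var_slope"]
    cov = within["cov"] + between["cov"]
    return cov / math.sqrt(var_intercept * var_slope)


def between_level(gamma_10: float) -> dict:
    """Between-level values; gamma_10 is a hypothetical covariance."""
    return {"var_intercept": 28.86, "var_slope": 2.80, "cov": gamma_10}


if __name__ == "__main__":
    true_r = WITHIN["cov"] / math.sqrt(WITHIN["var_intercept"] * WITHIN["var_slope"])
    print(f"within-level correlation: {true_r:.3f}")
    for gamma_10 in (2.60, 0.0, -2.60):
        r = pooled_correlation(WITHIN, between_level(gamma_10))
        print(f"gamma_10 = {gamma_10:+.2f}: pooled correlation = {r:.3f}")
```

Under these assumptions, a positive between-level covariance pulls the apparent intercept-slope correlation upward, whereas a negative one can flip its sign, which is why an apparent Matthew effect (or its absence) may partly reflect school-level rather than student-level differences.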

We further explored whether the chi-square (*χ* ^{2}) statistic could effectively detect models with biased parameter estimates of *π* _{00}, *π* _{11}, and *π* _{10} in our simulated replications of either D-LGCM or SLGCM. Unfortunately, the results showed that only 4.64% and 4.68% of replications for SLGCM and D-LGCM, respectively, were rejected by the *χ* ^{2} statistic at the *α* = .05 level, which is roughly the rejection rate expected by chance for a correctly specified model. In practice, researchers could therefore be misled by the *χ* ^{2} statistic. The take-home message is that D-LGCM does not qualify as an alternative approach for analyzing clustered longitudinal data if interindividual variability is of interest.

What if either D-LGCM or SLGCM were used to analyze clustered longitudinal data? In practice, parameter estimates of interindividual variability often drive researchers to investigate further whether this variability can be explained by individual-level covariates (e.g., gender). Analytically, researchers can regress the intercept and slope factors, whose variances are *π* _{00} and *π* _{11}, on selected individual-level predictors. A previous study showed that the regression weights would not necessarily be biased (Wu et al., 2015); however, the proportion of variance in *π* _{00} and *π* _{11} explained by the individual-level predictors (Raudenbush & Bryk, 2002) could be misleading. Accordingly, using D-LGCM and SLGCM to analyze clustered longitudinal data is not appropriate and should be avoided.

### An empirical example

We present an empirical example to illustrate that MLGCM and MM are capable of producing accurate parameter estimates and *SE*s of interindividual variability, while D-LGCM and SLGCM are not. This example analyzed data drawn from the Longitudinal Study of American Youth (LSAY; Miller et al., 2000), which is a national panel study of mathematics and science education in US public schools. The LSAY has been widely used to study the growth of mathematics and science performance (e.g., Ma & Ma, 2004; Ma & Wilkins, 2007). Following Muthén (2004), we analyzed Cohort 2 data containing 3,102 students nested within 52 schools. Students’ grade 7 to grade 10 mathematics achievement scores obtained by item response theory equating were analyzed by MLGCM (see Fig. 1), MM (see Fig. 2), D-LGCM, and SLGCM, separately. The parameter estimates are presented in Table 4. The M*plus* syntax of the four models is provided in Table 7 of the Appendix.

Results of an empirical example

Model | Mean structure: mean of intercept factor, Est. | *SE* | Mean structure: mean of slope factor, Est. | *SE* | Within-level: variance of intercept factor, Est. | *SE* | Within-level: variance of slope factor, Est. | *SE* | Within-level: covariance between two factors, Est. | *SE* | Between-level: variance of intercept factor, Est. | *SE* | Between-level: variance of slope factor, Est. | *SE* | Between-level: covariance between two factors, Est. | *SE* |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
MLGCM | 49.95 | 0.62 | 4.07 | 0.12 | 70.57 | 3.32 | 3.83 | 0.30 | 6.42 | 0.54 | 17.31 | 3.33 | 0.32 | 0.12 | 1.60 | 0.51 |
MM | 49.78 | 0.60 | 4.04 | 0.12 | 70.60 | 3.32 | 3.83 | 0.30 | 6.41 | 0.54 | - | - | - | - | - | - |
D-LGCM | 50.12 | 0.64 | 4.02 | 0.11 | 87.37 | 4.62 | 4.08 | 0.34 | 8.03 | 0.59 | - | - | - | - | - | - |
SLGCM | 50.12 | 0.18 | 4.02 | 0.05 | 87.37 | 2.58 | 4.08 | 0.22 | 8.03 | 0.55 | - | - | - | - | - | - |

The fit indices of the MLGCM [*χ* ^{2} (8) = 119.957, *p* < .05; Root Mean Square Error of Approximation (*RMSEA*) = .067, Comparative Fit Index (*CFI*) = .984, Tucker-Lewis Index (*TLI*) = .976; Standardized Root Mean Square Residual for within (*SRMR-W*) = .007; *SRMR* for between (*SRMR-B*) = .004] suggested adequate model fit (Hu & Bentler, 1999). Therefore, the parameter estimates provided by the MLGCM were expected to be less biased. The fit indices of the MM also suggested adequate model fit [*χ* ^{2} (5) = 109.873, *p* < .05; *RMSEA* = .082, *CFI* = .985, *TLI* = .963; *SRMR-W* = .007; *SRMR-B* = .003]. Consistent with our simulation findings, the parameter estimates in the mean structure and within-covariance structure produced by MM were similar to those derived by MLGCM. Thus, both MLGCM and MM can be used to investigate intraindividual growth and interindividual variability given clustered longitudinal data.
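As a quick consistency check on the reported fit indices, the *RMSEA* values can be approximately recovered from the *χ* ^{2} statistics, the degrees of freedom, and the sample size of 3,102 students. The sketch below uses the common single-sample formula, sqrt(max(*χ* ^{2} − *df*, 0) / (*df* × (*N* − 1))); whether the software used *N* or *N* − 1, and how it handled the two-level structure, is an assumption here, and the function name `rmsea` is hypothetical.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of RMSEA from a chi-square statistic (common formula)."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


if __name__ == "__main__":
    n_students = 3102  # sample size reported for the LSAY Cohort 2 data
    print("MLGCM:", round(rmsea(119.957, 8, n_students), 3))
    print("MM:   ", round(rmsea(109.873, 5, n_students), 3))
```

With these inputs, the formula gives values that round to the reported .067 and .082, so the fit statistics are internally consistent under this assumed formula.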

D-LGCM produced parameter estimates in the mean structure similar to those derived by MLGCM. However, in line with our simulation findings, when D-LGCM was applied, the parameter estimates in the within-covariance structure were greatly biased because the factor (co)variance at the between level was redistributed to the within level. For example, the estimated within-level intercept factor variance in D-LGCM (87.37) was approximately equal to the sum of the within-level intercept factor variance (70.57) and the between-level intercept factor variance (17.31) in MLGCM. Therefore, researchers should avoid using D-LGCM to study interindividual variability given clustered longitudinal data. When SLGCM was applied, the *SE*s in the mean structure were underestimated compared with those derived by MLGCM. As with D-LGCM, the parameter estimates in the within-covariance structure were seriously biased. Therefore, using SLGCM to analyze clustered longitudinal data is not appropriate.
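The same comparison can be extended to all three (co)variance parameters in Table 4. The sketch below is illustrative and uses only the estimates reported in that table; it sums the MLGCM within- and between-level estimates and compares each sum with the corresponding D-LGCM (and SLGCM) within-level estimate. The dictionary and function names are hypothetical.

```python
# Estimates reported in Table 4.
MLGCM_WITHIN = {"intercept variance": 70.57, "slope variance": 3.83, "covariance": 6.42}
MLGCM_BETWEEN = {"intercept variance": 17.31, "slope variance": 0.32, "covariance": 1.60}
SINGLE_LEVEL = {"intercept variance": 87.37, "slope variance": 4.08, "covariance": 8.03}


def compare() -> dict:
    """Pooled MLGCM estimate versus the D-LGCM/SLGCM within-level estimate."""
    rows = {}
    for name, within in MLGCM_WITHIN.items():
        pooled = within + MLGCM_BETWEEN[name]
        rows[name] = {
            "pooled": round(pooled, 2),
            "single_level": SINGLE_LEVEL[name],
            "difference": round(SINGLE_LEVEL[name] - pooled, 2),
        }
    return rows


if __name__ == "__main__":
    for name, row in compare().items():
        print(name, row)
```

For all three parameters, the D-LGCM and SLGCM estimates fall within about half a unit of the pooled MLGCM values, which matches the redistribution pattern seen in the simulation.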

### Limitations and future research direction

This study has limitations that should be addressed. First, the current study adopted a two-level population model depicting a linear growth trajectory in the between- and within-level models. Therefore, findings should only be generalized to studies that apply similar models. Further studies are needed to determine whether the current findings can be replicated using different models (e.g., a two-level quadratic growth model). Second, we considered a limited number of design factors in the present study. Additional scenarios created by using different design factors, such as unbalanced designs (unequal group conditions) and varying the number of observed indicators per latent factor, are needed in future studies. Third, information regarding the minimum sample size needed for the between-level and within-level models is quite useful for practitioners who attempt to apply an MLGCM to clustered longitudinal data. To the best of our knowledge, no study has been conducted on this issue. Based on our simulation findings, a minimum total sample size of 150 (NC = 30 and CS = 5) is sufficient to attain precise parameter estimates and *SE*s in both the within- and between-covariance structures and in the between-mean structure if the two-level model is correctly specified. Future studies are needed to investigate this sample size topic systematically in MLGCM. Last but not least, the parameter settings in the present simulation study were drawn from an empirical study (Baumert et al., 2012) to increase the generalizability of our findings. Therefore, the variability across cluster units in the between-level model (i.e., *γ* _{00}, *γ* _{11}, and *γ* _{10}) was fixed to one condition and was not manipulated. Future studies can consider the magnitude of variability across cluster units as a design factor and systematically investigate the acceptable degree of bias of interindividual variability (i.e., *π* _{00}, *π* _{11}, and *π* _{10}) derived by model-based and design-based approaches.

## References

- Bandura, A. (1993). Perceived self-efficacy in cognitive develdevelopment and functioning.
*Educational Psychologist, 28*(2), 117–148.CrossRefGoogle Scholar - Baumert, J., Nagy, G., & Lehmann, R. (2012). Cumulative advantages and the emergence of social and ethnic inequality: Matthew effects in reading and mathematics development within elementary schools?
*Child Development, 83*(4), 1347–1367. doi: 10.1111/j.1467-8624.2012.01779.x CrossRefPubMedGoogle Scholar - Bovaird, J. A. (2007). Multilevel structural equation models for contextual factors. In T. D. Little, J. A. Bovaird, & N. A. Card (Eds.),
*Modeling contextual effects in longitudinal studies*. Mahwah: Lawrence Erlbaum Associates.Google Scholar - Chen, Q., Kwok, O.-M., Luo, W., & Willson, V. L. (2010). The impact of ignoring a level of nesting structure in multilevel growth mixture models: A Monte Carlo study.
*Structural Equation Modeling, 17*(4), 570–589. doi: 10.1080/10705511.2010.510046 CrossRefGoogle Scholar - Curran, P. J., Obeidat, K., & Losardo, D. (2010). Twelve frequently asked questions about growth curve modeling.
*Journal of Cognition and Development, 11*(2), 121–136. doi: 10.1080/15248371003699969 CrossRefPubMedPubMedCentralGoogle Scholar - Diallo, T. M. O., Morin, A. J. S., & Parker, P. D. (2014). Statistical power of latent growth curve models to detect quadratic growth.
*Behavior Research Methods, 46*(2), 357–371. doi: 10.3758/s13428-013-0395-1 CrossRefPubMedGoogle Scholar - Duncan, T. E., Duncan, S. C., Alpert, A., Hops, H., Stoolmiller, M., & Muthén, B. O. (1997). Latent variable modeling of longitudinal and multilevel substance use data.
*Multivariate Behavioral Research, 32*(3), 275–318. doi: 10.1207/s15327906mbr3203_3 CrossRefPubMedGoogle Scholar - Duncan, T. E., Duncan, S. C., & Strycker, L. A. (2006).
*An Introduction to Latent Variable Growth Curve Modeling Concepts, Issues, and Applications*(2nd ed.). Mahwah: Lawrence Erlbaum Associates.Google Scholar - Ferragut, M., Blanca, M. J., & Ortiz-Tallo, M. (2014). Psychological virtues during adolescence: A longitudinal study of gender differences.
*European Journal of Developmental Psychology, 11*(5), 521–531. doi: 10.1080/17405629.2013.876403 CrossRefGoogle Scholar - Grimm, K. J., Ram, N., & Estabrook, R. (2016).
*Growth modeling: Structural equation and multilevel modeling approaches*. New York: Guilford Press.Google Scholar - Heck, R. H., & Thomas, S. L. (2009).
*An introduction to multilevel modeling techniques*(2nd ed.). New York: Routledge.Google Scholar - Heeringa, S. G., West, B. T., & Berglund, P. A. (2010).
*Applied survey data analysis*. Boca Raton: Chapman & Hall/CRC.CrossRefGoogle Scholar - Hertzog, C., Oertzen, T. v., & Ghisletta, P. (2008). Evaluating the power of latent growth curve models to detect individual differences in change.
*Structural Equation Modeling, 15,*541–563. doi: 10.1080/10705510802338983 CrossRefGoogle Scholar - Hoogland, J. J., & Boomsma, A. (1998). Robustness studies in covariance structure modeling: An overview and a meta-analysis.
*Sociological Methods & Research, 26*(3), 329–367. doi: 10.1177/0049124198026003003 CrossRefGoogle Scholar - Hox, J. J. (2010).
*Multilevel Analysis Techniques and Applications*(2nd ed.). New York: Routledge.Google Scholar - Hox, J. J., & Maas, C. J. M. (2001). The accuracy of multilevel structural equation modeling with pseudobalanced groups and small samples.
*Structural Equation Modeling, 8*(2), 157–174. doi: 10.1207/S15328007SEM0802_1 CrossRefGoogle Scholar - Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives.
*Structural Equation Modeling, 6*(1), 1–55.CrossRefGoogle Scholar - Ingels, S. J., Pratt, D. J., Rogers, J. E., Siegel, P. H., & Stutts, E. S. (2005).
*Education Longitudinal Study of 2002: Base-year to first follow-up data file documentation (NCES 2006–344)*. Retrieved from Washington, DC.Google Scholar - Julian, M. W. (2001). The consequences of ignoring multilevel data structures in nonhierarchical covariance modeling.
*Structural Equation Modeling, 8*(3), 325–352. doi: 10.1207/S15328007SEM0803_1 CrossRefGoogle Scholar - Kwok, O., West, S. G., & Green, S. B. (2007). The impact of misspecifying the within-subject covariance structure in multiwave longitudinal multilevel models: A Monte Carlo study.
*Multivariate Behavioral Research, 42*(3), 557–592. doi: 10.1080/00273170701540537 CrossRefGoogle Scholar - Lai, M. H. C., & Kwok, O. (2015). Examining the rule of thumb of not using multilevel modeling: The “design effect smaller than two” rule.
*The Journal of Experimental Education, 83*(3), 423–438. doi: 10.1080/00220973.2014.907229 CrossRefGoogle Scholar - Luo, W., & Kwok, O. (2009). The impacts of ignoring a crossed factor in analyzing cross-classified data.
*Multivariate Behavioral Research, 44*(2), 182–212. doi: 10.1080/00273170902794214 CrossRefPubMedGoogle Scholar - Ma, X., & Ma, L. (2004). Modeling stability of growth between mathematics and science achievement during middle and high school.
*Evaluation Review, 28*(2), 104–122. doi: 10.1177/0193841X03261025 CrossRefPubMedGoogle Scholar - Ma, X., & Wilkins, J. M. (2007). Mathematics coursework regulates growth in mathematics achievement.
*Journal for Research in Mathematics Education, 38*(3), 230–257.Google Scholar - Meyers, J. L., & Beretvas, N. (2006). The impact of inappropriate modeling of cross-classified data structures.
*Multivariate Behavioral Research, 41*(4), 473–497. doi: 10.1207/s15327906mbr4104_3 CrossRefPubMedGoogle Scholar - Miller, J. D., Kimmel, L., Hoffer, T. B., & Nelson, C. (2000).
*Longitudinal study of American youth: User's manual*. Evanston: Northwestern University, International Center for the Advancement of Scientific Literacy.Google Scholar - Moerbeek, M. (2004). The consequence of ignoring a level of nesting in multilevel analysis.
*Multivariate Behavioral Research, 39*(1), 129–149. doi: 10.1207/s15327906mbr3901_5 CrossRefPubMedGoogle Scholar - Muthén, B. O. (1997). Latent variable growth modeling with multilevel data. In M. Berkane (Ed.),
*Latent variable modeling and applications to causality*(pp. 149-161). New York, NY.Google Scholar - Muthén, B. O. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (Ed.),
*Handbook of Quantitative Methodology for the Social Sciences*(pp. 345–368). Newbury Park: Sage Publications.Google Scholar - Muthén, B. O., & Asparouhov, T. (2009). Beyond multilevel regression modeling: Multilevel analysis in a general latent variable framework. In J. Hox & J. K. Roberts (Eds.),
*Handbook of Advanced Multilevel Analysis*. New York: Routledge.Google Scholar - Muthén, B. O., & Asparouhov, T. (2011). Beyond multilevel regression modeling: Multilevel analysis in a general latent variable framework. In J. Hox & J. K. Roberts (Eds.),
*The Handbook of Advanced Multilevel Analysis*. New York: Routledge.Google Scholar - Muthén, B. O., & Curran, P. J. (1997). General longitudinal modeling of individual differences in experimental designs: A latent variable framework for analysis and power estimation.
*Psychological Methods, 2*(4), 371–402. doi: 10.1037/1082-989X.2.4.371 CrossRefGoogle Scholar - Muthén, B. O., & Satorra, A. (1995). Complex sample data in structural equation modeling.
*Sociological Methodology, 25,*267–316. doi: 10.2307/271070 CrossRefGoogle Scholar - Muthén, L. K., & Muthén, B. O. (1998-2015).
*Mplus user’s guild*(7th ed.). Los Angeles, CA: Muthén & Muthén.Google Scholar - Muthén, L. K., & Muthén, B. O. (2002). How to use a Monte Carlo study to decide on sample size and determine power.
*Structural Equation Modeling, 9*(4), 599–620.
- Pfost, M., Hattie, J., Dörfler, T., & Artelt, C. (2014). Individual differences in reading development: A review of 25 years of empirical research on Matthew Effects in reading.
*Review of Educational Research, 84*(2), 203–244. doi: 10.3102/0034654313509492
- Pornprasertmanit, S., Lee, J., & Preacher, K. J. (2014). Ignoring clustering in confirmatory factor analysis: Some consequences for model fit and standardized parameter estimates.
*Multivariate Behavioral Research, 49*, 518–543. doi: 10.1080/00273171.2014.933762
- Raudenbush, S. W., & Bryk, A. S. (2002).
*Hierarchical Linear Models: Applications and Data Analysis Methods* (2nd ed.). Thousand Oaks: Sage.
- Schaie, K. W. (1983).
*Longitudinal studies of adult psychological development*. New York: Guilford Press.
- Singer, J. D., & Willett, J. B. (2003).
*Applied longitudinal data analysis: Modeling change and event occurrence*. New York: Oxford University Press.
- Snijders, T. A. B., & Bosker, R. J. (2012).
*Multilevel analysis: An introduction to basic and advanced multilevel modeling* (2nd ed.). Thousand Oaks: Sage.
- Stapleton, L. M. (2006). Using multilevel structural equation modeling techniques with complex sample data. In G. R. Hancock & R. O. Mueller (Eds.),
*Structural Equation Modeling: A Second Course*. Greenwich: Information Age.
- Stapleton, L. M. (2008). Variance estimation using replication methods in structural equation modeling with complex sample data.
*Structural Equation Modeling, 15*(2), 183–210. doi: 10.1080/10705510801922316
- Wu, J.-Y., & Kwok, O. (2012). Using SEM to analyze complex survey data: A comparison between design-based single-level and model-based multilevel approaches.
*Structural Equation Modeling, 19*(1), 16–35. doi: 10.1080/10705511.2012.634703
- Wu, J.-Y., Kwok, O., & Willson, V. L. (2015). Using design-based latent growth curve modeling with cluster-level predictor to address dependency.
*Journal of Experimental Education, 82*(4), 431–454. doi: 10.1080/00220973.2013.876226
- Zhang, Z., & Wang, L. (2009). Statistical power analysis for growth curve models using SAS.
*Behavior Research Methods, 41*(4), 1083–1094. doi: 10.3758/BRM.41.4.1083