Frequentist Model Averaging in Structural Equation Modelling

Psychometrika

Abstract

Model selection from a set of candidate models plays an important role in many structural equation modelling applications. However, traditional model selection methods introduce extra randomness that is not accounted for by post-model selection inference. In the current study, we propose a model averaging technique within the frequentist statistical framework. Instead of selecting an optimal model, the contributions of all candidate models are acknowledged. Valid confidence intervals and a \(\chi ^2\) test statistic are proposed. A simulation study shows that the proposed method is able to produce a robust mean-squared error, a better coverage probability, and a better goodness-of-fit test compared to model selection. It is an interesting compromise between model selection and the full model.

References

  • Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov & F. Csaki (Eds.), Second international symposium on information theory. Budapest: Akademiai Kiado.

  • Ankargren, S., & Jin, S. (2018). On the least squares model averaging interval estimator. Communications in Statistics: Theory and Methods, 47, 118–132.

  • Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238–246.

  • Berk, R., Brown, L., Buja, A., Zhang, K., & Zhao, L. (2013). Valid post-selection inference. Annals of Statistics, 41, 802–837.

  • Browne, M. W. (1974). Generalized least squares estimators in the analysis of covariance structures. South African Statistical Journal, 8, 1–24. Reprinted in 1977 in D. J. Aigner & A. S. Goldberger (Eds.). Latent variables in socioeconomic models (pp. 205–226). Amsterdam: North Holland.

  • Browne, M. W. (1984). Asymptotically distribution-free methods in the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37, 62–83.

  • Browne, M. W. (1987). Robustness of statistical inference in factor analysis and related models. Biometrika, 74, 375–384.

  • Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136–161). Newbury Park: Sage.

  • Buckland, S. T., Burnham, K. P., & Augustin, N. H. (1997). Model selection: An integral part of inference. Biometrics, 53, 603–618.

  • Burnham, K. P., & Anderson, D. R. (2004). Multimodel inference: Understanding AIC and BIC in model selection. Sociological Methods & Research, 33, 261–304.

  • Charkhi, A., Claeskens, G., & Hansen, B. E. (2016). Minimum mean square error model averaging in likelihood models. Statistica Sinica, 26, 809–840.

  • Fletcher, D., & Dillingham, P. W. (2011). Model-averaged confidence intervals for factorial experiments. Computational Statistics & Data Analysis, 55, 3041–3048.

  • Fletcher, D., & Turek, D. (2011). Model-averaged profile likelihood intervals. Journal of Agricultural, Biological and Environmental Statistics, 17, 38–51.

  • Hansen, B. E. (2007). Least squares model averaging. Econometrica, 75, 1175–1189.

  • Hansen, B. E. (2014). Model averaging, asymptotic risk, and regression groups. Quantitative Economics, 5, 495–530.

  • Hansen, B. E., & Racine, J. S. (2012). Jackknife model averaging. Journal of Econometrics, 167, 38–46.

  • Hjort, N. L., & Claeskens, G. (2003a). Frequentist model average estimators. Journal of the American Statistical Association, 98, 879–899.

  • Hjort, N. L., & Claeskens, G. (2003b). Rejoinder. Journal of the American Statistical Association, 98, 938–945.

  • Hoeting, J. A., Madigan, D., Raftery, A. E., & Volinsky, C. T. (1999). Bayesian model averaging: A tutorial (with discussion). Statistical Science, 14, 382–417.

  • Ishwaran, H., & Rao, J. S. (2003). Discussion. Journal of the American Statistical Association, 98, 922–925.

  • Kabaila, P. (1995). The effect of model selection on confidence regions and prediction regions. Econometric Theory, 11, 537–549.

  • Kabaila, P., & Leeb, H. (2006). On the large-sample minimal coverage probability of confidence intervals after model selection. Journal of the American Statistical Association, 101, 619–629.

  • Kabaila, P., Welsh, A. H., & Abeysekera, W. (2016). Model-averaged confidence intervals. Scandinavian Journal of Statistics, 43, 35–48.

  • Karatzoglou, A., Smola, A., Hornik, K., & Zeileis, A. (2004). kernlab—An S4 package for kernel methods in R. Journal of Statistical Software, 11, 1–20.

  • Kember, D., & Leung, D. Y. P. (2009). Development of a questionnaire for assessing students’ perceptions of the teaching and learning environment and its use in quality assurance. Learning Environments Research, 12, 15–29.

  • Kember, D., & Leung, D. Y. P. (2011). Disciplinary differences in student rating of teaching quality. Research in Higher Education, 52, 278–299.

  • Kim, J., & Pollard, D. (1990). Cube root asymptotics. The Annals of Statistics, 18, 191–219.

  • Knight, K., & Fu, W. (2000). Asymptotics for lasso-type estimators. The Annals of Statistics, 28, 1356–1378.

  • Lee, W. W. S., Leung, D. Y. P., & Lo, K. C. H. (2013). Development of generic capabilities in teaching and learning environment at associate degree level. In M. S. Khine (Ed.), Application of structural equation modeling in educational research and practice (pp. 169–184). Rotterdam: Sense Publishers.

  • Leeb, H., & Pötscher, B. M. (2005). Model selection and inference: Facts and fiction. Econometric Theory, 21, 21–59.

  • Liu, C.-A. (2015). Distribution theory of the least squares averaging estimator. Journal of Econometrics, 186, 142–159.

  • Liu, C.-A., & Kuo, B.-S. (2016). Model averaging in predictive regressions. The Econometrics Journal, 19, 203–231.

  • Liu, Q., & Okui, R. (2013). Heteroscedasticity-robust \(C_p\) model averaging. The Econometrics Journal, 16, 463–472.

  • MacCallum, R. C. (1986). Specification searches in covariance structure modeling. Quantitative Methods in Psychology, 100, 107–120.

  • MacCallum, R. C., Roznowski, M., & Necowitz, L. B. (1992). Model modifications in covariance structure analysis: The problem of capitalization on chance. Quantitative Methods in Psychology, 111, 490–504.

  • Madigan, D., & Raftery, A. E. (1994). Model selection and accounting for model uncertainty in graphical models using Occam’s window. Journal of the American Statistical Association, 89, 1535–1546.

  • Magnus, J. R., & Neudecker, H. (1986). Symmetry, 0–1 matrices and Jacobians: A review. Econometric Theory, 2, 157–190.

  • Muthén, B. (1978). Contributions to factor analysis of dichotomous variables. Psychometrika, 43, 551–560.

  • Muthén, B., du Toit, S. H. C., & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Retrieved from https://www.statmodel.com/download/Article_075.pdf. Accessed 12 Sept 2013.

  • Pötscher, B. M. (1991). Effects of model selection on inference. Econometric Theory, 7, 163–185.

  • Raftery, A. E., & Zheng, Y. (2003). Discussion: Performance of Bayesian model averaging. Journal of the American Statistical Association, 98, 931–938.

  • Saris, W. E., Satorra, A., & Sörbom, D. (1987). The detection and correction of specification errors in structural equation models. Sociological Methodology, 17, 105–129.

  • Schomaker, M., & Heumann, C. (2011). Model averaging in factor analysis: An analysis of Olympic decathlon data. Journal of Quantitative Analysis in Sports, 7, Article 4.

  • Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.

  • Sörbom, D. (1989). Model modification. Psychometrika, 54, 371–384.

  • Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38, 1–10.

  • Turek, D., & Fletcher, D. (2012). Model-averaged Wald confidence intervals. Computational Statistics & Data Analysis, 56, 2809–2815.

  • Turlach, B. A., & Weingessel, A. (2013). quadprog: Functions to solve quadratic programming problems. R package version 1.5-5.

  • Vanderbei, R. J. (1999). LOQO: An interior point code for quadratic programming. Optimization Methods and Software, 11, 451–484.

  • Wang, H., & Zhou, S. Z. F. (2013). Interval estimation by frequentist model averaging. Communications in Statistics: Theory and Methods, 42, 4342–4356.

Acknowledgements

We would like to thank the reviewers for providing valuable comments. Shaobo Jin was partly supported by Vetenskapsrådet (Swedish Research Council) under the contract 2017-01175.

Author information

Corresponding author

Correspondence to Shaobo Jin.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (R 37 KB)

Appendix

Proof of Theorem 1

From the distribution (6), \(\varvec{V}\), \(\varvec{W}\), \(\varvec{K}\), and \(\varvec{K}_{s}\) can be consistently estimated from the full model. \(\varvec{\delta }\varvec{\delta }^{T}\) can be estimated from the full model by \(\hat{\varvec{\delta }}_\mathrm{full}\hat{\varvec{\delta }}_\mathrm{full}^T - 4\hat{\varvec{K}}\) or \(\hat{\varvec{\delta }}_\mathrm{full}\hat{\varvec{\delta }}_\mathrm{full}^T\), where

$$\begin{aligned} \hat{\varvec{\delta }}_\mathrm{full} = \sqrt{n} \left( \hat{\varvec{\gamma }}_\mathrm{full} - \varvec{\gamma }_0 \right) \overset{d}{\rightarrow } \varvec{D}-\varvec{\delta } = 2\varvec{K}\varvec{N}-2\varvec{K}\varvec{J}_{10}\varvec{J}_{00}^{-1}\varvec{M} \sim N\left( \varvec{0},4\varvec{K}\right) \end{aligned}$$

(Hjort & Claeskens, 2003a). Thus, \(Q\left( \{c_{s}\}\right) \overset{d}{\rightarrow } Q^*\left( \{c_{s}\}\right) \) for some \(Q^*\left( \{c_{s}\}\right) \). The quadratic program that yields the model weights is a strictly convex minimization problem when the matrix of its quadratic form is positive definite, which implies that \(Q^{*}\left( \{c_{s}\}\right) \) has a unique minimizer. Thus, \(\left\{ {\hat{c}}_{s}\right\} \) converges in distribution to \(\left\{ c_{s}^{*}\right\} \), the minimizer of \(Q^{*}\left( \{c_{s}\}\right) \) (Kim & Pollard, 1990), and the distribution of \(Q^{*}\left( \{c_{s}\}\right) \) depends on \(\varvec{M}\) and \(\varvec{N}\). Note that the distribution of \(\sqrt{n}\left( \hat{\varvec{\mu }}-\varvec{\mu }_\mathrm{true}\right) \) also depends on \(\varvec{M}\) and \(\varvec{N}\), as shown in Eq. (9). Thus, \(\left\{ {\hat{c}}_{s}\right\} \) and \(\sqrt{n}\left( \hat{\varvec{\mu }} \left( \{{\hat{c}}_{s}\} \right) -\varvec{\mu }_\mathrm{true}\right) \) converge jointly in distribution. Therefore,

$$\begin{aligned} \sqrt{n}\left( \hat{\varvec{\mu }} \left( \{{\hat{c}}_{s}\} \right) -\varvec{\mu }_\mathrm{true}\right) \overset{d}{\rightarrow } 2\frac{\partial \varvec{\mu }}{\partial \varvec{\theta }^{T}}\varvec{J}_{00}^{-1}\varvec{M}+\varvec{W}\left[ \varvec{\delta }- \left( \sum _{s}c_{s}^*\varvec{\pi }_{s}^{T}\varvec{K}_{s}\varvec{\pi }_{s} \right) \varvec{K}^{-1}\varvec{D} \right] . \end{aligned}$$

\(\square \)
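
The uniqueness argument used in this proof can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes the criterion has the form \(Q\left( \{c_{s}\}\right) = \varvec{c}^{T}\varvec{H}\varvec{c}\) for a symmetric positive definite \(\varvec{H}\), and that the weights are nonnegative and sum to one. The matrix `H`, the constraint set, and the projected gradient solver are all assumptions made for illustration (the paper's references point to quadratic programming solvers such as quadprog instead).

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {c : c >= 0, sum(c) = 1}."""
    u = np.sort(v)[::-1]                      # entries in decreasing order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index that stays positive
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def minimize_on_simplex(H, c0, n_iter=5000):
    """Projected gradient descent for min c' H c over the simplex."""
    # The gradient 2 H c is Lipschitz with constant 2 * lambda_max(H).
    step = 1.0 / (2.0 * np.linalg.eigvalsh(H)[-1])
    c = project_to_simplex(np.asarray(c0, dtype=float))
    for _ in range(n_iter):
        c = project_to_simplex(c - step * 2.0 * (H @ c))
    return c

# Hypothetical positive definite matrix for three candidate models.
H = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 3.0]])

# Two very different starting points (all weight on model 1 vs. model 3).
c_a = minimize_on_simplex(H, [1.0, 0.0, 0.0])
c_b = minimize_on_simplex(H, [0.0, 0.0, 1.0])
print(c_a, c_b)  # strict convexity: both runs should agree
```

Because the objective is strictly convex and the feasible set is convex, the starting point does not affect the solution, which is the property the proof relies on when it identifies \(\left\{ c_{s}^{*}\right\} \) as the unique minimizer.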

Proof of Eq. (15)

Define the duplication matrix \(\varvec{P}\) such that \(\mathrm{vec}\left( \varvec{S}\right) =\varvec{P}\text {vech}\left( \varvec{S}\right) \), where \(\text {vech}\left( \cdot \right) \) stacks the lower triangular elements (including the diagonal) of the enclosed symmetric matrix. It is known from Browne (1974, 1987) that

$$\begin{aligned} \sqrt{n}\text {vech}\left( \varvec{S}-\varvec{\Sigma }_\mathrm{true}\right)&\overset{d}{\rightarrow }N\left( \varvec{0},\varvec{\Omega }\right) , \end{aligned}$$

where \(\varvec{P}^{+}=\left( \varvec{P}^{T}\varvec{P}\right) ^{-1}\varvec{P}^{T}\) and \(\varvec{\Omega }=2\varvec{P}^{+}\left( \varvec{\Sigma }_{0}\otimes \varvec{\Sigma }_{0}\right) \varvec{P}^{+T}\). Consider the symmetric matrix

$$\begin{aligned} \varvec{A}&= \frac{1}{2}\varvec{\Omega }^{1/2}\varvec{P}^{T}\left[ \left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) -\varvec{G}\left( \varvec{\mu }_{0}\right) \right] \varvec{P}\varvec{\Omega }^{1/2}. \end{aligned}$$

Lemma 14 in Magnus and Neudecker (1986) indicates that

$$\begin{aligned} \varvec{\Omega }&= 2\left[ \varvec{P}^{T}\left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) \varvec{P}\right] ^{-1}. \end{aligned}$$
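
The definition of \(\varvec{P}\) and the two expressions for \(\varvec{\Omega }\) can be checked numerically. The sketch below is only an illustration: the \(3\times 3\) matrix `Sigma` is a hypothetical stand-in for \(\varvec{\Sigma }_{0}\), and the helper names `duplication_matrix` and `vech` are not taken from the paper.

```python
import numpy as np

def duplication_matrix(p):
    """Return P of shape (p^2, p(p+1)/2) with vec(S) = P @ vech(S)."""
    P = np.zeros((p * p, p * (p + 1) // 2))
    col = 0
    for j in range(p):            # walk the lower triangle column by column
        for i in range(j, p):
            P[j * p + i, col] = 1.0   # position of S[i, j] in column-major vec(S)
            P[i * p + j, col] = 1.0   # position of S[j, i] in column-major vec(S)
            col += 1
    return P

def vech(S):
    """Stack the lower triangular part of S, diagonal included, by columns."""
    return np.concatenate([S[j:, j] for j in range(S.shape[0])])

# Hypothetical positive definite covariance matrix standing in for Sigma_0.
Sigma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.5, 0.2],
                  [0.1, 0.2, 1.0]])
P = duplication_matrix(Sigma.shape[0])
P_plus = np.linalg.inv(P.T @ P) @ P.T
Sigma_inv = np.linalg.inv(Sigma)

# Omega as defined via P^+ and as rewritten through Lemma 14.
Omega_def = 2.0 * P_plus @ np.kron(Sigma, Sigma) @ P_plus.T
Omega_lemma = 2.0 * np.linalg.inv(P.T @ np.kron(Sigma_inv, Sigma_inv) @ P)

print(np.allclose(P @ vech(Sigma), Sigma.flatten(order="F")))
print(np.allclose(Omega_def, Omega_lemma))
```

Both comparisons should agree up to floating-point error for any symmetric positive definite `Sigma`.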

Lemma 11 in Magnus and Neudecker (1986) indicates that

$$\begin{aligned}&\varvec{\Delta }^{T}\left( \varvec{\mu }_{0}\right) \left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) \varvec{P}\varvec{\Omega }\varvec{P}^{T}\left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) \varvec{P} = 2\varvec{\Delta }^{T}\left( \varvec{\mu }_{0}\right) \left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) \varvec{P}. \end{aligned}$$
(17)

Consequently, \(\varvec{A}\) can be shown to be idempotent. Lemma 14 in Magnus and Neudecker (1986) also implies that \(\left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) \varvec{P}\varvec{\Omega }\varvec{P}^{T}= 2\varvec{P}\left( \varvec{P}^{T}\varvec{P}\right) ^{-1}\varvec{P}^{T}\). Then, it can be shown that \(\text {tr}\left( \varvec{A}\right) = \left( p_x+p_y\right) \left( p_x+p_y+1\right) /2-r\). Therefore, Eq. (15) holds. \(\square \)
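
The idempotency and trace claims can also be verified numerically. The sketch below rests on an assumption that is not restated in this appendix: it takes \(\varvec{G} = \varvec{W}\varvec{\Delta }\left( \varvec{\Delta }^{T}\varvec{W}\varvec{\Delta }\right) ^{-1}\varvec{\Delta }^{T}\varvec{W}\) with \(\varvec{W}=\varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\), the usual form in covariance structure analysis. The dimensions `p` (playing the role of \(p_x+p_y\)) and `r`, the covariance matrix `Sigma0`, and the Jacobian `Delta` are hypothetical values chosen only for illustration.

```python
import numpy as np

def duplication_matrix(p):
    """Return P of shape (p^2, p(p+1)/2) with vec(S) = P @ vech(S)."""
    P = np.zeros((p * p, p * (p + 1) // 2))
    col = 0
    for j in range(p):
        for i in range(j, p):
            P[j * p + i, col] = 1.0
            P[i * p + j, col] = 1.0
            col += 1
    return P

def sym_sqrt(M):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, U = np.linalg.eigh(M)
    return (U * np.sqrt(w)) @ U.T

rng = np.random.default_rng(0)
p, r = 3, 4                        # hypothetical: 3 observed variables, 4 free parameters
p_star = p * (p + 1) // 2
P = duplication_matrix(p)

B = rng.normal(size=(p, p))
Sigma0 = B @ B.T + p * np.eye(p)   # hypothetical positive definite Sigma_0
Sigma0_inv = np.linalg.inv(Sigma0)
W = np.kron(Sigma0_inv, Sigma0_inv)

# Hypothetical Jacobian of vec(Sigma) w.r.t. the parameters; it lies in the
# column space of P because every derivative of a symmetric matrix is symmetric.
Delta = P @ rng.normal(size=(p_star, r))

# ASSUMED form of G (see the lead-in); not quoted from the paper.
G = W @ Delta @ np.linalg.inv(Delta.T @ W @ Delta) @ Delta.T @ W

Omega = 2.0 * np.linalg.inv(P.T @ W @ P)
Omega_half = sym_sqrt(Omega)
A = 0.5 * Omega_half @ P.T @ (W - G) @ P @ Omega_half

print(np.allclose(A @ A, A))          # idempotency check
print(np.trace(A), p_star - r)        # trace versus p(p+1)/2 - r
```

Under the assumed form of `G`, `A` is an identity matrix minus a rank-`r` projection, which is why its trace equals the degrees of freedom of the \(\chi ^2\) reference distribution.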

Proof of Theorem 3

For the full model, \(\partial F\left( \hat{\varvec{\beta }}_\mathrm{full}\right) /\partial \varvec{\beta }=\varvec{0}\) and

$$\begin{aligned} F\left( \hat{\varvec{\beta }}_\mathrm{full}\right)&= F\left( \varvec{\beta }_\mathrm{true}\right) -\frac{n}{4}\left( \varvec{\beta }_\mathrm{true}-\hat{\varvec{\beta }}_\mathrm{full}\right) ^{T}\varvec{J}_\mathrm{full}\left( \varvec{\beta }_\mathrm{true}-\hat{\varvec{\beta }}_\mathrm{full}\right) +o_{p}\left( 1\right) . \end{aligned}$$

The distribution (6) indicates that

$$\begin{aligned} \sqrt{n}\left( \hat{\varvec{\beta }}_\mathrm{full}-\varvec{\beta }_\mathrm{true}\right)&= 2\varvec{J}_\mathrm{full} \varvec{\Delta }_{0}^{T}\left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) \sqrt{n}\mathrm{vec}\left( \varvec{S}-\varvec{\Sigma }_\mathrm{true}\right) . \end{aligned}$$

Thus, Eq. (12) becomes

$$\begin{aligned} F\left( \hat{\varvec{\beta }}_\mathrm{full}\right)&= \frac{n}{2}\mathrm{vec}^{T}\left( \varvec{S}-\varvec{\Sigma }_\mathrm{true}\right) \left[ \left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) - \varvec{G}\left( \varvec{\beta }_0 \right) \right] \mathrm{vec}\left( \varvec{S}-\varvec{\Sigma }_\mathrm{true}\right) + o_{p}\left( 1\right) \\&= F_{1} + o_{p}\left( 1\right) , \end{aligned}$$

which is asymptotically the same as the FMA test statistic. \(\square \)

Cite this article

Jin, S., Ankargren, S. Frequentist Model Averaging in Structural Equation Modelling. Psychometrika 84, 84–104 (2019). https://doi.org/10.1007/s11336-018-9624-y

  • DOI: https://doi.org/10.1007/s11336-018-9624-y
