Abstract
Model selection from a set of candidate models plays an important role in many structural equation modelling applications. However, traditional model selection methods introduce extra randomness that is not accounted for by post-model-selection inference. In the current study, we propose a model averaging technique within the frequentist statistical framework. Instead of selecting an optimal model, the contributions of all candidate models are acknowledged. Valid confidence intervals and a \(\chi ^2\) test statistic are proposed. A simulation study shows that the proposed method produces a robust mean-squared error, a better coverage probability, and a better goodness-of-fit test compared to model selection. It is an interesting compromise between model selection and the full model.
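To make the idea concrete, a frequentist model-averaged point estimate is a convex combination of the candidate models' estimates of a focus parameter. The sketch below is purely illustrative: the function name, candidate estimates, and weights are hypothetical, and in the paper the weights come from a quadratic program rather than being fixed by hand.

```python
# Minimal sketch of a frequentist model-averaged (FMA) point estimate:
# the focus parameter is estimated under every candidate model and the
# estimates are combined with non-negative weights summing to one.
# All numbers below are hypothetical, not taken from the paper.

def model_average(estimates, weights):
    """Weighted combination of per-model estimates of a focus parameter."""
    if abs(sum(weights) - 1.0) > 1e-9 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative and sum to one")
    return sum(w * est for w, est in zip(weights, estimates))

# Three candidate SEM specifications, each giving an estimate of the
# same focus parameter (hypothetical values):
estimates = [0.42, 0.47, 0.51]
weights = [0.2, 0.5, 0.3]   # in the paper, obtained from a quadratic program
mu_fma = model_average(estimates, weights)
print(mu_fma)               # ~ 0.472
```

Unlike selecting a single best model, every candidate contributes to the final estimate in proportion to its weight.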
References
Akaike, H. (1973). Information theory as an extension of the maximum likelihood principle. In B. N. Petrov & F. Csaki (Eds.), Second international symposium on information theory. Budapest: Akademiai Kiado.
Ankargren, S., & Jin, S. (2018). On the least squares model averaging interval estimator. Communications in Statistics: Theory and Methods, 47, 118–132.
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238–246.
Berk, R., Brown, L., Buja, A., Zhang, K., & Zhao, L. (2013). Valid post-selection inference. Annals of Statistics, 41, 802–837.
Browne, M. W. (1974). Generalized least squares estimators in the analysis of covariance structures. South African Statistical Journal, 8, 1–24. Reprinted in 1977 in D. J. Aigner & A. S. Goldberger (Eds.). Latent variables in socioeconomic models (pp. 205–226). Amsterdam: North Holland.
Browne, M. W. (1984). Asymptotically distribution-free methods in the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37, 62–83.
Browne, M. W. (1987). Robustness of statistical inference in factor analysis and related models. Biometrika, 74, 375–384.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136–161). Newbury Park: Sage.
Buckland, S. T., Burnham, K. P., & Augustin, N. H. (1997). Model selection: An integral part of inference. Biometrics, 53, 603–618.
Burnham, K. P., & Anderson, D. R. (2004). Multimodel inference: Understanding AIC and BIC in model selection. Sociological Methods & Research, 33, 261–304.
Charkhi, A., Claeskens, G., & Hansen, B. E. (2016). Minimum mean square error model averaging in likelihood models. Statistica Sinica, 26, 809–840.
Fletcher, D., & Dillingham, P. W. (2011). Model-averaged confidence intervals for factorial experiments. Computational Statistics & Data Analysis, 55, 3041–3048.
Fletcher, D., & Turek, D. (2011). Model-averaged profile likelihood intervals. Journal of Agricultural, Biological and Environmental Statistics, 17, 38–51.
Hansen, B. E. (2007). Least squares model averaging. Econometrica, 75, 1175–1189.
Hansen, B. E. (2014). Model averaging, asymptotic risk, and regression groups. Quantitative Economics, 5, 495–530.
Hansen, B. E., & Racine, J. S. (2012). Jackknife model averaging. Journal of Econometrics, 167, 38–46.
Hjort, N. L., & Claeskens, G. (2003a). Frequentist model average estimators. Journal of the American Statistical Association, 98, 879–899.
Hjort, N. L., & Claeskens, G. (2003b). Rejoinder. Journal of the American Statistical Association, 98, 938–945.
Hoeting, J. A., Madigan, D., Raftery, A. E., & Volinsky, C. T. (1999). Bayesian model averaging: A tutorial (with discussion). Statistical Science, 14, 382–417.
Ishwaran, H., & Rao, J. S. (2003). Discussion. Journal of the American Statistical Association, 98, 922–925.
Kabaila, P. (1995). The effect of model selection on confidence regions and prediction regions. Econometric Theory, 11, 537–549.
Kabaila, P., & Leeb, H. (2006). On the large-sample minimal coverage probability of confidence intervals after model selection. Journal of the American Statistical Association, 101, 619–629.
Kabaila, P., Welsh, A. H., & Abeysekera, W. (2016). Model-averaged confidence intervals. Scandinavian Journal of Statistics, 43, 35–48.
Karatzoglou, A., Smola, A., Hornik, K., & Zeileis, A. (2004). kernlab—An S4 package for kernel methods in R. Journal of Statistical Software, 11, 1–20.
Kember, D., & Leung, D. Y. P. (2009). Development of a questionnaire for assessing students’ perceptions of the teaching and learning environment and its use in quality assurance. Learning Environments Research, 12, 15–29.
Kember, D., & Leung, D. Y. P. (2011). Disciplinary differences in student rating of teaching quality. Research in Higher Education, 52, 278–299.
Kim, J., & Pollard, D. (1990). Cube root asymptotics. The Annals of Statistics, 18, 191–219.
Knight, K., & Fu, W. (2000). Asymptotics for lasso-type estimators. The Annals of Statistics, 28, 1356–1378.
Lee, W. W. S., Leung, D. Y. P., & Lo, K. C. H. (2013). Development of generic capabilities in teaching and learning environment at associate degree level. In M. S. Khine (Ed.), Application of structural equation modeling in educational research and practice (pp. 169–184). Rotterdam: Sense Publishers.
Leeb, H., & Pötscher, B. M. (2005). Model selection and inference: Facts and fiction. Econometric Theory, 21, 21–59.
Liu, C.-A. (2015). Distribution theory of the least squares averaging estimator. Journal of Econometrics, 186, 142–159.
Liu, C.-A., & Kuo, B.-S. (2016). Model averaging in predictive regressions. The Econometrics Journal, 19, 203–231.
Liu, Q., & Okui, R. (2013). Heteroscedasticity-robust \(C_p\) model averaging. The Econometrics Journal, 16, 463–472.
MacCallum, R. C. (1986). Specification searches in covariance structure modeling. Psychological Bulletin, 100, 107–120.
MacCallum, R. C., Roznowski, M., & Necowitz, L. B. (1992). Model modifications in covariance structure analysis: The problem of capitalization on chance. Psychological Bulletin, 111, 490–504.
Madigan, D., & Raftery, A. E. (1994). Model selection and accounting for model uncertainty in graphical models using Occam’s window. Journal of the American Statistical Association, 89, 1535–1546.
Magnus, J. R., & Neudecker, H. (1986). Symmetry, 0–1 matrices and Jacobians: A review. Econometric Theory, 2, 157–190.
Muthén, B. (1978). Contributions to factor analysis of dichotomous variables. Psychometrika, 43, 551–560.
Muthén, B., du Toit, S. H. C., & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Retrieved from https://www.statmodel.com/download/Article_075.pdf. Accessed 12 Sept 2013.
Pötscher, B. M. (1991). Effects of model selection on inference. Econometric Theory, 7, 163–185.
Raftery, A. E., & Zheng, Y. (2003). Discussion: Performance of Bayesian model averaging. Journal of the American Statistical Association, 98, 931–938.
Saris, W. E., Satorra, A., & Sörbom, D. (1987). The detection and correction of specification errors in structural equation models. Sociological Methodology, 17, 105–129.
Schomaker, M., & Heumann, C. (2011). Model averaging in factor analysis: An analysis of Olympic decathlon data. Journal of Quantitative Analysis in Sports, 7, Article 4.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.
Sörbom, D. (1989). Model modification. Psychometrika, 54, 371–384.
Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38, 1–10.
Turek, D., & Fletcher, D. (2012). Model-averaged Wald confidence intervals. Computational Statistics & Data Analysis, 56, 2809–2815.
Turlach, B. A., & Weingessel, A. (2013). quadprog: Functions to solve quadratic programming problems. R package version 1.5-5.
Vanderbei, R. J. (1999). LOQO: An interior point code for quadratic programming. Optimization Methods and Software, 11, 451–484.
Wang, H., & Zhou, S. Z. F. (2013). Interval estimation by frequentist model averaging. Communications in Statistics: Theory and Methods, 42, 4342–4356.
Acknowledgements
We would like to thank the reviewers for providing valuable comments. Shaobo Jin was partly supported by Vetenskapsrådet (Swedish Research Council) under the contract 2017-01175.
Appendix
Proof of Theorem 1
From the distribution (6), \(\varvec{V}\), \(\varvec{W}\), \(\varvec{K}\), and \(\varvec{K}_{s}\) can be consistently estimated under the full model. The matrix \(\varvec{\delta }\varvec{\delta }^{T}\) can be estimated from the full model by either \(\hat{\varvec{\delta }}_\mathrm{full}\hat{\varvec{\delta }}_\mathrm{full}^T - 4\hat{\varvec{K}}\) or \(\hat{\varvec{\delta }}_\mathrm{full}\hat{\varvec{\delta }}_\mathrm{full}^T\), where
(Hjort & Claeskens, 2003a). Thus, \(Q\left( \{c_{s}\}\right) \overset{d}{\rightarrow } Q^*\left( \{c_{s}\}\right) \) for some limit \(Q^*\left( \{c_{s}\}\right) \). The quadratic program that yields the model weights is a strictly convex minimization problem whenever the matrix of the quadratic form is positive definite, so \(Q^{*}\left( \{c_{s}\}\right) \) has a unique minimizer. Thus, \(\left\{ {\hat{c}}_{s}\right\} \) converges in distribution to \(\left\{ c_{s}^{*}\right\} \), the minimizer of \(Q^{*}\left( \{c_{s}\}\right) \) (Kim & Pollard, 1990). The distribution of \(Q^{*}\left( \{c_{s}\}\right) \) depends on \(\varvec{M}\) and \(\varvec{N}\), and so does the distribution of \(\sqrt{n}\left( \hat{\varvec{\mu }}-\varvec{\mu }_\mathrm{true}\right) \), as shown in Eq. (9). Hence, \(\left\{ {\hat{c}}_{s}\right\} \) and \(\sqrt{n}\left( \hat{\varvec{\mu }} \left( \{{\hat{c}}_{s}\} \right) -\varvec{\mu }_\mathrm{true}\right) \) converge jointly in distribution. Therefore,
\(\square \)
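The weight-determining step in the proof above is a quadratic program over the probability simplex. The sketch below illustrates it under stated assumptions: the matrix `M` is a hypothetical positive definite matrix, not one derived from the paper, and projected gradient descent is used only to keep the example dependency-free (the paper relies on quadprog-style solvers such as the one cited in the references).

```python
# Illustration: minimize Q(c) = c' M c over the probability simplex
# (weights non-negative, summing to one). M is a hypothetical positive
# definite matrix; projected gradient descent stands in for a QP solver.

def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = sorted(v, reverse=True)
    cum, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cum += ui
        t = (cum - 1.0) / i
        if ui - t > 0:          # this coordinate stays active
            theta = t
    return [max(x - theta, 0.0) for x in v]

def solve_weights(M, iters=5000, step=0.01):
    """Projected gradient descent for min c' M c subject to c in the simplex."""
    n = len(M)
    c = [1.0 / n] * n           # start from uniform weights
    for _ in range(iters):
        grad = [2.0 * sum(M[i][j] * c[j] for j in range(n)) for i in range(n)]
        c = project_simplex([c[i] - step * grad[i] for i in range(n)])
    return c

# Hypothetical three-model example:
M = [[2.0, 0.5, 0.3],
     [0.5, 1.0, 0.2],
     [0.3, 0.2, 1.5]]
c_hat = solve_weights(M)
print(c_hat)  # non-negative weights summing to one
```

Because the criterion is strictly convex on the simplex when the matrix is positive definite, any such solver recovers the same unique minimizer, which is exactly the uniqueness invoked in the proof.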
Proof of Eq. (15)
Define the duplication matrix \(\varvec{P}\) such that \(\mathrm{vec}\left( \varvec{S}\right) =\varvec{P}\text {vech}\left( \varvec{S}\right) \), where \(\text {vech}\left( \cdot \right) \) stacks the elements on and below the diagonal of the enclosed symmetric matrix. It is known from Browne (1974, 1987) that
where \(\varvec{P}^{+}=\left( \varvec{P}^{T}\varvec{P}\right) ^{-1}\varvec{P}^{T}\) and \(\varvec{\Omega }=2\varvec{P}^{+}\left( \varvec{\Sigma }_{0}\otimes \varvec{\Sigma }_{0}\right) \varvec{P}^{+T}\). Consider the symmetric matrix
Lemma 14 in Magnus and Neudecker (1986) indicates that
Lemma 11 in Magnus and Neudecker (1986) indicates that
Consequently, \(\varvec{A}\) can be shown to be idempotent. Lemma 14 in Magnus and Neudecker (1986) also implies that \(\left( \varvec{\Sigma }_{0}^{-1}\otimes \varvec{\Sigma }_{0}^{-1}\right) \varvec{P}\varvec{\Omega }\varvec{P}^{T}= 2\varvec{P}\left( \varvec{P}^{T}\varvec{P}\right) ^{-1}\varvec{P}^{T}\). Then, it can be shown that \(\text {tr}\left( \varvec{A}\right) = \left( p_x+p_y\right) \left( p_x+p_y+1\right) /2-r\). Therefore, Eq. (15) holds. \(\square \)
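The duplication matrix used in this proof can be checked numerically. The sketch below is illustrative only: the dimension \(p = 3\) and the symmetric matrix `S` are arbitrary choices. It builds \(\varvec{P}\) column-major so that \(\mathrm{vec}(\varvec{S}) = \varvec{P}\,\mathrm{vech}(\varvec{S})\), and verifies that \(\varvec{P}^T\varvec{P}\) is diagonal with entries 1 and 2, which is what makes \(\varvec{P}^{+}=\left( \varvec{P}^{T}\varvec{P}\right) ^{-1}\varvec{P}^{T}\) immediate to compute.

```python
# Build the duplication matrix P of order p, defined by vec(S) = P vech(S)
# for any symmetric p x p matrix S. p = 3 and S are arbitrary illustrative
# choices; vec and vech are both column-major.

def vech_index(i, j, p):
    """Position of element (i, j), i >= j, in vech (column-major lower triangle)."""
    return sum(p - k for k in range(j)) + (i - j)

def duplication_matrix(p):
    m = p * (p + 1) // 2
    P = [[0] * m for _ in range(p * p)]
    for j in range(p):
        for i in range(p):
            row = j * p + i                           # position of (i, j) in vec
            col = vech_index(max(i, j), min(i, j), p) # shared by (i,j) and (j,i)
            P[row][col] = 1
    return P

p = 3
S = [[4.0, 1.0, 0.5],
     [1.0, 3.0, 0.2],
     [0.5, 0.2, 2.0]]
P = duplication_matrix(p)
vech = [S[i][j] for j in range(p) for i in range(j, p)]   # lower triangle
vec  = [S[i][j] for j in range(p) for i in range(p)]      # full matrix
recovered = [sum(P[r][c] * vech[c] for c in range(len(vech)))
             for r in range(p * p)]
assert recovered == vec                                   # vec(S) = P vech(S)

# P'P is diagonal: entry 1 for diagonal elements of S, 2 for off-diagonal
# pairs, so (P'P)^{-1} P' needs no general matrix inversion.
m = len(vech)
PtP = [[sum(P[r][a] * P[r][b] for r in range(p * p)) for b in range(m)]
       for a in range(m)]
assert all(PtP[a][b] == 0 for a in range(m) for b in range(m) if a != b)
assert sorted(set(PtP[a][a] for a in range(m))) == [1, 2]
```

For \(p = p_x + p_y\), the projection \(\varvec{P}\left( \varvec{P}^{T}\varvec{P}\right) ^{-1}\varvec{P}^{T}\) has rank \(p(p+1)/2\), consistent with the trace computed in the proof.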
Proof of Theorem 3
For the full model, \(\partial F\left( \hat{\varvec{\beta }}_\mathrm{full}\right) /\partial \varvec{\beta }=\varvec{0}\) and
The distribution (6) indicates that
Thus, Eq. (12) becomes
which is asymptotically the same as the FMA test statistic. \(\square \)
Cite this article
Jin, S., Ankargren, S. Frequentist Model Averaging in Structural Equation Modelling. Psychometrika 84, 84–104 (2019). https://doi.org/10.1007/s11336-018-9624-y