Specifying Multilevel Mixture Selection Models in Propensity Score Analysis
Causal inference with observational data is challenging because assignment to treatment is often not random, and people may have different reasons for receiving or being assigned to the treatment. Moreover, in nonexperimental studies the analyst may not have access to all of the important variables and may therefore face omitted variable bias as well as selection bias. It is known that fixed-effects models are robust against unobserved cluster-level variables, whereas random-effects models yield biased parameter estimates in the presence of omitted variables. This study further investigates the properties of fixed-effects models as an alternative to the common random-effects models for identifying and classifying subpopulations, or “latent classes,” when selection or outcome processes are heterogeneous. A recent study by Suk and Kim (2018) found that linear probability models outperform standard logistic selection models in extracting the correct number of latent classes, and the authors continue to search for optimal specifications of mixture selection models across different conditions, such as strong and weak selection and various numbers of clusters and cluster sizes. The results show that fixed-effects models outperform random-effects models in classifying units and estimating treatment effects when cluster size is small.
Keywords: Causal inference; Finite mixture modeling; Latent class analysis; Selection bias; Balancing scores; Heterogeneous selection and treatment effects; Fixed-effects and random-effects models; Hierarchical linear modeling
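The robustness claim in the abstract can be illustrated with a small simulation. The sketch below is not the authors' mixture selection model: it only contrasts a within-cluster (fixed-effects) estimator with a pooled regression that ignores clusters, used here as a simple stand-in for a model that does not absorb cluster-level confounding. All names, sample sizes, and coefficient values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: many small clusters, as in the small-cluster-size setting.
n_clusters, cluster_size = 200, 5
true_effect = 2.0

cluster = np.repeat(np.arange(n_clusters), cluster_size)
u = rng.normal(size=n_clusters)        # unobserved cluster-level confounder
x = rng.normal(size=cluster.size)      # observed unit-level covariate

# Selection follows a linear probability model that depends on x and on u.
p = np.clip(0.5 + 0.1 * x + 0.2 * u[cluster], 0.05, 0.95)
z = rng.binomial(1, p)

# The outcome depends on treatment, x, and the omitted cluster variable u.
y = true_effect * z + 1.0 * x + 3.0 * u[cluster] + rng.normal(size=cluster.size)


def ols(X, y):
    """Least-squares coefficients for the design matrix X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def demean(v, groups):
    """Subtract each cluster's mean (the within transformation)."""
    means = np.bincount(groups, weights=v) / np.bincount(groups)
    return v - means[groups]


# Pooled regression: ignores clustering, so u is left in the error term.
pooled = ols(np.column_stack([np.ones_like(x), z, x]), y)[1]

# Fixed-effects (within) estimator: demeaning removes anything constant
# within a cluster, including the unobserved u.
fe = ols(
    np.column_stack([demean(z, cluster), demean(x, cluster)]),
    demean(y, cluster),
)[0]

print(f"true effect: {true_effect}")
print(f"pooled estimate: {pooled:.3f}")
print(f"fixed-effects estimate: {fe:.3f}")
```

Because the within transformation removes every cluster-constant term, the fixed-effects estimate does not depend on how strongly `u` drives selection, whereas the pooled estimate absorbs that dependence as bias.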
- Gui, R., Meierer, M., & Algesheimer, R. (2017). REndo: Fitting linear models with endogenous regressors using latent instrumental variables. R package version 1.3. https://CRAN.R-project.org/package=REndo.
- Kim, J.-S., Steiner, P. M., & Lim, W.-C. (2016). Mixture modeling strategies for causal inference with multilevel data. In J. R. Harring, L. M. Stapleton, & S. N. Beretvas (Eds.), Advances in multilevel modeling for educational research: Addressing practical issues found in real-world applications (pp. 335–359). Charlotte, NC: Information Age Publishing.
- Muthén, L. K., & Muthén, B. O. (1998–2017). Mplus user’s guide (8th ed.). Los Angeles, CA: Muthén & Muthén.
- Nerlove, M. (2005). Essays in panel data econometrics. Cambridge: Cambridge University Press.
- R Core Team (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
- Steiner, P. M., & Cook, D. (2013). Matching and propensity scores. In T. Little (Ed.), The Oxford handbook of quantitative methods (pp. 236–258). Oxford, England: Oxford University Press.
- Suk, Y., & Kim, J.-S. (2018, April). Linear probability models as alternatives to logistic regression models for multilevel propensity score analysis. Paper presented at the annual meeting of the American Educational Research Association, New York City, NY.