Abstract
Latent class regression models relate covariates and latent constructs such as psychiatric disorders. Though full maximum likelihood estimation is available, estimation is often carried out in three steps: (i) a latent class model is fitted without covariates; (ii) latent class scores are predicted; and (iii) the scores are regressed on covariates. We propose a new method for predicting class scores that, in contrast to posterior probability-based methods, yields consistent estimators of the parameters in the third step. Additionally, in simulation studies the new methodology exhibited only a minor loss of efficiency. Finally, the new and the posterior probability-based methods are compared in an analysis of mobility/exercise.
Acknowledgements
The authors wish to thank Dr. Sheila West for kindly making the Salisbury Eye Evaluation data available, and the Johns Hopkins Older Americans Independence Center for funding, NIA: P30AG021334.
Appendix
SAS program to perform Step 3. This program performs maximum likelihood estimation in the Step 3 regression.
In the line starting with “th0”, one has to supply starting values for the parameters. These must be given in the following order: $IS_1, \ldots, IS_{M-1}$, $x_1S_1, \ldots, x_1S_{M-1}$, $x_2S_1, \ldots, x_PS_{M-1}$, $\mathrm{var}(S_1)$, $\mathrm{cov}(S_1,S_2)$, $\mathrm{var}(S_2)$, $\mathrm{cov}(S_1,S_3)$, $\mathrm{cov}(S_2,S_3)$, $\mathrm{var}(S_3)$, $\ldots$, $\mathrm{var}(S_{M-1})$, where $IS_m$ is the intercept for class $m$ and $x_pS_m$ is the parameter estimate of covariate $p$ on class $m$. We recommend trying several different sets of starting values, as local maxima can be reached. As starting values for the covariance matrix for the LSC scores, we suggest the empirical covariance matrix for the LSC predictions.
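The ordering of the starting values can be illustrated with a short sketch. This is not the authors' SAS program: it is a minimal Python illustration, and the function names `empirical_covariance` and `pack_start_values` are hypothetical. It assumes the predicted scores are stored one row per subject with M−1 columns, and it takes the covariance block from the empirical covariance matrix of the predictions, as suggested above. The choice of starting values for the intercepts and covariate parameters is left to the user.

```python
from typing import List, Sequence


def empirical_covariance(scores: Sequence[Sequence[float]]) -> List[List[float]]:
    """Sample covariance matrix (divisor n - 1) of the columns of `scores`."""
    n = len(scores)
    k = len(scores[0])
    means = [sum(row[j] for row in scores) / n for j in range(k)]
    return [
        [
            sum((row[a] - means[a]) * (row[b] - means[b]) for row in scores) / (n - 1)
            for b in range(k)
        ]
        for a in range(k)
    ]


def pack_start_values(
    intercepts: Sequence[float],
    slopes: Sequence[Sequence[float]],
    cov: Sequence[Sequence[float]],
) -> List[float]:
    """Flatten starting values in the order described in the text.

    intercepts: M-1 values, one per class score.
    slopes:     P rows (one per covariate), each holding M-1 values.
    cov:        (M-1) x (M-1) symmetric covariance matrix of the scores.
    """
    theta = list(intercepts)
    # Covariate 1 on every class, then covariate 2 on every class, and so on.
    for row in slopes:
        theta.extend(row)
    # var(S1), cov(S1,S2), var(S2), cov(S1,S3), cov(S2,S3), var(S3), ...
    for j in range(len(cov)):
        for i in range(j + 1):
            theta.append(cov[i][j])
    return theta
```

For M = 3 classes and P = 2 covariates, the resulting vector has 2 intercepts, 4 covariate parameters, and 3 covariance entries, 9 values in total. Several such vectors, with different intercept and covariate starting values, can be tried to guard against local maxima.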
Petersen, J., Bandeen-Roche, K., Budtz-Jørgensen, E. et al. Predicting Latent Class Scores for Subsequent Analysis. Psychometrika 77, 244–262 (2012). https://doi.org/10.1007/s11336-012-9248-6