Predicting Latent Class Scores for Subsequent Analysis

Psychometrika

Abstract

Latent class regression models relate covariates and latent constructs such as psychiatric disorders. Though full maximum likelihood estimation is available, estimation is often in three steps: (i) a latent class model is fitted without covariates; (ii) latent class scores are predicted; and (iii) the scores are regressed on covariates. We propose a new method for predicting class scores that, in contrast to posterior probability-based methods, yields consistent estimators of the parameters in the third step. Additionally, in simulation studies the new methodology exhibited only a minor loss of efficiency. Finally, the new and the posterior probability-based methods are compared in an analysis of mobility/exercise.

Acknowledgements

The authors wish to thank Dr. Sheila West for kindly making the Salisbury Eye Evaluation data available, and the Johns Hopkins Older Americans Independence Center for funding, NIA: P30AG021334.

Author information

Corresponding author

Correspondence to Janne Petersen.

Appendix

SAS program to perform Step 3. This program performs maximum likelihood estimation in the Step 3 regression.

[Figure a: listing of the SAS program for Step 3]

In the line starting with “th0”, one has to supply starting values for the parameters. These must be given in the following order: IS_1, …, IS_{M−1}, x_1S_1, …, x_1S_{M−1}, x_2S_1, …, x_PS_{M−1}, var(S_1), cov(S_1,S_2), var(S_2), cov(S_1,S_3), cov(S_2,S_3), var(S_3), …, var(S_{M−1}), where IS_m is the intercept for class m and x_pS_m is the regression parameter for covariate p on class m. We recommend trying several different sets of starting values, because the optimization can converge to a local maximum. As starting values for the covariance matrix of the LSC scores, we suggest the empirical covariance matrix of the LSC predictions.
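
The ordering of the starting values can be easy to get wrong, so the following is a minimal Python sketch of how such a vector might be assembled. It is not part of the SAS program; the function name `build_start_values`, the array layouts, the simulated `scores` array, and the choice of zeros for the intercepts and slopes are all assumptions made for illustration. Only the ordering of the entries and the use of the empirical covariance matrix follow the description above.

```python
import numpy as np


def build_start_values(intercepts, slopes, cov):
    """Flatten starting values into the order described in the appendix.

    intercepts: length M-1, one intercept per score S_1..S_{M-1}
    slopes:     shape (P, M-1); slopes[p, m] is the effect of
                covariate p+1 on score S_{m+1}
    cov:        (M-1) x (M-1) symmetric covariance matrix of the scores
    """
    intercepts = np.asarray(intercepts, dtype=float)
    slopes = np.asarray(slopes, dtype=float)
    cov = np.asarray(cov, dtype=float)
    k = intercepts.size
    if slopes.ndim != 2 or slopes.shape[1] != k or cov.shape != (k, k):
        raise ValueError("dimensions of the inputs do not agree")
    # Covariance entries, upper triangle taken column by column:
    # var(S1), cov(S1,S2), var(S2), cov(S1,S3), cov(S2,S3), var(S3), ...
    cov_part = [cov[i, j] for j in range(k) for i in range(j + 1)]
    # Row-major flattening of slopes gives x_1S_1..x_1S_{M-1}, x_2S_1, ...
    return np.concatenate([intercepts, slopes.ravel(), cov_part])


# Hypothetical example with M = 4 classes (three scores) and P = 2 covariates.
rng = np.random.default_rng(0)
scores = rng.normal(size=(200, 3))          # stand-in for predicted scores
emp_cov = np.cov(scores, rowvar=False)      # empirical covariance matrix
th0 = build_start_values(np.zeros(3), np.zeros((2, 3)), emp_cov)
```

Calling the function repeatedly with different intercepts and slopes gives several candidate starting vectors, in line with the recommendation to try more than one set.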

About this article

Cite this article

Petersen, J., Bandeen-Roche, K., Budtz-Jørgensen, E. et al. Predicting Latent Class Scores for Subsequent Analysis. Psychometrika 77, 244–262 (2012). https://doi.org/10.1007/s11336-012-9248-6

  • DOI: https://doi.org/10.1007/s11336-012-9248-6
