Latent class models for multiple ordered categorical health data: testing violation of the local independence assumption

  • Paolo Li Donni
  • Ranjeeta Thomas


Latent class models are now widely applied in health economics to analyse heterogeneity in multiple outcomes generated by subgroups of individuals who vary in unobservable characteristics, such as genetic information or latent traits. These models assume that associations between observed outcomes arise from their relationship to underlying subgroups, which the models capture by conditioning on a set of latent classes. This implies that outcomes are locally independent within a class. The local independence assumption, however, is sometimes violated in practical applications, when uncaptured unobserved heterogeneity leaves residual associations between outcomes within classes. While several approaches have been proposed for binary and continuous outcomes, little attention has been directed to the multiple ordered categorical outcome variables often used in health economics. In this paper, we develop an approach to test for violation of the local independence assumption in the case of multiple ordered categorical outcomes. The approach provides a detailed decomposition of the identified residual association by allowing it to vary across latent classes and between levels of the ordered categorical outcomes within a class. We show how this level of decomposition is important in the case of ordered categorical outcomes. We illustrate our approach in the context of health insurance and healthcare utilization in the US Medigap market.
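
The local independence idea in the abstract can be illustrated with a small numerical sketch. It is not the authors' testing procedure. It assumes a hypothetical two-class model with two ordered outcomes of three levels each, and every probability and name below is invented for illustration. Local log odds ratios over adjacent levels are used here as one common way to measure association between ordered categories. Under local independence they are zero within each class, while the table pooled over classes still shows association.

```python
import math

# Hypothetical two-class model with two ordered outcomes, three levels each.
# All numbers are made up for illustration; none come from the paper.
CLASS_WEIGHTS = [0.6, 0.4]
P_Y1 = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]  # P(y1 = level | class)
P_Y2 = [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]  # P(y2 = level | class)


def joint_within_class(c):
    """Joint table of (y1, y2) inside class c under local independence."""
    return [[P_Y1[c][a] * P_Y2[c][b] for b in range(3)] for a in range(3)]


def pooled_joint():
    """Joint table of (y1, y2) after mixing over the latent classes."""
    tables = [joint_within_class(c) for c in range(len(CLASS_WEIGHTS))]
    return [
        [sum(w * t[a][b] for w, t in zip(CLASS_WEIGHTS, tables)) for b in range(3)]
        for a in range(3)
    ]


def local_log_odds_ratios(table):
    """Log odds ratio for every 2x2 block of adjacent levels."""
    return [
        [
            math.log(
                table[a][b] * table[a + 1][b + 1]
                / (table[a][b + 1] * table[a + 1][b])
            )
            for b in range(len(table[0]) - 1)
        ]
        for a in range(len(table) - 1)
    ]


if __name__ == "__main__":
    for c in range(len(CLASS_WEIGHTS)):
        print("class", c, local_log_odds_ratios(joint_within_class(c)))
    print("pooled", local_log_odds_ratios(pooled_joint()))
```

In this sketch the within-class ratios vanish by construction. A violation of local independence would correspond to within-class tables whose local log odds ratios differ from zero, possibly by different amounts in different classes and at different pairs of adjacent levels.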


Latent class model, Local independence assumption, Health insurance, Healthcare utilization, Categorical health data




Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Dipartimento di Scienze Economiche, Statistiche e Aziendali, Università di Palermo, Palermo, Italy
  2. Department of Health Policy, London School of Economics and Political Science, London, UK
