Identifying Multiple Cluster Structures Through Latent Class Models

  • Conference paper
From Data and Information Analysis to Knowledge Engineering

Abstract

Many studies addressing the problem of selecting or weighting variables for cluster analysis assume that all the variables define a unique classification of units. However, different subsets of variables may yield different classifications of the same units. In this paper this problem is considered from a model-based perspective. Limitations and drawbacks of standard latent class cluster analysis are highlighted, and a new procedure able to overcome these difficulties is proposed. The results obtained by applying this procedure to simulated and real data sets are presented and discussed.




Copyright information

© 2006 Springer Berlin · Heidelberg

About this paper

Cite this paper

Galimberti, G., Soffritti, G. (2006). Identifying Multiple Cluster Structures Through Latent Class Models. In: Spiliopoulou, M., Kruse, R., Borgelt, C., Nürnberger, A., Gaul, W. (eds) From Data and Information Analysis to Knowledge Engineering. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-31314-1_20
