Model Based Co-clustering of Mixed Numerical and Binary Data

Advances in Knowledge Discovery and Management

Part of the book series: Studies in Computational Intelligence (SCI, volume 834)

Abstract

Co-clustering is a data mining technique used to extract the underlying block structure between the rows and columns of a data matrix. Many approaches have been studied and have shown their capacity to extract such structures from continuous data, binary data, or contingency tables. However, very little work has addressed co-clustering of mixed-type data. In this article, we extend co-clustering based on latent block models to the case of mixed data (continuous and binary variables). We then evaluate the effectiveness of the proposed approach on simulated data and discuss its advantages and potential limits.
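
To make the notion of block structure concrete, here is a minimal sketch in Python. It is not the authors' method: the row and column partitions are assumed to be given, whereas the approach in the chapter infers them. The function `block_means`, the toy matrix, and the labels are all hypothetical. Each column cluster is also assumed to hold a single variable type. The sketch averages the entries of each block. For a continuous block this is the empirical mean; for a binary block it is the proportion of ones, that is, the empirical Bernoulli parameter.

```python
from collections import defaultdict


def block_means(data, row_labels, col_labels):
    """Average the entries of each (row cluster, column cluster) block.

    Hypothetical helper: the partitions are taken as inputs here,
    they are not estimated from the data.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            key = (row_labels[i], col_labels[j])
            sums[key] += value
            counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Toy mixed matrix (made up): columns 0-1 are continuous, columns 2-3 are binary.
data = [
    [5.0, 5.2, 1, 1],
    [4.8, 5.0, 1, 1],
    [0.1, -0.1, 0, 0],
    [0.0, 0.2, 0, 1],
]
row_labels = [0, 0, 1, 1]  # assumed row partition
col_labels = [0, 0, 1, 1]  # assumed column partition (cluster 0 continuous, cluster 1 binary)

means = block_means(data, row_labels, col_labels)
for key in sorted(means):
    print(key, round(means[key], 3))
```

With a good partition, each block is homogeneous, so a single parameter per block (plus a variance for continuous blocks) summarizes it well.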

Author information

Corresponding author

Correspondence to Marc Boullé.

Copyright information

© 2019 Springer Nature Switzerland AG

Cite this chapter

Bouchareb, A., Boullé, M., Clérot, F., Rossi, F. (2019). Model Based Co-clustering of Mixed Numerical and Binary Data. In: Pinaud, B., Guillet, F., Gandon, F., Largeron, C. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 834. Springer, Cham. https://doi.org/10.1007/978-3-030-18129-1_1
