Model Based Co-clustering of Mixed Numerical and Binary Data

Bouchareb, Aichetou; Boullé, Marc; Clérot, Fabrice; Rossi, Fabrice

doi:10.1007/978-3-030-18129-1_1

Part of the book series: Studies in Computational Intelligence (SCI, volume 834)

Abstract

Co-clustering is a data mining technique used to extract the underlying block structure between the rows and columns of a data matrix. Many approaches have been studied and have shown their capacity to extract such structures in continuous, binary or contingency tables. However, very little work has been done on co-clustering of mixed-type data. In this article, we extend latent block model based co-clustering to the case of mixed data (continuous and binary variables). We then evaluate the effectiveness of the proposed approach on simulated data and discuss its advantages and potential limitations.
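As a rough illustration of what a block structure in mixed data can look like, the following Python sketch simulates a matrix whose rows share a single partition while continuous and binary columns are partitioned separately. It is not taken from the chapter: the function names, the Gaussian and Bernoulli block distributions, the separate column partitions per type, and the uniform cluster proportions are all assumptions made for this example.

```python
import numpy as np


def simulate_mixed_blocks(n_rows, n_cont, n_bin, means, sds, probs, seed=0):
    """Simulate mixed data with a block structure (illustrative assumption).

    means, sds: arrays of shape (G, Hc), Gaussian parameters for each
        (row cluster, continuous column cluster) block.
    probs: array of shape (G, Hb), Bernoulli parameters for each
        (row cluster, binary column cluster) block.
    """
    rng = np.random.default_rng(seed)
    means, sds, probs = (np.asarray(a, dtype=float) for a in (means, sds, probs))
    n_row_clusters, n_cont_clusters = means.shape
    n_bin_clusters = probs.shape[1]
    # Hypothetical choice: labels drawn with uniform cluster proportions.
    z = rng.integers(n_row_clusters, size=n_rows)    # row labels
    wc = rng.integers(n_cont_clusters, size=n_cont)  # continuous column labels
    wb = rng.integers(n_bin_clusters, size=n_bin)    # binary column labels
    # Each cell is drawn from the distribution of the block it falls in.
    x_cont = rng.normal(means[z][:, wc], sds[z][:, wc])
    x_bin = rng.binomial(1, probs[z][:, wb])
    return x_cont, x_bin, z, wc, wb


def block_log_likelihood(x_cont, x_bin, z, wc, wb, means, sds, probs):
    """Log-likelihood of the data given the partitions and block parameters.

    Terms for the cluster proportions are deliberately left out.
    """
    means, sds, probs = (np.asarray(a, dtype=float) for a in (means, sds, probs))
    mu, sd, p = means[z][:, wc], sds[z][:, wc], probs[z][:, wb]
    ll_cont = -0.5 * np.log(2 * np.pi * sd**2) - (x_cont - mu) ** 2 / (2 * sd**2)
    ll_bin = x_bin * np.log(p) + (1 - x_bin) * np.log1p(-p)
    return ll_cont.sum() + ll_bin.sum()
```

With well-separated block parameters, the log-likelihood computed with the simulated row labels should be noticeably higher than with mislabeled rows, which is the kind of signal a model-based co-clustering procedure can exploit.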

Author information

Authors and Affiliations

Orange Labs, 2 Avenue Pierre Marzin, 22300, Lannion, France
Aichetou Bouchareb, Marc Boullé & Fabrice Clérot
SAMM EA 4534 - University of Paris 1 Panthéon-Sorbonne, 90 rue Tolbiac, 75013, Paris, France
Aichetou Bouchareb & Fabrice Rossi

Corresponding author

Correspondence to Marc Boullé.

Editor information

Editors and Affiliations

University of Bordeaux, Bordeaux, France
Bruno Pinaud
Polytechnic School of the University of Nantes, University of Nantes, Nantes, France
Fabrice Guillet
University of Côte d'Azur, Inria, Sophia Antipolis, France
Fabien Gandon
CNRS, Hubert Curien Laboratory, University of Lyon, Université Jean Monnet, Saint-Etienne, Saint-Étienne, France
Christine Largeron

About this chapter

Cite this chapter

Bouchareb, A., Boullé, M., Clérot, F., Rossi, F. (2019). Model Based Co-clustering of Mixed Numerical and Binary Data. In: Pinaud, B., Guillet, F., Gandon, F., Largeron, C. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 834. Springer, Cham. https://doi.org/10.1007/978-3-030-18129-1_1

DOI: https://doi.org/10.1007/978-3-030-18129-1_1
Published: 21 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-18128-4
Online ISBN: 978-3-030-18129-1
eBook Packages: Intelligent Technologies and Robotics (R0)
