Abstract
Co-clustering is a data mining technique that extracts the underlying block structure between the rows and columns of a data matrix. Many approaches have been studied and have proven able to extract such structures from continuous data, binary data, or contingency tables. However, very little work has addressed co-clustering of mixed-type data. In this article, we extend co-clustering based on latent block models to the case of mixed data (continuous and binary variables). We then evaluate the effectiveness of the proposed approach on simulated data and discuss its advantages and potential limitations.
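To make the setting concrete, the following is a minimal sketch of what block-structured mixed data might look like and how a candidate co-clustering could be scored. It assumes Gaussian blocks for the continuous columns and Bernoulli blocks for the binary columns, with two row clusters and two column clusters per variable type. The function names, parameter values, and the choice to score only the block densities (leaving out mixing proportions) are illustrative assumptions, not the chapter's model specification or experimental protocol.

```python
import numpy as np


def simulate_mixed_blocks(n_rows=60, n_cont=8, n_bin=8, seed=0):
    """Simulate a mixed data matrix with a 2 x 2 block structure per type.

    All parameter values below are hypothetical and chosen only so that
    the blocks are well separated.
    """
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n_rows)    # row cluster labels
    w_c = rng.integers(0, 2, size=n_cont)  # clusters of continuous columns
    w_b = rng.integers(0, 2, size=n_bin)   # clusters of binary columns

    mu = np.array([[0.0, 3.0], [3.0, 0.0]])  # Gaussian block means
    sigma = np.ones((2, 2))                  # Gaussian block std devs
    p = np.array([[0.1, 0.9], [0.9, 0.1]])   # Bernoulli block probabilities

    # Each cell is drawn from the distribution of the block it falls in.
    x_c = rng.normal(mu[z][:, w_c], sigma[z][:, w_c])
    x_b = (rng.random((n_rows, n_bin)) < p[z][:, w_b]).astype(int)
    return x_c, x_b, z, w_c, w_b, (mu, sigma, p)


def block_log_likelihood(x_c, x_b, z, w_c, w_b, mu, sigma, p):
    """Sum of per-cell log densities given row and column partitions.

    Mixing proportions are deliberately omitted, so this is only the
    block-density part of a complete-data log-likelihood.
    """
    m = mu[z][:, w_c]
    s = sigma[z][:, w_c]
    ll_cont = -0.5 * np.log(2.0 * np.pi * s**2) - (x_c - m) ** 2 / (2.0 * s**2)

    q = p[z][:, w_b]
    ll_bin = x_b * np.log(q) + (1 - x_b) * np.log(1.0 - q)
    return float(ll_cont.sum() + ll_bin.sum())


x_c, x_b, z, w_c, w_b, (mu, sigma, p) = simulate_mixed_blocks()
ll_true = block_log_likelihood(x_c, x_b, z, w_c, w_b, mu, sigma, p)
# Swapping the two row labels while keeping the parameters fixed
# assigns every row to the wrong block.
ll_flipped = block_log_likelihood(x_c, x_b, 1 - z, w_c, w_b, mu, sigma, p)
print(ll_true, ll_flipped)
```

Under these assumptions, the partition used to generate the data should score far higher than the same partition with its row labels swapped, which is the kind of contrast an inference procedure would exploit when searching for the block structure.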
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Bouchareb, A., Boullé, M., Clérot, F., Rossi, F. (2019). Model Based Co-clustering of Mixed Numerical and Binary Data. In: Pinaud, B., Guillet, F., Gandon, F., Largeron, C. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 834. Springer, Cham. https://doi.org/10.1007/978-3-030-18129-1_1
DOI: https://doi.org/10.1007/978-3-030-18129-1_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-18128-4
Online ISBN: 978-3-030-18129-1
eBook Packages: Intelligent Technologies and Robotics (R0)