Abstract
Scientists face two important challenges when investigating life processes. First, biological systems, from gene regulation to physiological mechanisms, are inherently multiscale. Second, collecting data on complex diseases is expensive, yet the analyses are often presented in a rather empirical and sometimes simplistic way, missing the opportunity to uncover patterns of predictive relationships and meaningful profiles. In this work, we propose a multi-view clustering methodology that, although quite general, can be used to identify patient subgroups across different types of omic information by studying the hierarchical structure of the patient data in each view and merging their topologies. We first demonstrate the ability of our method to identify hierarchical structures in synthetic data sets and then apply it to real multi-view multi-omic data sets. Our results, although preliminary, suggest that this methodology outperforms single-view clustering approaches and opens several directions for improvement.
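The general idea of building one hierarchy per view and then merging their topologies can be illustrated with a minimal sketch. This is not the hierarchical block matrix method of the paper, whose details are not given in the abstract. It is a generic stand-in that assumes Ward linkage per view, cophenetic distances as the encoding of each tree's topology, and a plain average as the merge step. The function name `merge_view_hierarchies` and the toy data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage


def merge_view_hierarchies(views, n_clusters):
    """Cluster samples from several views by merging per-view hierarchies.

    views: list of (n_samples, n_features_v) arrays, same sample order in each.
    Returns an array of integer cluster labels, one per sample.
    Assumption: averaging cophenetic distances is an illustrative merge rule,
    not the merge rule used in the paper.
    """
    merged = None
    for X in views:
        # Build a hierarchy for this view (Ward linkage on Euclidean distances).
        Z = linkage(X, method="ward")
        # Cophenetic distances encode the tree topology as pairwise distances.
        d = cophenet(Z)
        d = d / d.max()  # rescale so every view contributes on the same scale
        merged = d if merged is None else merged + d
    merged = merged / len(views)
    # Cluster the averaged condensed distance vector to get a consensus tree.
    Z_merged = linkage(merged, method="average")
    return fcluster(Z_merged, t=n_clusters, criterion="maxclust")


# Toy example: two synthetic views of 20 samples drawn from two groups.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1], 10)
view_a = rng.normal(loc=groups[:, None] * 10.0, scale=1.0, size=(20, 5))
view_b = rng.normal(loc=groups[:, None] * 10.0, scale=1.0, size=(20, 8))
labels = merge_view_hierarchies([view_a, view_b], n_clusters=2)
```

Averaging rescaled cophenetic distances keeps each view's tree structure while preventing a view with larger raw distances from dominating; other merge rules (for example a weighted average) would fit the same skeleton.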
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Serra, A., Guida, M.D., Lió, P., Tagliaferri, R. (2019). Hierarchical Block Matrix Approach for Multi-view Clustering. In: Bartoletti, M., et al. Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2017. Lecture Notes in Computer Science, vol 10834. Springer, Cham. https://doi.org/10.1007/978-3-030-14160-8_19
DOI: https://doi.org/10.1007/978-3-030-14160-8_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14159-2
Online ISBN: 978-3-030-14160-8
eBook Packages: Computer Science, Computer Science (R0)