Abstract
Clustering has become an important data analysis technique for cancer discovery. Numerous clustering approaches have been proposed in recent years, yet handling high-dimensional cancer gene expression datasets remains an open challenge for clustering algorithms. In this paper, we present an improved graph-based clustering algorithm that applies an edge betweenness criterion to a spanning subgraph. We carry out an empirical analysis on artificial datasets and five cancer gene expression datasets. The results show that the proposed algorithm can effectively discover cancerous tissues and that it outperforms two recent graph-based clustering algorithms in terms of both cluster quality and modularity index.
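The abstract does not say how the spanning subgraph is built or how the betweenness criterion is applied, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes the spanning subgraph is a Euclidean minimum spanning tree and that clusters are obtained by repeatedly removing the edge with the highest betweenness. On a tree, the unweighted betweenness of an edge is the number of node pairs it separates, which is the product of the sizes of its two sides. All function names and the toy points are hypothetical.

```python
import math
from collections import defaultdict


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns (u, v) edges."""
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n
    parent = [-1] * n
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the closest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    parent[v] = u
    return edges


def connected_components(n, edges):
    """Return the components of an undirected graph as sorted lists of node ids."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = [False] * n
    components = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, comp = [start], []
        while stack:
            x = stack.pop()
            comp.append(x)
            for y in adj[x]:
                if not seen[y]:
                    seen[y] = True
                    stack.append(y)
        components.append(sorted(comp))
    return components


def edge_betweenness(n, edges):
    """Betweenness of each edge in a forest: product of the sizes of its two sides."""
    scores = {}
    for e in edges:
        rest = [f for f in edges if f != e]
        comps = connected_components(n, rest)
        side_u = next(c for c in comps if e[0] in c)
        side_v = next(c for c in comps if e[1] in c)
        scores[e] = len(side_u) * len(side_v)
    return scores


def betweenness_clusters(points, k):
    """Assumed procedure: cut the highest-betweenness MST edge until k parts remain."""
    n = len(points)
    edges = minimum_spanning_tree(points)
    while len(edges) > n - k:
        scores = edge_betweenness(n, edges)
        edges.remove(max(scores, key=scores.get))
    return connected_components(n, edges)


# Toy data: two well-separated groups of four points each.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11), (11, 11)]
clusters = betweenness_clusters(points, 2)
```

In this toy case the single edge bridging the two groups separates four nodes from four, which gives it the highest score, so removing it recovers the two groups. On less balanced data a purely size-based score can prefer a different cut, which is one reason a practical method would likely combine betweenness with edge weights or a richer subgraph.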
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Jothi, R. (2017). A Betweenness Centrality Guided Clustering Algorithm and Its Applications to Cancer Diagnosis. In: Ghosh, A., Pal, R., Prasath, R. (eds) Mining Intelligence and Knowledge Exploration. MIKE 2017. Lecture Notes in Computer Science, vol 10682. Springer, Cham. https://doi.org/10.1007/978-3-319-71928-3_4
DOI: https://doi.org/10.1007/978-3-319-71928-3_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-71927-6
Online ISBN: 978-3-319-71928-3
eBook Packages: Computer Science (R0)