Abstract
Most biological networks are thought to have a modular organization, which increases their robustness, flexibility, and stability. Many clustering methods have been used to mine biological data and to partition complex networks into functional modules. Most of these methods require the number of modules to be set in advance and can therefore produce biased results. The Markov clustering method (MCL) and the simulated annealing module-detection method (SA) remove this requirement and can objectively separate relatively dense subgraphs. In this paper, we compare these two module-detection methods on three types of biological data: protein family classification, microarray clustering, and the modularity of metabolic networks. We find that the two methods have different advantages for different biological networks. For the gene network based on Affymetrix microarray spike data, MCL recovered exactly the number of groups, and the contents of each group, defined by the spike data. For the gene network derived from actual expression data, neither method perfectly recovers the natural classification, but MCL performs slightly better than SA. However, as increasing random noise is added to the gene expression values, SA generates better modular structures with higher modularity. We next compared the modularization results of MCL and SA for protein family classification and found that the modules detected by SA could not be matched well with the Structural Classification of Proteins (SCOP) database, which suggests that MCL is ideally suited to the rapid and accurate detection of protein families. In addition, we used both methods to detect modules in the metabolic network of E. coli. MCL gives a trivial clustering that produces biologically insignificant modules. In contrast, SA detects modules that correspond well to the KEGG functional classification. Moreover, the modularity that SA detects for several other metabolic networks is also much higher than that detected by MCL. In summary, MCL is better suited to modularizing relatively complete and well-defined data, such as a protein family network. In contrast, SA is less sensitive to noise such as experimental error or incomplete data, and it outperforms MCL when modularizing gene networks based on microarray data and large-scale metabolic networks constructed from incomplete databases.
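To make the two quantities being compared more concrete, the following is a minimal Python sketch, not the implementation used in this study. It assumes an undirected, unweighted network stored as a NumPy adjacency matrix. The function names `modularity` and `mcl`, the inflation value of 2.0, the added self-loops, and the toy network of two triangles joined by a single edge are all illustrative assumptions. `modularity` evaluates the Newman-Girvan modularity, the kind of score that simulated annealing module detection seeks to maximize. `mcl` alternates the expansion and inflation steps of Markov clustering and, for brevity, does not handle ties or overlapping clusters.

```python
import numpy as np


def modularity(adj, labels):
    """Modularity M = sum over modules s of [l_s / L - (d_s / (2 L))^2].

    L is the number of edges, l_s the number of edges inside module s,
    and d_s the sum of the degrees of the nodes in module s.
    """
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    total_edges = adj.sum() / 2.0      # each undirected edge appears twice in adj
    degrees = adj.sum(axis=1)
    score = 0.0
    for module in np.unique(labels):
        members = labels == module
        edges_inside = adj[np.ix_(members, members)].sum() / 2.0
        degree_sum = degrees[members].sum()
        score += edges_inside / total_edges - (degree_sum / (2.0 * total_edges)) ** 2
    return score


def mcl(adj, inflation=2.0, max_iter=100, tol=1e-6):
    """Minimal Markov clustering sketch (assumed parameters, no pruning)."""
    flow = np.asarray(adj, dtype=float) + np.eye(len(adj))  # add self-loops
    flow /= flow.sum(axis=0, keepdims=True)                 # make columns sum to 1
    for _ in range(max_iter):
        previous = flow
        flow = flow @ flow                       # expansion: two-step random walk
        flow = flow ** inflation                 # inflation: favour strong flows
        flow /= flow.sum(axis=0, keepdims=True)  # renormalise each column
        if np.abs(flow - previous).max() < tol:
            break
    # Assign each node (column) to the row that holds most of its flow.
    attractors = flow.argmax(axis=0)
    _, labels = np.unique(attractors, return_inverse=True)
    return labels


# Hypothetical toy network: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
toy = np.zeros((6, 6))
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    toy[a, b] = toy[b, a] = 1.0

labels = mcl(toy)
score = modularity(toy, labels)
print(labels, score)
```

In this sketch the number of modules is never supplied: it emerges from the inflation step in `mcl`, and an annealing search would likewise choose it by maximizing `modularity`.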
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, Z., Zhu, XG., Chen, Y., Li, Y., Liu, L. (2006). Comparison of Modularization Methods in Application to Different Biological Networks. In: Dalkilic, M.M., Kim, S., Yang, J. (eds) Data Mining and Bioinformatics. VDMB 2006. Lecture Notes in Computer Science, vol 4316. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11960669_16
DOI: https://doi.org/10.1007/11960669_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68970-6
Online ISBN: 978-3-540-68971-3
eBook Packages: Computer Science (R0)