Abstract
The discovery of dense biclusters in biological networks has received increasing attention in recent years. However, despite their importance for understanding cell behavior, dense biclusters can only identify modules in which genes, proteins or metabolites are strongly connected. These modules are thus often associated with trivial, already known interactions or with background processes not necessarily related to the studied conditions. Furthermore, although biclustering algorithms able to discover modules with more flexible coherency are available, their application to large-scale biological networks is hampered by efficiency bottlenecks. In this work, we propose BicNET (Biclustering NETworks), an algorithm to discover non-trivial yet coherent modules in weighted biological networks with heightened efficiency. First, we motivate the relevance of discovering network modules given by constant, symmetric and plaid biclustering models. Second, we propose a solution to discover these flexible modules without time and memory bottlenecks by exploiting the efficiency gains offered by the inherent structural sparsity of networks. Results from the analysis of protein and gene interaction networks support the relevance and efficiency of BicNET.
Notes
- 1.
Let \(\mathcal {L}\) be a finite set of items, and P an itemset \(P\subseteq \mathcal {L}\). A discrete matrix D is a set of transactions in \(\mathcal {L}\), \(\{P_1,\ldots ,P_n\}\). Let the coverage \(\varPhi _{P}\) of an itemset P be the set of transactions in D in which P occurs, \(\{P_i \in D\mid P\subseteq P_i\}\), and its support \(sup_P\) be the coverage size, \(|\varPhi _{P}|\). Given D and a minimum support \(\theta \), the frequent itemset mining task aims to compute: \(\{P \mid P \subseteq \mathcal {L}, sup_P \ge \theta \}\).
Given D, let a matrix A be the concatenation of D elements with their column indexes. Let \(\varPsi _P\) of an itemset P in A be its indexes, and \(\varUpsilon _P\) be its original items in \(\mathcal {L}\). A set of biclusters \(\cup _k (I_k,J_k)\) can be derived from frequent itemsets \(\cup _k P_k\) by mapping \((I_k,J_k)=(\varPhi _{P_k},\varPsi _{P_k})\) to compose constant biclusters with coherency across rows (\((I_k,J_k)=(\varPsi _{P_k},\varPhi _{P_k})\) for column-coherency) with pattern \(\varUpsilon _P\).
- 2.
The sparse prior equation is applied with decreasing sparsity until a non-empty set of biclusters is retrieved.
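The mapping from frequent itemsets to row-coherent constant biclusters defined in note 1 can be illustrated with a minimal sketch. It is not the BicNET implementation: the function names `frequent_itemsets` and `constant_biclusters`, the toy matrix `D`, and the brute-force enumeration are all assumptions made for illustration, and the sketch assumes a minimum support \(\theta \ge 1\). Items of A are represented as (column index, value) pairs, so that \(\varPsi _P\) and \(\varUpsilon _P\) are simply the first and second components of the pairs in P.

```python
from itertools import combinations


def frequent_itemsets(transactions, theta):
    """Map each itemset P with support >= theta to its coverage (row indexes).

    Brute-force enumeration of all candidate itemsets: exponential in the
    number of items, so only suitable for toy inputs.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            p = frozenset(candidate)
            # Coverage: indexes of the transactions in which P occurs.
            coverage = frozenset(
                i for i, t in enumerate(transactions) if p <= t
            )
            if len(coverage) >= theta:  # support = coverage size
                result[p] = coverage
    return result


def constant_biclusters(D, theta):
    """Derive (rows, columns, pattern) biclusters with coherency across rows."""
    # Matrix A: every element of D concatenated with its column index.
    A = [frozenset((j, v) for j, v in enumerate(row)) for row in D]
    biclusters = []
    for p, coverage in frequent_itemsets(A, theta).items():
        ordered = sorted(p)
        columns = tuple(j for j, _ in ordered)   # indexes of P in A
        pattern = tuple(v for _, v in ordered)   # original items of P
        biclusters.append((tuple(sorted(coverage)), columns, pattern))
    return biclusters


# Hypothetical discrete matrix with 4 rows and 3 columns.
D = [
    [1, 2, 3],
    [1, 2, 0],
    [1, 2, 3],
    [0, 1, 3],
]
biclusters = constant_biclusters(D, theta=3)
```

Each returned triple holds the rows \(I_k\), the columns \(J_k\) and the constant pattern shared by those rows on those columns. Swapping the roles of rows and columns (for example, by passing the transpose of `D`) gives the column-coherent variant. Practical frequent itemset miners avoid enumerating every candidate; the exhaustive loop here is kept only to mirror the definitions.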
Acknowledgments
This work was supported by FCT under the project UID/CEC/50021/2013 and the PhD grant SFRH/BD/75924/2011 to RH.
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Henriques, R., Madeira, S.C. (2015). BicNET: Efficient Biclustering of Biological Networks to Unravel Non-Trivial Modules. In: Pop, M., Touzet, H. (eds) Algorithms in Bioinformatics. WABI 2015. Lecture Notes in Computer Science, vol 9289. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-48221-6_1
DOI: https://doi.org/10.1007/978-3-662-48221-6_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-48220-9
Online ISBN: 978-3-662-48221-6
eBook Packages: Computer Science, Computer Science (R0)