Cohesive Sub-network Mining in Protein Interaction Networks Using Score-Based Co-clustering with MapReduce Model (MR-CoC)

  • R. Gowri
  • R. Rathipriya
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 645)


Owing to the current data deluge, computations increasingly have to be carried out on voluminous data. Mining sub-networks from complex, voluminous interaction data is one of the resulting research challenges. Highly connected sub-networks are the more cohesive parts of a network; they are responsible for communication within the network, which makes them useful for studying its functionality. A novel score-based co-clustering technique with MapReduce (MR-CoC) is proposed to mine highly connected sub-networks from interaction networks. The MapReduce environment is chosen to cope with complex, voluminous data and to parallelize the computation. The approach mines cliques, non-cliques, and overlapping sub-network patterns from the adjacency matrix of the network. The complexity of the proposed work is O(Es + log Ns), which is lower than that of existing approaches such as MCODE and spectral clustering.
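The general idea of scoring candidate sub-networks from an adjacency matrix in a map step and filtering them in a reduce step can be illustrated with a minimal sketch. The abstract does not define the MR-CoC score, so edge density is used here purely as an assumed stand-in. The function names (`density_score`, `map_phase`, `reduce_phase`, `mine_subnetworks`), the choice of closed neighbourhoods as candidates, and the threshold are all hypothetical. The sketch simulates the two phases in a single process and is not the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations


def density_score(adj, nodes):
    """Assumed score: fraction of possible edges present among `nodes`.

    A value of 1.0 means the nodes form a clique.
    """
    nodes = sorted(nodes)
    if len(nodes) < 2:
        return 0.0
    possible = len(nodes) * (len(nodes) - 1) // 2
    present = sum(1 for u, v in combinations(nodes, 2) if adj[u][v])
    return present / possible


def map_phase(adj, seed):
    """Map step: emit the seed's closed neighbourhood as a scored candidate."""
    neighbours = [v for v, linked in enumerate(adj[seed]) if linked]
    members = frozenset([seed] + neighbours)
    yield members, density_score(adj, members)


def reduce_phase(pairs, threshold):
    """Reduce step: merge duplicate candidates, keep those above threshold."""
    grouped = defaultdict(list)
    for members, score in pairs:
        grouped[members].append(score)
    return {m: s[0] for m, s in grouped.items() if s[0] >= threshold}


def mine_subnetworks(adj, threshold=0.8):
    """Run the map step once per node, then a single reduce step."""
    pairs = [kv for seed in range(len(adj)) for kv in map_phase(adj, seed)]
    return reduce_phase(pairs, threshold)


# Toy network: nodes 0-2 form a triangle, node 3 is attached to node 2.
ADJ = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
RESULT = mine_subnetworks(ADJ, threshold=0.8)
```

On this toy matrix the sketch keeps the triangle {0, 1, 2} and the edge {2, 3}, which overlap at node 2, and discards the sparser four-node candidate. In an actual MapReduce deployment the map calls would run in parallel over partitions of the adjacency matrix.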


MapReduce; Clustering; Protein interaction network; Co-clustering; Functional coherence; Sub-network mining; Distributed computing; Big data



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. Department of Computer Science, Periyar University, Salem, India
