WSM: a novel algorithm for subgraph matching in large weighted graphs

Journal of Intelligent Information Systems

Abstract

Given a large weighted data graph and a smaller, similar weighted pattern graph, either directed or undirected, the weighted subgraph matching problem is to find a mapping from the nodes of the pattern graph to a subset of the nodes of the data graph that minimizes the sum of edge weight differences. Biological interaction networks such as protein-protein interaction networks and molecular pathways are often modeled as weighted graphs to account for the high false positive rate intrinsic to the detection of interactions. Complex biological problems such as disease gene prioritization and conserved phylogenetic tree construction depend largely on calculating the similarity among these networks. Although several existing methods measure graph and subgraph similarity efficiently, they produce nonintuitive results because they assume an unweighted graph model. Moreover, very few algorithms exist for weighted graph matching, and those that do apply only under the restriction that the data and pattern graphs are of equal size. In this paper, we introduce a novel algorithm for weighted subgraph matching that can be applied effectively to both directed and undirected graphs. Experimental results demonstrate the algorithm's superiority over, and scalability relative to, available state-of-the-art methods.
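
The objective stated in the abstract can be illustrated with a small brute-force sketch. This is not the WSM algorithm, whose details are not given on this page; it only enumerates every injective node mapping and scores it. Several choices here are assumptions made for illustration: the "edge weight difference" is taken as an absolute difference, a missing edge is treated as weight 0, the graphs are undirected, and the function names and the dictionary-of-frozensets representation are hypothetical.

```python
from itertools import permutations


def edge_weight(graph, u, v):
    """Weight of the undirected edge {u, v}; 0.0 if absent (an assumption)."""
    return graph.get(frozenset((u, v)), 0.0)


def matching_cost(pattern, data, mapping):
    """Sum of absolute edge weight differences under a node mapping."""
    nodes = list(mapping)
    cost = 0.0
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            cost += abs(edge_weight(pattern, u, v)
                        - edge_weight(data, mapping[u], mapping[v]))
    return cost


def brute_force_match(pattern_nodes, pattern, data_nodes, data):
    """Try every injective mapping; exponential, for tiny graphs only."""
    best_cost, best_mapping = float("inf"), None
    for image in permutations(data_nodes, len(pattern_nodes)):
        mapping = dict(zip(pattern_nodes, image))
        cost = matching_cost(pattern, data, mapping)
        if cost < best_cost:
            best_cost, best_mapping = cost, mapping
    return best_mapping, best_cost


# Toy data graph (a 4-cycle) and a 3-node weighted path as the pattern.
data = {
    frozenset(("a", "b")): 0.9,
    frozenset(("b", "c")): 0.5,
    frozenset(("c", "d")): 0.8,
    frozenset(("a", "d")): 0.2,
}
pattern = {
    frozenset(("x", "y")): 0.5,
    frozenset(("y", "z")): 0.8,
}

mapping, cost = brute_force_match(["x", "y", "z"], pattern,
                                  ["a", "b", "c", "d"], data)
print(mapping, cost)  # {'x': 'b', 'y': 'c', 'z': 'd'} 0.0
```

The search space grows factorially with graph size, which is why exhaustive enumeration is impractical for the large data graphs the abstract targets.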

Acknowledgement

This research was supported in part by National Science Foundation grant IIS 0612203.

Author information

Corresponding author

Correspondence to Hasan M. Jamil.

Additional information

This paper is dedicated to the loving memory of Anupam Bhattacharjee, the co-author and the driving force behind this paper, who passed away unexpectedly on September 6, 2010.

About this article

Cite this article

Bhattacharjee, A., Jamil, H.M. WSM: a novel algorithm for subgraph matching in large weighted graphs. J Intell Inf Syst 38, 767–784 (2012). https://doi.org/10.1007/s10844-011-0178-z

  • DOI: https://doi.org/10.1007/s10844-011-0178-z