Abstract
Graphs have been widely used in social networks to find interesting relationships between individuals. To mine the wealth of information in an attributed graph, effective and efficient graph matching methods are critical. However, because real graph data are noisy and incomplete, approximate graph matching is essential. Moreover, most users are interested only in the k most similar matches, which raises the problem of top-k similarity search in large attributed graphs. In this paper, we propose a novel technique to find the top-k similar subgraphs. To prune unpromising data nodes effectively, our indexing structure is built on node degrees and neighborhood connections. A novel method that combines graph structure and node attributes is then used to compute the similarity of matchings and find the top-k results. We integrate an adapted threshold algorithm (TA) into the procedure to further enhance the similar graph search. Extensive experiments are performed on a social graph to evaluate the effectiveness and efficiency of our methods.
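The abstract does not spell out how TA is adapted, so the following is only a minimal sketch of the classic threshold algorithm for top-k aggregation, not the authors' procedure. It assumes each candidate matching already has a structural score and an attribute score, that each score list is sorted in descending order, and that the overall similarity is a monotone aggregate (here a plain sum). The function name `threshold_topk` and the sample scores are hypothetical.

```python
import heapq


def threshold_topk(sorted_lists, k, aggregate=sum):
    """Classic threshold algorithm over several descending score lists.

    sorted_lists: list of lists of (candidate_id, score), each sorted by
    score in descending order. `aggregate` must be monotone (e.g. sum).
    Returns up to k (total_score, candidate_id) pairs, best first.
    """
    # Random-access tables: one dict per list for score lookups by id.
    lookup = [dict(lst) for lst in sorted_lists]
    seen = set()
    top = []  # min-heap holding the best k (total, candidate) pairs so far
    depth = max(len(lst) for lst in sorted_lists)

    for i in range(depth):
        last_scores = []
        for lst in sorted_lists:
            if i >= len(lst):
                # This list is exhausted; it contributes 0 to the threshold.
                last_scores.append(0.0)
                continue
            cand, score = lst[i]
            last_scores.append(score)
            if cand in seen:
                continue
            seen.add(cand)
            # Random access: fetch this candidate's score in every list.
            total = aggregate(table.get(cand, 0.0) for table in lookup)
            if len(top) < k:
                heapq.heappush(top, (total, cand))
            elif total > top[0][0]:
                heapq.heapreplace(top, (total, cand))
        # No unseen candidate can score above this threshold.
        threshold = aggregate(last_scores)
        if len(top) == k and top[0][0] >= threshold:
            break  # early termination: the top-k set is final

    return sorted(top, key=lambda t: (-t[0], t[1]))


# Hypothetical scores for four candidate matchings.
structural = [("a", 0.9), ("b", 0.8), ("c", 0.3), ("d", 0.1)]
attribute = [("b", 0.9), ("c", 0.7), ("a", 0.4), ("d", 0.2)]
result = threshold_topk([structural, attribute], k=2)
print([cand for _, cand in result])
```

In this toy run the scan stops after three rows of sorted access, because the weakest score in the current top-2 already meets the threshold; candidate "d" is never examined.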
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ding, X., Jia, J., Li, J., Liu, J., Jin, H. (2014). Top-k Similarity Matching in Large Graphs with Attributes. In: Bhowmick, S.S., Dyreson, C.E., Jensen, C.S., Lee, M.L., Muliantara, A., Thalheim, B. (eds) Database Systems for Advanced Applications. DASFAA 2014. Lecture Notes in Computer Science, vol 8422. Springer, Cham. https://doi.org/10.1007/978-3-319-05813-9_11
DOI: https://doi.org/10.1007/978-3-319-05813-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05812-2
Online ISBN: 978-3-319-05813-9
eBook Packages: Computer Science, Computer Science (R0)