Summarization Graph Indexing: Beyond Frequent Structure-Based Approach

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4947)

Abstract

Graphs are an important data structure for modeling complex structural data, such as chemical compounds, proteins, and XML documents. Among the many applications built on graph data, sub-graph search is a key problem: given a query Q, retrieve all graphs in the graph database that contain Q as a sub-graph. Most existing sub-graph search methods, such as [20,22,4,23], try to filter out as many false positives (graphs that cannot be in the result) as possible by indexing frequent sub-structures in the graph database. However, because they ignore the relationships between sub-structures, these methods still admit a high percentage of false positives. In this paper, we propose a novel concept, the Summarization Graph, which is a complete graph that captures most of the topology information of the original graph, such as sub-structures and their relationships. Based on Summarization Graphs, we convert the filtering problem into one of retrieving objects with set-valued attributes. Moreover, we build an efficient signature file-based index to speed up the filtering process. We prove theoretically that the pruning power of our method is greater than that of existing structure-based approaches. Finally, we show through extensive experiments on real and synthetic data sets that the candidate set generated by the Summarization Graph-based approach is only about 50% of the size of that left by existing graph indexing methods, and that the total response time of our method is reduced by a factor of 2 to 10.
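
To make the idea of signature-based filtering over set-valued attributes concrete, the following is a minimal sketch of a generic superimposed-coding signature filter. It is not the Summarization Graph construction from the paper: the feature labels, the signature width `SIG_BITS`, the number of hash positions `k`, and the helper names `signature`, `build_index`, and `filter_candidates` are all assumptions made for illustration. It only shows why such a filter can discard graphs cheaply without ever missing a true answer.

```python
import hashlib

SIG_BITS = 64  # assumed signature width; a real index would tune this


def signature(items, bits=SIG_BITS, k=2):
    """Superimposed coding: OR together k hashed bit positions per item."""
    sig = 0
    for item in items:
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        for i in range(k):
            # Take 4 bytes of the digest per hash position.
            pos = int.from_bytes(digest[4 * i:4 * i + 4], "big") % bits
            sig |= 1 << pos
    return sig


def build_index(graph_features):
    """Map each graph id to (signature, feature set).

    graph_features: dict of graph id -> iterable of hypothetical feature labels.
    """
    return {gid: (signature(feats), frozenset(feats))
            for gid, feats in graph_features.items()}


def filter_candidates(query_features, index):
    """Keep graphs whose signature covers every bit of the query signature.

    If the query's features are a subset of a graph's features, the query's
    bits are a subset of the graph's bits, so no true answer is dropped.
    Graphs that pass may still be false positives and need verification.
    """
    q_sig = signature(query_features)
    return [gid for gid, (sig, _) in index.items() if (sig & q_sig) == q_sig]


if __name__ == "__main__":
    # Hypothetical feature sets standing in for indexed sub-structures.
    db = {
        "g1": {"C-C", "C-O", "ring6"},
        "g2": {"C-C", "C-N"},
        "g3": {"C-O", "ring6", "C-C", "C-S"},
    }
    index = build_index(db)
    print(filter_candidates({"C-C", "C-O"}, index))
```

The candidates returned by `filter_candidates` would still have to be checked with an exact sub-graph isomorphism test; the filter only shrinks the set on which that expensive test runs.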

References

  1. Available at: http://amalfi.dis.unina.it/graph

  2. Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., Bourne, P.E.: Frequent subtree mining - an overview. Nucleic Acids Research 23(10) (2000)

  3. Cai, D., Shao, Z., He, X., Yan, X., Han, J.: Community mining from multi-relational networks. In: Jorge, A.M., Torgo, L., Brazdil, P.B., Camacho, R., Gama, J. (eds.) PKDD 2005. LNCS (LNAI), vol. 3721, Springer, Heidelberg (2005)

  4. Cheng, J., Ke, Y., Ng, W., Lu, A.: fg-index: Towards verification-free query processing on graph databases. In: SIGMOD (2007)

  5. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 2nd edn. MIT Press, Cambridge (2000)

  6. Williams, D.W., Huan, J., Wang, W.: Graph database indexing using structured graph decomposition. In: ICDE (2007)

  7. Fortin, S.: The graph isomorphism problem. Department of Computing Science, University of Alberta (1996)

  8. Jiang, H., Wang, H., Yu, P.S., Zhou, S.: GString: A novel approach for efficient search in graph databases. In: ICDE (2007)

  9. He, H., Singh, A.K.: Closure-tree: An index structure for graph queries. In: ICDE (2006)

  10. Inokuchi, A., Washio, T., Motoda, H.: An apriori-based algorithm for mining frequent substructures from graph data. In: Zighed, A.D.A., Komorowski, J., Żytkow, J.M. (eds.) PKDD 2000. LNCS (LNAI), vol. 1910, Springer, Heidelberg (2000)

  11. James, C.A., Weininger, D., Delany, J.: Daylight Theory Manual, Daylight version 4.82. Daylight Chemical Information Systems, Inc. (2003)

  12. Ke, Y., Cheng, J., Ng, W.: Correlation search in graph databases. In: SIGKDD (2007)

  13. Kuramochi, M., Karypis, G.: Frequent subgraph discovery. In: ICDM (2001)

  14. Kuramochi, M., Karypis, G.: Frequent subgraph discovery. In: ICDM (2001)

  15. Petrakis, E.G.M., Faloutsos, C.: Similarity searching in medical image databases. IEEE Transactions on Knowledge and Data Engineering 9(3) (1997)

  16. Shasha, D., Wang, J.T.-L., Giugno, R.: Algorithmics and applications of tree and graph searching. In: PODS (2002)

  17. Tousidou, E., Bozanis, P., Manolopoulos, Y.: Signature-based structures for objects with set-valued attributes. Inf. Syst. 27(2) (2002)

  18. Willett, P.: Chemical similarity searching. J. Chem. Inf. Comput. Sci. 38(6) (1998)

  19. Yan, X., Han, J.: gSpan: Graph-based substructure pattern mining. In: Proc. of Int. Conf. on Data Mining (2002)

  20. Yan, X., Yu, P.S., Han, J.: Graph indexing: A frequent structure-based approach. In: SIGMOD (2004)

  21. Zhang, N., Özsu, M.T., Ilyas, I.F., Aboulnaga, A.: Fix: Feature-based indexing technique for XML documents. In: VLDB (2006)

  22. Zhang, S., Hu, M., Yang, J.: Treepi: A novel graph indexing method. In: ICDE (2007)

  23. Zhao, P., Yu, J.X., Yu, P.S.: Graph indexing: Tree + Delta ≥ Graph. In: VLDB (2007)

Editor information

Jayant R. Haritsa, Ramamohanarao Kotagiri, Vikram Pudi

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zou, L., Chen, L., Zhang, H., Lu, Y., Lou, Q. (2008). Summarization Graph Indexing: Beyond Frequent Structure-Based Approach. In: Haritsa, J.R., Kotagiri, R., Pudi, V. (eds) Database Systems for Advanced Applications. DASFAA 2008. Lecture Notes in Computer Science, vol 4947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78568-2_13

  • DOI: https://doi.org/10.1007/978-3-540-78568-2_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78567-5

  • Online ISBN: 978-3-540-78568-2

  • eBook Packages: Computer Science, Computer Science (R0)
