Enhancing Graph Database Indexing by Suffix Tree Structure

  • Vincenzo Bonnici
  • Alfredo Ferro
  • Rosalba Giugno
  • Alfredo Pulvirenti
  • Dennis Shasha
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6282)

Abstract

Biomedical and chemical databases are large and growing rapidly. Graphs naturally model such data. To fully exploit the wealth of information in these graph databases, scientists require systems that search for all occurrences of a query graph. To make graph searching efficient, advanced methods for indexing, representing and matching graphs have been proposed.

This paper presents GraphGrepSX. The system implements efficient graph searching algorithms together with an advanced filtering technique.

GraphGrepSX is compared with SING, GraphFind, CTree and GCoding. Experiments show that GraphGrepSX outperforms the compared systems on a very large collection of molecular data. In particular, it reduces both the size and the construction time of the index for large databases, while outperforming the most popular systems.
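
The abstract does not spell out the index layout or the filtering rule. As an illustration only, the following is a minimal sketch of one common path-based filter in the GraphGrep style, assuming undirected labeled graphs stored as a label map plus adjacency lists. The names `label_paths`, `PathTrie`, `build_index`, `filter_candidates` and the parameter `max_len` are hypothetical and are not taken from the paper; the verification step (an exact subgraph isomorphism test on the surviving candidates) is omitted.

```python
from collections import defaultdict


def label_paths(labels, adj, max_len):
    """Count label sequences of all simple paths with at most max_len vertices."""
    counts = defaultdict(int)

    def dfs(v, visited, seq):
        counts[seq] += 1
        if len(seq) == max_len:
            return
        for w in adj[v]:
            if w not in visited:
                dfs(w, visited | {w}, seq + (labels[w],))

    for v in adj:
        dfs(v, {v}, (labels[v],))
    return counts


class PathTrie:
    """Trie over label sequences; each node stores per-graph occurrence counts."""

    def __init__(self):
        self.children = {}
        self.counts = {}  # graph id -> number of paths spelling this sequence

    def insert(self, seq, graph_id, n):
        node = self
        for label in seq:
            node = node.children.setdefault(label, PathTrie())
        node.counts[graph_id] = n

    def lookup(self, seq):
        node = self
        for label in seq:
            node = node.children.get(label)
            if node is None:
                return {}
        return node.counts


def build_index(database, max_len):
    index = PathTrie()
    for gid, (labels, adj) in database.items():
        for seq, n in label_paths(labels, adj, max_len).items():
            index.insert(seq, gid, n)
    return index


def filter_candidates(index, database, query, max_len):
    """Keep graphs that contain every query path at least as often as the query."""
    labels, adj = query
    candidates = set(database)
    for seq, n in label_paths(labels, adj, max_len).items():
        hits = index.lookup(seq)
        candidates = {g for g in candidates if hits.get(g, 0) >= n}
    return candidates


# Toy data (assumed for illustration): g1 is C-C-O, g2 is C-N.
database = {
    "g1": ({0: "C", 1: "C", 2: "O"}, {0: [1], 1: [0, 2], 2: [1]}),
    "g2": ({0: "C", 1: "N"}, {0: [1], 1: [0]}),
}
index = build_index(database, max_len=3)
query = ({0: "C", 1: "O"}, {0: [1], 1: [0]})  # C-O
candidates = filter_candidates(index, database, query, max_len=3)
```

The filter is safe in the sense that it never discards a true match: an injective embedding of the query maps each simple path of the query to a distinct simple path of the target with the same labels, so the target's counts cannot be smaller. Graphs that pass the filter would still need an exact matching step.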

Keywords

subgraph isomorphism, graph database search, indexing, suffix tree, molecular database

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Vincenzo Bonnici (1)
  • Alfredo Ferro (1)
  • Rosalba Giugno (1)
  • Alfredo Pulvirenti (1)
  • Dennis Shasha (2)
  1. Dipartimento di Matematica ed Informatica, Università di Catania, Catania, Italy
  2. Courant Institute of Mathematical Sciences, New York University, New York, USA
