Abstract
Proliferation of NoSQL and graph databases indicates a move towards alternate forms of data representation beyond the traditional relational data model. This raises the question of processing queries efficiently over these representations. Graphs have become one of the preferred ways to represent and store data related to social networks and other domains where relationships and their labels need to be captured explicitly. Currently, for querying graph databases, users have to either learn a new graph query language (e.g. Metaweb Query language or MQL [6]) for posing their queries or use customized searches of specific substructures [14]. Hence, there is a clear need for posing queries using the same representation as that of a graph database, generate and evaluate alternate plans, develop cost metrics for evaluating plans, and prune the search space to converge on a good plan that can be evaluated directly over the graph database.
In this paper, we propose an approach for effective evaluation of queries specified over graph databases. The proposed optimizer generates query plans systematically and evaluates them using appropriate cost metrics gleaned from the graph database. For the time being, a graph mining algorithm has been modified for evaluating a given query plan using constrained expansion. Relevant metadata pertaining to the graph database is collected and used for evaluating a query plan using a branch and bound algorithm. Experiments on different types of queries over two graph databases (Internet Movie Database or IMDB and DBLP) are performed to validate our approach. Experimental results show that the query plan generated by our system results in exploring significantly fewer portions of the graph as compared to any other query plan for the same query.
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsReferences
Batra, S., Tyagi, C.: Comparative analysis of relational and graph databases. Int. J. Soft Comput. Eng. (IJSCE) 2(2), 509–512 (2012)
Bollacker, K., Evans, C., Paritosh, P., Sturge, T., Taylor, J.: Freebase: a collaboratively created graph database for structuring human knowledge. In: SIGMOD 2008, pp. 1247–1250. ACM, New York (2008)
Cheng, J., Ke, Y., Ng, W., Lu, A.: Fg-index: towards verification-free query processing on graphdatabases. In: SIGMOD, pp. 857–872 (2007)
Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., New York (1979)
Giugno, R., Shasha, D.: Graphgrep: a fast and universal method for querying graphs. In: 16th International Conference on Pattern Recognition, ICPR 2002, Quebec, Canada, 11–15 August 2002, pp. 112–115 (2002)
Goyal, A.: QP-SUBDUE: Processing Queries Over Graph Databases. Master’s thesis, The University of Texas at Arlington, December 2015
Holder, L.B., Cook, D.J., Djoko, S.: Substucture discovery in the SUBDUE system. In: Knowledge Discovery and Data Mining, pp. 169–180 (1994)
Holzschuher, F., Peinl, R.: Performance of graph query languages: comparison of cypher, gremlinand native access in neo4j. In: Proceedings of the Joint EDBT/ICDT 2013 Workshops, pp. 195–204. ACM (2013)
Jarke, M., Koch, J.: Query optimization in database systems. ACM Comput. Surv. (CsUR) 16(2), 111–152 (1984)
Suri, S., Vassilvitskii, S.: Counting triangles and the curse of the last reducer. In: World Wide Web Conference Series, pp. 607–614 (2011)
Tian, Y., Patel, J.M.: TALE: a tool for approximate large graph matching. In: ICDE 2008, 7–12 April 2008, Cancún, México (2008)
Tong, H., Faloutsos, C., Gallagher, B., Eliassi-Rad, T.: Fast best-effort pattern matching in large attributed graphs. In: SIGKDD, pp. 737–746 (2007)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Das, S., Goyal, A., Chakravarthy, S. (2016). Plan Before You Execute: A Cost-Based Query Optimizer for Attributed Graph Databases. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2016. Lecture Notes in Computer Science(), vol 9829. Springer, Cham. https://doi.org/10.1007/978-3-319-43946-4_21
Download citation
DOI: https://doi.org/10.1007/978-3-319-43946-4_21
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43945-7
Online ISBN: 978-3-319-43946-4
eBook Packages: Computer ScienceComputer Science (R0)