Abstract
Big Data scenarios often involve massive collections of nested data objects, typically referred to as “documents.” The challenges of document management at web scale have stimulated a recent trend towards the development of document-centric “NoSQL” data stores. Many query tasks naturally involve reasoning over data residing across NoSQL and relational “SQL” databases. Having data divided over separate stores currently implies labor-intensive manual work for data consumers. In this paper, we propose a general framework to seamlessly bridge the gap between SQL and NoSQL. In our framework, documents are logically incorporated in the relational store, and querying is performed via a novel NoSQL query pattern extension to the SQL language. These patterns allow the user to describe conditions on the document-centric data, while the rest of the SQL query refers to the corresponding NoSQL data via variable bindings. We give an effective solution for translating the user query to an equivalent pure SQL query, and present optimization strategies for query processing. We have implemented a prototype of our framework using PostgreSQL and MongoDB and have performed an extensive empirical analysis. Our study shows that the framework is practically feasible and that seamless, coordinated query processing over relational and document-centric data stores is possible.
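To make the idea of logically incorporating documents in the relational store more concrete, the following is a minimal sketch and not the paper's implementation. It assumes one possible encoding, in which each nested document is flattened into (id, key, value) triples with dotted key paths, and it uses an in-memory SQLite database in place of PostgreSQL and MongoDB. The table names `customers` and `doc_triples`, the helper `flatten`, and the sample data are all hypothetical. Under these assumptions, a pattern that binds the document fields `user` and `order.item` to variables can be expressed in pure SQL as one copy of the triple relation per key, joined on the document id.

```python
import sqlite3


def flatten(doc_id, doc, prefix=""):
    """Yield (id, key, value) triples for a nested dict, using dotted key paths.

    This encoding is an assumption made for illustration only.
    """
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, extending the key path.
            yield from flatten(doc_id, value, prefix=path + ".")
        else:
            # Store every leaf value as text to keep the sketch simple.
            yield (doc_id, path, str(value))


conn = sqlite3.connect(":memory:")

# Hypothetical relational data.
conn.execute("CREATE TABLE customers (name TEXT PRIMARY KEY, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [("ann", "NL"), ("bob", "UK")]
)

# Hypothetical document data, stored logically as triples.
conn.execute("CREATE TABLE doc_triples (id INTEGER, key TEXT, value TEXT)")
documents = [
    {"user": "ann", "order": {"item": "book", "qty": 2}},
    {"user": "bob", "order": {"item": "pen", "qty": 5}},
]
for doc_id, doc in enumerate(documents):
    conn.executemany(
        "INSERT INTO doc_triples VALUES (?, ?, ?)", flatten(doc_id, doc)
    )

# Pure SQL form of a pattern binding `user` and `order.item`:
# one alias of doc_triples per key, joined on the document id, with the
# `user` binding linked to the relational column customers.name.
rows = conn.execute(
    """
    SELECT c.name, c.country, t_item.value
    FROM customers AS c
    JOIN doc_triples AS t_user
      ON t_user.key = 'user' AND t_user.value = c.name
    JOIN doc_triples AS t_item
      ON t_item.key = 'order.item' AND t_item.id = t_user.id
    WHERE c.country = 'NL'
    """
).fetchall()

print(rows)
```

The self-join per key is what makes such a translation straightforward but potentially expensive, which is one reason optimization strategies for query processing matter in this setting.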
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Roijackers, J., Fletcher, G.H.L. (2013). On Bridging Relational and Document-Centric Data Stores. In: Gottlob, G., Grasso, G., Olteanu, D., Schallhart, C. (eds) Big Data. BNCOD 2013. Lecture Notes in Computer Science, vol 7968. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39467-6_14
DOI: https://doi.org/10.1007/978-3-642-39467-6_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39466-9
Online ISBN: 978-3-642-39467-6
eBook Packages: Computer Science, Computer Science (R0)