On Bridging Relational and Document-Centric Data Stores

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7968)

Abstract

Big Data scenarios often involve massive collections of nested data objects, typically referred to as “documents.” The challenges of document management at web scale have stimulated a recent trend towards the development of document-centric “NoSQL” data stores. Many query tasks naturally involve reasoning over data residing across NoSQL and relational “SQL” databases. Having data divided over separate stores currently implies labor-intensive manual work for data consumers. In this paper, we propose a general framework to seamlessly bridge the gap between SQL and NoSQL. In our framework, documents are logically incorporated in the relational store, and querying is performed via a novel NoSQL query pattern extension to the SQL language. These patterns allow the user to describe conditions on the document-centric data, while the rest of the SQL query refers to the corresponding NoSQL data via variable bindings. We give an effective solution for translating the user query to an equivalent pure SQL query, and present optimization strategies for query processing. We have implemented a prototype of our framework using PostgreSQL and MongoDB and have performed an extensive empirical analysis. Our study shows the practical feasibility of our framework, proving the possibility of seamless coordinated query processing over relational and document-centric data stores.
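As a rough illustration of the idea in the abstract, the following is a minimal sketch of documents being logically incorporated into a relational store and a document pattern being rewritten into pure SQL. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the `doc_triples` table, the `flatten` and `pattern_to_sql` helpers, the dictionary-based pattern notation with `?var` bindings, and the sample data are all hypothetical, and SQLite (via Python's standard `sqlite3` module) stands in for the PostgreSQL and MongoDB pair used in the prototype.

```python
import sqlite3


def flatten(doc_id, doc, prefix=""):
    """Yield (doc_id, path, value) rows for every leaf of a nested dict."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(doc_id, value, path)
        else:
            yield (doc_id, path, value)


def pattern_to_sql(pattern):
    """Rewrite a hypothetical pattern {path: constant or '?var'} into plain SQL.

    Each path gets its own alias of doc_triples; aliases are joined on doc_id.
    Constants become equality conditions, '?var' terms become output columns.
    """
    selects, tables, wheres, params = [], [], [], []
    for i, (path, term) in enumerate(pattern.items()):
        alias = f"t{i}"
        tables.append(f"doc_triples AS {alias}")
        wheres.append(f"{alias}.path = ?")
        params.append(path)
        if i > 0:
            wheres.append(f"{alias}.doc_id = t0.doc_id")
        if isinstance(term, str) and term.startswith("?"):
            selects.append(f"{alias}.value AS {term[1:]}")
        else:
            wheres.append(f"{alias}.value = ?")
            params.append(term)
    sql = (
        f"SELECT {', '.join(selects)} FROM {', '.join(tables)} "
        f"WHERE {' AND '.join(wheres)}"
    )
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE doc_triples (doc_id INTEGER, path TEXT, value)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Bob")])

# Sample nested documents (made up for this sketch).
orders = [
    {"customer": {"id": 1}, "shipping": {"city": "Oxford"}, "total": 40},
    {"customer": {"id": 2}, "shipping": {"city": "Leeds"}, "total": 15},
    {"customer": {"id": 1}, "shipping": {"city": "Oxford"}, "total": 25},
]
for doc_id, doc in enumerate(orders):
    conn.executemany(
        "INSERT INTO doc_triples VALUES (?, ?, ?)", flatten(doc_id, doc)
    )

# The pattern states conditions on the documents and binds two variables;
# the surrounding SQL refers to those bindings as ordinary columns.
pattern = {"shipping.city": "Oxford", "customer.id": "?cid", "total": "?total"}
sub_sql, params = pattern_to_sql(pattern)
query = (
    "SELECT c.name, p.total FROM customers AS c "
    f"JOIN ({sub_sql}) AS p ON p.cid = c.id ORDER BY p.total"
)
rows = conn.execute(query, params).fetchall()
print(rows)
```

In this sketch the pattern is expanded into self-joins over a single triple table, so the combined query contains nothing beyond standard SQL. The sketch assumes the pattern binds at least one variable and ignores arrays; it says nothing about the optimization strategies the paper describes.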

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Roijackers, J., Fletcher, G.H.L. (2013). On Bridging Relational and Document-Centric Data Stores. In: Gottlob, G., Grasso, G., Olteanu, D., Schallhart, C. (eds) Big Data. BNCOD 2013. Lecture Notes in Computer Science, vol 7968. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39467-6_14

  • DOI: https://doi.org/10.1007/978-3-642-39467-6_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39466-9

  • Online ISBN: 978-3-642-39467-6

  • eBook Packages: Computer Science, Computer Science (R0)
