UDBMS: Road to Unification for Multi-model Data Management

  • Jiaheng Lu
  • Zhen Hua Liu
  • Pengfei Xu
  • Chao Zhang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11158)

Abstract

One of the greatest challenges in big data management is the “Variety” of the data. Data may come in various types and formats: structured, semi-structured, and unstructured. For instance, data can be modeled with relational, key-value, and graph models. A single data platform that manages both well-structured data and NoSQL data benefits users, because it significantly reduces integration, migration, development, maintenance, and operational issues. A challenging research problem is therefore how to develop an efficient, consolidated data management platform that covers both NoSQL and relational data, reducing integration issues, simplifying operations, and eliminating migration issues. In this paper, we envision novel principles and technologies for handling multiple data models in one unified database system, including model-agnostic storage, unified query processing and indexes, in-memory structures, and multi-model transactions. We discuss our vision and present the research challenges that need to be addressed.

Notes

Acknowledgment

Contact email: Jiaheng.Lu@helsinki.fi. This work is partially supported by the Academy of Finland (Project No. 310321).

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Jiaheng Lu (1)
  • Zhen Hua Liu (2)
  • Pengfei Xu (1)
  • Chao Zhang (1)

  1. University of Helsinki, Helsinki, Finland
  2. Oracle, Redwood Shores, USA
