Towards Efficient Multi-domain Data Processing

  • Johannes Luong
  • Dirk Habich
  • Thomas Kissinger
  • Wolfgang Lehner
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 737)

Abstract

Industry and research increasingly depend on the timely analysis of large datasets to guide decision making. Complex analyses often involve a rich variety of data types and special-purpose processing models. We believe that the database system of the future will use compilation techniques to translate specialized, abstract high-level programming models into scalable low-level operations on efficient physical data formats. We currently envision optimized relational and linear algebra languages, a flexible data flow language (inspired by the programming models of popular data flow engines such as Apache Spark (spark.apache.org) or Apache Flink (flink.apache.org)), and scalable physical operators and formats for relational and array data types. In this article, we propose a database system architecture designed around these ideas and introduce our prototype implementation of that architecture.
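To make the idea of compiling a high-level model into low-level operations concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical three-node relational plan (Scan, Filter, Project) and a hypothetical compile_plan function that turns such a plan into a single Python function over in-memory tables. All names and the sample data are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List, Tuple

Row = Dict[str, object]
Database = Dict[str, List[Row]]


# Hypothetical logical plan nodes for a tiny relational language.
@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    child: object
    predicate: Callable[[Row], bool]


@dataclass(frozen=True)
class Project:
    child: object
    columns: Tuple[str, ...]


def compile_plan(plan) -> Callable[[Database], Iterator[Row]]:
    """Translate a logical plan into one function that streams result rows."""
    if isinstance(plan, Scan):
        return lambda db: iter(db[plan.table])
    if isinstance(plan, Filter):
        source = compile_plan(plan.child)
        return lambda db: (row for row in source(db) if plan.predicate(row))
    if isinstance(plan, Project):
        source = compile_plan(plan.child)
        return lambda db: ({c: row[c] for c in plan.columns} for row in source(db))
    raise TypeError(f"unknown plan node: {plan!r}")


# Made-up sample data, used only to exercise the sketch.
db = {
    "readings": [
        {"turbine": "A", "power": 1.5},
        {"turbine": "B", "power": 0.2},
        {"turbine": "C", "power": 2.1},
    ]
}

# High-level description of the query ...
plan = Project(Filter(Scan("readings"), lambda r: r["power"] > 1.0), ("turbine",))
# ... compiled once into a low-level pipeline, then executed.
run = compile_plan(plan)
result = list(run(db))
```

The plan is only a description of the query. Compilation happens once, and the resulting function can be run against any database with the same table layout. A real system of the kind the abstract envisions would presumably generate optimized native operators rather than Python closures.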

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Johannes Luong (1)
  • Dirk Habich (1)
  • Thomas Kissinger (1)
  • Wolfgang Lehner (1)
  1. Database Technology Group, Technische Universität Dresden, Dresden, Germany
