MPP SQL Query Optimization with RTCG

  • K. T. Sridhar (email author)
  • M. A. Sakkeer
  • Shiju Andrews
  • Jimson Johnson
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11297)

Abstract

The analytics database dbX is a cloud-agnostic MPP SQL product with both DSM and NSM stores. Runtime code generation (RTCG) with JIT compilation is one technique for better micro optimization of SQL query processing. We propose an RTCG model that is both query aware and hardware conscious, extending analytics SQL query processing to a high degree of intra-query parallelism. At the system level, our approach to RTCG aims to maximize the benefit from modern hardware; at the use level, it focuses on typical industry SQL, which differs somewhat from standard benchmarks. We describe the model, highlighting its novel aspects, the techniques implemented, and the product engineering decisions in dbX. To evaluate the efficacy of the RTCG model, we run experiments on desktop and cloud clusters, with standard and synthetic benchmarks, on data that is more commensurate in size with industry applications.

Keywords

SQL query processing · Micro optimization · RTCG · JIT

Notes

Acknowledgment

We thank several people. At Bangalore: Pramod Sahu for testing JIT modules and SQL code, and Dipanjan Deb and Prajeesh for operational cloud support. At Schaumburg: Jim Benbow for dbX cloud deployment scripts.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • K. T. Sridhar (1, 2), email author
  • M. A. Sakkeer (1)
  • Shiju Andrews (1)
  • Jimson Johnson (1)

  1. XtremeData Technologies, Bangalore, India
  2. XtremeData, Inc., Schaumburg, USA