
MPP SQL Query Optimization with RTCG

  • Conference paper
Big Data Analytics (BDA 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11297)


Abstract

The analytics database dbX is a cloud-agnostic MPP SQL product with both DSM and NSM stores. One technique for better micro-optimization of SQL query processing is runtime code generation (RTCG) with JIT compilation. We propose an RTCG model that is both query aware and hardware conscious, extending analytics SQL query processing to a high degree of intra-query parallelism. At the system level, our approach to RTCG aims to maximize the benefits of modern hardware; at the use level, it focuses on typical industry-style SQL, which differs somewhat from standard benchmarks. We describe the model, highlighting its novel aspects, the techniques implemented, and the product engineering decisions made in dbX. To evaluate the efficacy of the RTCG model, we perform experiments on desktop and cloud clusters, with standard and synthetic benchmarks, on data that is more commensurate in size with industry applications.
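The general idea of runtime code generation mentioned in the abstract can be illustrated with a minimal sketch. It is not the dbX implementation, which this page does not describe; all names here (`interpret_filter`, `generate_filter`, the `price` column) are hypothetical, and Python's built-in `compile` and `exec` stand in for a real JIT compiler. The sketch contrasts a generic, interpreted predicate with code generated for one specific query predicate.

```python
# Hypothetical illustration of query-aware code generation; not dbX code.

def interpret_filter(rows, column, op, constant):
    """Generic evaluation: the operator is looked up and dispatched per row."""
    ops = {
        "<": lambda a, b: a < b,
        ">": lambda a, b: a > b,
        "=": lambda a, b: a == b,
    }
    return [r for r in rows if ops[op](r[column], constant)]


def generate_filter(column, op, constant):
    """Emit source specialized to one predicate, then compile it at runtime."""
    py_op = {"<": "<", ">": ">", "=": "=="}[op]
    # The column name and constant are baked into the generated source,
    # so the compiled function performs no per-row operator dispatch.
    source = (
        "def specialized(rows):\n"
        f"    return [r for r in rows if r[{column!r}] {py_op} {constant!r}]\n"
    )
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["specialized"]


rows = [{"price": 5}, {"price": 12}, {"price": 30}]
fast_filter = generate_filter("price", ">", 10)

# Both paths must agree; only the evaluation strategy differs.
assert fast_filter(rows) == interpret_filter(rows, "price", ">", 10)
```

A production system would typically emit native code for a whole query plan rather than one predicate; the sketch only shows the specialization step in isolation.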



Acknowledgment

We thank several people. At Bangalore: Pramod Sahu for testing JIT modules and SQL code, and Dipanjan Deb and Prajeesh for operational cloud support. At Schaumburg: Jim Benbow for dbX cloud deployment scripts.

Author information


Corresponding author

Correspondence to K. T. Sridhar.


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Sridhar, K.T., Sakkeer, M.A., Andrews, S., Johnson, J. (2018). MPP SQL Query Optimization with RTCG. In: Mondal, A., Gupta, H., Srivastava, J., Reddy, P., Somayajulu, D. (eds) Big Data Analytics. BDA 2018. Lecture Notes in Computer Science, vol 11297. Springer, Cham. https://doi.org/10.1007/978-3-030-04780-1_16


  • DOI: https://doi.org/10.1007/978-3-030-04780-1_16


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04779-5

  • Online ISBN: 978-3-030-04780-1

  • eBook Packages: Computer Science, Computer Science (R0)
