Query Optimization on Hybrid Storage

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10177)

Abstract

Thanks to the rapid growth of memory capacity, it is now feasible to perform query processing entirely in memory. Nevertheless, because main memory is substantially more expensive than most secondary storage devices, including HDDs and SSDs, it is not suitable for storing cold data. Therefore, hybrid data storage composed of both memory and secondary storage is expected to remain popular for the foreseeable future. In this paper, we introduce a query optimization model for hybrid data storage. Unlike traditional query processors, which treat either main memory as a cache or secondary storage as an anti-cache, our model performs semantic data partitioning between memory and secondary storage. Query optimization can thus take the partitioning of data into account to achieve better performance. We conducted an experimental evaluation on a columnar query engine to demonstrate the advantage of the proposed approach.
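
To make the idea of partition-aware planning concrete, the following is a minimal sketch in Python. It is not the paper's algorithm: it assumes, purely for illustration, that the semantic partition is a single half-open range over one numeric column, and the names `Range`, `plan_scan`, `"memory"`, and `"disk"` are hypothetical. Given the range a query asks for and the range known to reside in memory, the planner splits the scan so that secondary storage is touched only for the part that memory cannot answer.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Range:
    """Half-open interval [lo, hi) over one numeric column (illustrative)."""
    lo: float
    hi: float

    def intersect(self, other: "Range") -> Optional["Range"]:
        # Overlap of two intervals, or None when they are disjoint.
        lo, hi = max(self.lo, other.lo), min(self.hi, other.hi)
        return Range(lo, hi) if lo < hi else None


def plan_scan(query: Range, hot: Range) -> List[Tuple[str, Range]]:
    """Split a range scan into per-tier steps.

    `hot` is the (assumed) predicate describing rows held in memory;
    all other rows are assumed to live on secondary storage.
    """
    overlap = query.intersect(hot)
    if overlap is None:
        # Nothing the query needs is in memory.
        return [("disk", query)]

    steps: List[Tuple[str, Range]] = []
    if query.lo < overlap.lo:
        # Part of the query below the in-memory range.
        steps.append(("disk", Range(query.lo, overlap.lo)))
    steps.append(("memory", overlap))
    if overlap.hi < query.hi:
        # Part of the query above the in-memory range.
        steps.append(("disk", Range(overlap.hi, query.hi)))
    return steps


if __name__ == "__main__":
    hot = Range(100, 200)
    for q in (Range(120, 150), Range(50, 150), Range(0, 50)):
        print(q, "->", plan_scan(q, hot))
```

Because the planner knows what the in-memory partition contains, a query that falls entirely inside it is planned with no secondary-storage step at all, which a plain cache cannot guarantee without first probing for misses.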

Acknowledgement

This work is partially supported by the Chinese National High-tech R&D Program (863 Program) (2015AA015307) and the NSFC Project (No. 61272138).

Corresponding author

Correspondence to Xuan Zhou.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Yu, A., Meng, Q., Zhou, X., Shen, B., Zhang, Y. (2017). Query Optimization on Hybrid Storage. In: Candan, S., Chen, L., Pedersen, T., Chang, L., Hua, W. (eds) Database Systems for Advanced Applications. DASFAA 2017. Lecture Notes in Computer Science, vol 10177. Springer, Cham. https://doi.org/10.1007/978-3-319-55753-3_23

  • DOI: https://doi.org/10.1007/978-3-319-55753-3_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-55752-6

  • Online ISBN: 978-3-319-55753-3

  • eBook Packages: Computer Science, Computer Science (R0)
