
Split Dictionaries for In-memory Column Stores in Mixed Workload Environments

  • David Schwalb
  • Markus Dreseler
  • Martin Faust
  • Johannes Wust
  • Hasso Plattner
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8506)

Abstract

Columnar in-memory databases use dictionary encoding as a compression technique, replacing long and frequently occurring values with short integers. Sorted dictionaries allow for more efficient query processing because comparisons can be performed directly on the compressed data, whereas unsorted dictionaries are faster when inserting new values.
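The trade-off between sorted and unsorted dictionaries can be illustrated with a minimal Python sketch. It is a generic illustration of dictionary encoding, not the implementation evaluated in the paper; the function names (`encode_sorted`, `range_query_sorted`, `insert_unsorted`) and the sample column are assumptions made for this example.

```python
from bisect import bisect_left, bisect_right


def encode_sorted(values):
    """Build a sorted dictionary and the attribute vector of value IDs."""
    dictionary = sorted(set(values))
    ids = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [ids[value] for value in values]


def range_query_sorted(dictionary, attribute_vector, low, high):
    """Return row positions whose value lies in [low, high].

    Because the dictionary is sorted, the value range maps to one
    contiguous range of IDs, so the scan compares integers only.
    """
    lo_id = bisect_left(dictionary, low)
    hi_id = bisect_right(dictionary, high)
    return [row for row, vid in enumerate(attribute_vector)
            if lo_id <= vid < hi_id]


def insert_unsorted(dictionary, index, attribute_vector, value):
    """Append a value to an unsorted dictionary.

    New values get the next free ID, so existing IDs never change and
    the attribute vector does not need to be rewritten.
    """
    vid = index.get(value)
    if vid is None:
        vid = len(dictionary)
        dictionary.append(value)
        index[value] = vid
    attribute_vector.append(vid)
    return vid


# Hypothetical sample column used only for illustration.
column = ["delta", "alpha", "charlie", "alpha", "bravo"]

sorted_dict, sorted_av = encode_sorted(column)
matches = range_query_sorted(sorted_dict, sorted_av, "alpha", "bravo")

unsorted_dict, unsorted_index, unsorted_av = [], {}, []
for value in column:
    insert_unsorted(unsorted_dict, unsorted_index, unsorted_av, value)
```

In this sketch, inserting a new value into the sorted variant would require re-sorting the dictionary and re-encoding the attribute vector, while the unsorted variant only appends; conversely, a range query on the unsorted variant cannot be reduced to a single ID-range comparison.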

In this work, we propose a new type of dictionary compression called Split Dictionaries. These organize their values in fixed-sized splits, enabling fast inserts and comparable query performance while significantly reducing maintenance costs. We present a detailed performance analysis regarding inserts, range queries, and the merge process as well as a memory usage model. We argue that adjusting the dictionary size allows for a more balanced trade-off especially in mixed workload environments.

Keywords

Query Processing, Range Query, Attribute Vector, Auxiliary Structure, Reduce Maintenance Cost
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • David Schwalb (1)
  • Markus Dreseler (1)
  • Martin Faust (1)
  • Johannes Wust (1)
  • Hasso Plattner (1)

  1. Hasso Plattner Institute, Potsdam, Germany
