Exploiting Intelligent Memory for Database Workloads
Increasing transistor integration on a single chip has enabled emerging technologies such as the merging of memory and logic. The resulting chips, known as Intelligent Memory, offer increased bandwidth and reduced latency between computation and memory.
In this work, we focus on exploiting the features of a proposed Intelligent Memory chip, FlexRAM, for database workloads. To achieve this goal, we developed FlexDB, a simple DBMS prototype that includes modified parallel algorithms, an efficient data redistribution algorithm, and simple mathematical models for query optimization.
We tested FlexDB using three queries from the TPC-H benchmark on a simulated system configured with FlexRAM chips that include up to 64 processing elements and enough total memory to hold the whole database. Compared to a single-processor system, a system with a single FlexRAM chip achieves speedups ranging from 4 to 92. These results scale as more FlexRAM chips are added to the system. Compared to a shared-memory multiprocessor, our approach achieves speedups between 4 and an order of magnitude for two of the three queries. We conclude that commercial workloads may benefit significantly from Intelligent Memory chips such as FlexRAM.
Keywords: Processing Element, Load Imbalance, Superscalar Processor, Query Execution Plan, Memory Processor