Modeling and Analyzing of 3D DRAM as L3 Cache Based on DRAMSim2

  • Litiao Qiu
  • Lei Wang
  • Qiang Dou
  • Zhenyu Zhao
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 592)


A memory system with a die-stacked DRAM L3 cache is a promising way to break the Memory Wall and improve performance. To further optimize the existing memory system, this paper models and analyzes a 3D DRAM used as an L3 cache, based on the DRAMSim2 simulator. To use an on-die DRAM as a cache, tags and data are stored together in one DRAM row, and the design exploits the wider bus and higher density of 3D DRAM. The cache modeling platform is evaluated with SPEC2000 traces generated by gem5, which simulate the memory access behavior of a core. With the DRAM L3 cache, all test traces show improved performance: read operations achieve an average speedup of 1.82× over the baseline memory system, and write operations 6.38×. Throughput improves by up to 1.45× over the baseline system.
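The idea of keeping tags and data in the same DRAM row can be illustrated with a minimal sketch. The abstract does not give the row size, block size, or tag width, so the values below (2048-byte rows, 64-byte blocks, 6 bytes of tag and state per block) are assumptions chosen only for illustration, and `RowLayout` and `decode` are hypothetical names, not part of DRAMSim2 or the paper's platform.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RowLayout:
    """Hypothetical layout of one DRAM row holding both tags and data."""
    row_bytes: int = 2048   # assumed DRAM row (page) size
    block_bytes: int = 64   # assumed cache block size
    tag_bytes: int = 6      # assumed tag + state metadata per block

    def ways(self) -> int:
        """Largest number of data blocks whose tags also fit in the same row."""
        slots = self.row_bytes // self.block_bytes
        for w in range(slots, 0, -1):
            # Tags are packed into whole block-sized slots (ceiling division).
            tag_slots = -(-w * self.tag_bytes // self.block_bytes)
            if w + tag_slots <= slots:
                return w
        return 0


def decode(addr: int, num_rows: int,
           layout: RowLayout = RowLayout()) -> Tuple[int, int, int]:
    """Split a physical address into (row index, tag, byte offset).

    Each DRAM row is treated as one cache set, so a single row
    activation exposes both the tags and the candidate data blocks.
    """
    block = addr // layout.block_bytes
    offset = addr % layout.block_bytes
    row = block % num_rows
    tag = block // num_rows
    return row, tag, offset
```

Under these assumed sizes, a row holds 29 data blocks plus 3 block-sized slots of tags, so one row activation serves both the tag check and the data access.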


Keywords: DRAM cache, 3D, Die-stacking, Modeling



This work was supported by the National Natural Science Foundation of China (61402501, 61272139).



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. School of Computer, National University of Defense Technology, Changsha, China
