Processing-in-Memory: Exploring the Design Space

  • Marko Scrbak
  • Mahzabeen Islam
  • Krishna M. Kavi
  • Mike Ignatowski
  • Nuwan Jayasena
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9017)


With the emergence of 3D-DRAM, Processing-in-Memory (PIM) has once again become of great interest to the research community and industry. In this paper, we present our observations on a subset of the PIM design space. We show how the architectural choices of PIM core frequency and cache sizes affect overall power consumption and energy efficiency. Our findings include detailed power consumption modeling for an ARM-like core used as a PIM core. We show the maximum number of PIM cores that can be placed in the logic layer within a given power budget. In addition, we explore the optimal design choices for the number of cores as a function of frequency, utilization, and energy efficiency.
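The power-budget question raised in the abstract can be illustrated with a minimal sketch. It assumes the textbook dynamic-power relation P = C·V²·f plus a fixed static term per core, which is not necessarily the model used in the paper. The function names and every numeric value (switched capacitance, voltage, static power, budget) are hypothetical placeholders, not figures from the paper.

```python
import math

def core_power_watts(freq_hz, voltage_v, switched_cap_f, static_w):
    """Textbook per-core power: dynamic C*V^2*f plus a fixed static term."""
    return switched_cap_f * voltage_v ** 2 * freq_hz + static_w

def max_cores(power_budget_w, per_core_w):
    """Largest whole number of cores whose combined power fits the budget."""
    if per_core_w <= 0:
        raise ValueError("per-core power must be positive")
    return math.floor(power_budget_w / per_core_w)

# Hypothetical placeholder values, chosen only to exercise the functions.
# Voltage is held fixed across frequencies, which is a simplification.
BUDGET_W = 10.0
for freq_ghz in (0.5, 1.0, 2.0):
    p = core_power_watts(freq_ghz * 1e9, voltage_v=1.0,
                         switched_cap_f=0.2e-9, static_w=0.06)
    print(f"{freq_ghz} GHz: {p:.2f} W per core, "
          f"{max_cores(BUDGET_W, p)} cores within {BUDGET_W} W")
```

In practice supply voltage usually rises with frequency, so per-core power would grow faster than linearly and the core count would fall off more steeply than this sketch suggests.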


Keywords: Processing-in-memory, 3D-DRAM, Big data, MapReduce





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Marko Scrbak (1)
  • Mahzabeen Islam (1)
  • Krishna M. Kavi (1, corresponding author)
  • Mike Ignatowski (2)
  • Nuwan Jayasena (2)

  1. University of North Texas, Denton, USA
  2. AMD Research - Advanced Micro Devices, Inc., Sunnyvale, USA
