On Automated Feedback-Driven Data Placement in Multi-tiered Memory

  • T. Chad Effler
  • Adam P. Howard
  • Tong Zhou
  • Michael R. Jantz
  • Kshitij A. Doshi
  • Prasad A. Kulkarni
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10793)

Abstract

Recent emergence of systems with multiple performance and capacity tiers of memory invites a fresh consideration of strategies for optimal placement of data into the various tiers. This work explores a variety of cross-layer strategies for managing application data in multi-tiered memory. We propose new profiling techniques based on the automatic classification of program allocation sites, with the goal of using those classifications to guide memory tier assignments. We evaluate our approach with different profiling inputs and application strategies, and show that it outperforms other state-of-the-art management techniques.

Notes

Acknowledgements

This research is supported in part by the National Science Foundation under CCF-1619140, CCF-1617954, and CNS-1464288, as well as a grant from the Software and Services Group (SSG) at Intel®.

References

  1. 1.
  2. 2.
    Mittal, S., Vetter, J.S.: A survey of techniques for architecting DRAM caches. IEEE Trans. Parallel Distrib. Syst. 27(6), 1852–1863 (2016)CrossRefGoogle Scholar
  3. 3.
    Meswani, M., Blagodurov, S., Roberts, D., Slice, J., Ignatowski, M., Loh, G.: Heterogeneous memory architectures: a HW/SW approach for mixing die-stacked and off-package memories. In: HPCA, 2015 (February 2015)Google Scholar
  4. 4.
    Li, Y., Ghose, S., Choi, J., Sun, J., Wang, H., Mutlu, O.: Utility-based hybrid memory management. In: IEEE CLUSTER (September 2017)Google Scholar
  5. 5.
    Cantalupo, C., Venkatesan, V., Hammond, J.R.: User extensible heap manager for heterogeneous memory platforms and mixed memory policies (2015). http://memkind.github.io/memkind/memkind_arch_20150318.pdf
  6. 6.
    Dulloor, S.R., et al.: Data tiering in heterogeneous memory systems. In: Eleventh European Conference on Computer Systems, p. 15. ACM (2016)Google Scholar
  7. 7.
    Agarwal, N., et al.: Page placement strategies for GPUs within heterogeneous memory systems. SIGPLAN Not. 50(4), 607–618 (2015)MathSciNetCrossRefGoogle Scholar
  8. 8.
    Luk, C.K., et al.: Pin: building customized program analysis tools with dynamic instrumentation. SIGPLAN Not. 40(6), 190–200 (2005)CrossRefGoogle Scholar
  9. 9.
    Evans, J.: A scalable concurrent malloc (3) implementation for FreeBSD (2006)Google Scholar
  10. 10.
    Kim, Y., Yang, W., Mutlu, O.: Ramulator: a fast and extensible DRAM simulator. IEEE Comput. Archit. Lett. 15(1), 45–49 (2016).  https://doi.org/10.1109/LCA.2015.2414456 CrossRefGoogle Scholar
  11. 11.
    Giardino, M., Doshi, K., Ferri, B.H.: Soft2LM: application guided heterogeneous memory management. In: IEEE International Conference on Networking, Architecture and Storage (NAS), USA, pp. 1–10 (2016)Google Scholar
  12. 12.
    Agarwal, N., Wenisch, T.F.: Thermostat: application-transparent page management for two-tiered main memory. In: ASPLOS. ASPLOS 2017, pp. 631–644. ACM, New York (2017)Google Scholar
  13. 13.
    Peng, I.B., Gioiosa, R., Kestor, G., Cicotti, P., Laure, E., Markidis, S.: RTHMS: a tool for data placement on hybrid memory system. In: ISMM (2017)Google Scholar
  14. 14.
    Servat, H., Pea, A.J., Llort, G., Mercadal, E., Hoppe, H., Labarta, J.: Automating the application data placement in hybrid memory systems. In: 2017 IEEE International Conference on Cluster Computing (CLUSTER) (September 2017)Google Scholar
  15. 15.
    Dashti, M., Fedorova, A., Funston, J., Gaud, F., Lachaize, R., Lepers, B., Quema, V., Roth, M.: Traffic management: a holistic approach to memory placement on NUMA systems. SIGPLAN Not. 48(4), 381–394 (2013)CrossRefGoogle Scholar
  16. 16.
    Jantz, M.R., et al.: A framework for application guidance in virtual memory systems. In: Virtual Execution Environments. VEE 2013, pp. 155–166 (2013)Google Scholar
  17. 17.
    Jantz, M.R., et al.: Cross-layer memory management for managed language applications. In: ACM/SIGPLAN OOPSLA. ACM, New York (2015)Google Scholar
  18. 18.
    Guo, R., Liao, X., Jin, H., Yue, J., Tan, G.: NightWatch: integrating lightweight and transparent cache pollution control into dynamic memory allocation systems. In: 2015 USENIX Annual Technical Conference (USENIX ATC 15), pp. 307–318 (2015)Google Scholar
  19. 19.
    Lattner, C., Adve, V.: LLVM: a compilation framework for lifelong program analysis & transformation. In: Code Generation and Optimization (2004)Google Scholar
  20. 20.
    Sodani, A.: Knights Landing (KNL): 2nd generation Intel® Xeon Phi processor. In: 2015 IEEE Hot Chips 27 Symposium (HCS), pp. 1–24. IEEE (2015)Google Scholar
  21. 21.
    Hamerly, G., Perelman, E., Lau, J., Calder, B.: Simpoint 3.0. J. Instr. Level Parallelism 7(4), 1–28 (2005)Google Scholar
  22. 22.
    Henning, J.L.: SPEC CPU2006 benchmark descriptions. ACM SIGARCH Comput. Archit. News 34(4), 1–17 (2006)CrossRefGoogle Scholar
  23. 23.
    Nethercote, N., Seward, J.: Valgrind: a framework for heavyweight dynamic binary instrumentation. In: PLDI (2007)Google Scholar

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • T. Chad Effler
    • 1
  • Adam P. Howard
    • 1
  • Tong Zhou
    • 1
  • Michael R. Jantz
    • 1
  • Kshitij A. Doshi
    • 2
  • Prasad A. Kulkarni
    • 3
  1. 1.University of TennesseeKnoxvilleUSA
  2. 2.Intel CorporationSanta ClaraUSA
  3. 3.University of KansasLawrenceUSA

Personalised recommendations