Cache Memory Architectures for Handling Big Data Applications: A Survey

  • Purnendu Das
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 767)

Abstract

Cache memory plays an important role in the efficient execution of today’s big data-based applications. High-performance computers use multicore processors to support parallel execution of different applications and threads. The cores are placed on a single chip, called a chip multiprocessor (CMP). Each core has its own private cache memories, and all the cores share a common last-level cache (LLC). The performance of the LLC plays a major role in handling big data-based applications. In this paper, we survey the innovative techniques proposed for efficiently handling big data-based applications in the LLC of CMPs.
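The cache organization described in the abstract (private caches per core, one shared LLC) can be illustrated with a minimal sketch. This is only a toy model under stated assumptions, not a technique taken from the surveyed papers: the class names `LRUCache` and `ChipMultiprocessor`, the fully associative layout, the LRU replacement policy, and the cache sizes are all hypothetical, and coherence, writes, and inclusion policies are ignored.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical fully associative cache with LRU replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block address -> None, oldest first
        self.hits = 0
        self.misses = 0

    def access(self, block):
        """Return True on a hit; on a miss, insert the block and evict the LRU one."""
        if block in self.blocks:
            self.blocks.move_to_end(block)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        self.blocks[block] = None
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return False


class ChipMultiprocessor:
    """Hypothetical CMP model: one private cache per core, one shared LLC."""

    def __init__(self, cores, private_blocks, llc_blocks):
        self.private = [LRUCache(private_blocks) for _ in range(cores)]
        self.llc = LRUCache(llc_blocks)
        self.memory_accesses = 0

    def load(self, core, block):
        """Look up the core's private cache, then the shared LLC, then main memory."""
        if self.private[core].access(block):
            return "private"
        if self.llc.access(block):
            return "llc"
        self.memory_accesses += 1
        return "memory"


# Illustrative trace of (core, block address) pairs; the values are arbitrary.
cmp_model = ChipMultiprocessor(cores=2, private_blocks=2, llc_blocks=4)
trace = [(0, 10), (1, 10), (0, 10), (0, 11), (0, 12), (0, 10)]
results = [cmp_model.load(core, block) for core, block in trace]
print(results)
```

In this toy trace, core 1 finds block 10 in the shared LLC because core 0 fetched it earlier, and core 0 later recovers the same block from the LLC after its small private cache evicts it. Both cases avoid a trip to main memory, which is the sense in which LLC behaviour matters for the workloads the abstract describes.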

Keywords

Cache memory, Near-data processing, Big data, Multicore architectures

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. Department of Computer Science, Assam University Silchar, Silchar, India