Reducing Last Level Cache Pollution in NUMA Multicore Systems for Improving Cache Performance

  • Conference paper
Computational Science and Its Applications – ICCSA 2012 (ICCSA 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7335)

Abstract

A non-uniform memory architecture (NUMA) system has multiple nodes, each with a shared last-level cache (LLC). The shared LLC brings many benefits in cache utilization. However, the LLC can be seriously polluted by tasks that generate heavy I/O traffic over a long period, because the inclusive architecture of the LLC replaces valid cache lines through back-invalidation. Much prior research has addressed this cache pollution with page coloring, partitioning, and pollute-buffer mechanisms, but no scheduling approach considers I/O-intensive tasks in NUMA systems. OS scheduling that reduces cache pollution is therefore needed in NUMA systems.

In this paper, we propose a software-based mechanism that reduces shared LLC misses in NUMA systems. Our mechanism consists of I/O traffic measurement and devil-conscious scheduling. The experimental results show that the LLC miss rate is reduced by up to 37.6% and that execution time improves by up to 1.48%.
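
The abstract names two components, I/O traffic measurement and devil-conscious scheduling, without giving their details. The following is a minimal sketch of how such a pair could fit together, not the authors' implementation. It assumes that I/O traffic is read from the `read_bytes` and `write_bytes` counters of the Linux `/proc/<pid>/io` file, that a task whose I/O rate exceeds a threshold is treated as a "devil", and that devils are packed onto as few NUMA nodes as possible so that they pollute only those nodes' LLCs. The names `parse_proc_io`, `io_bytes`, `Task`, `place_tasks`, and `devil_threshold` are hypothetical.

```python
from dataclasses import dataclass


def parse_proc_io(text: str) -> dict:
    """Parse the 'key: value' lines of a Linux /proc/<pid>/io file."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value.strip())
    return counters


def io_bytes(counters: dict) -> int:
    """Bytes the task caused to be read from or written to storage."""
    return counters.get("read_bytes", 0) + counters.get("write_bytes", 0)


@dataclass
class Task:
    name: str
    io_rate: float  # bytes per second over the last sampling interval


def place_tasks(tasks, num_nodes, cores_per_node, devil_threshold):
    """Assumed policy: isolate I/O-heavy tasks ("devils") on few nodes.

    Returns a mapping {node_id: [task names]}.
    """
    if len(tasks) > num_nodes * cores_per_node:
        raise ValueError("more tasks than cores")

    devils = sorted(
        (t for t in tasks if t.io_rate >= devil_threshold),
        key=lambda t: -t.io_rate,
    )
    others = [t for t in tasks if t.io_rate < devil_threshold]
    placement = {node: [] for node in range(num_nodes)}

    # Devils fill nodes from the highest-numbered node downwards.
    node = num_nodes - 1
    for task in devils:
        while len(placement[node]) == cores_per_node:
            node -= 1
        placement[node].append(task.name)

    # Remaining tasks fill nodes from node 0 upwards, away from devils.
    node = 0
    for task in others:
        while len(placement[node]) == cores_per_node:
            node += 1
        placement[node].append(task.name)

    return placement
```

In a running system, the I/O rate would come from sampling `/proc/<pid>/io` twice and dividing the difference in `io_bytes` by the sampling interval; the resulting placement could then be enforced with CPU affinity. Both steps are left out of the sketch.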

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

An, D., Kim, J., Han, J., Eom, Y.I. (2012). Reducing Last Level Cache Pollution in NUMA Multicore Systems for Improving Cache Performance. In: Murgante, B., et al. Computational Science and Its Applications – ICCSA 2012. ICCSA 2012. Lecture Notes in Computer Science, vol 7335. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31137-6_21

  • DOI: https://doi.org/10.1007/978-3-642-31137-6_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31136-9

  • Online ISBN: 978-3-642-31137-6

  • eBook Packages: Computer Science, Computer Science (R0)