Reducing NoC and Memory Contention for Manycores

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9637)

Abstract

Platforms with many computing cores have become mainstream in high-performance computing, general-purpose computing and, lately, embedded systems. Such systems provide increased processing power and system availability, but they often incur added latency and contention for memory accesses when multiple cores reference data at the same time. This may result in sub-optimal performance unless special allocation policies are employed. On a multi-processor board with 50 or more processing cores, the NoC (Network-on-Chip) adds to this challenge. This work evaluates the impact of bank-aware and controller-aware memory allocation on NoC contention. Experiments show that targeted memory allocation reduces both execution times and NoC contention; the latter has not been studied before at this scale.

Acknowledgment

Tilera Corporation provided technical support of the research. This work was funded in part by NSF grants 1239246 and 1058779 as well as a grant from AFOSR via Securboration.

Author information

Correspondence to Frank Mueller.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Chandru, V., Mueller, F. (2016). Reducing NoC and Memory Contention for Manycores. In: Hannig, F., Cardoso, J.M.P., Pionteck, T., Fey, D., Schröder-Preikschat, W., Teich, J. (eds) Architecture of Computing Systems – ARCS 2016. ARCS 2016. Lecture Notes in Computer Science, vol 9637. Springer, Cham. https://doi.org/10.1007/978-3-319-30695-7_22

  • DOI: https://doi.org/10.1007/978-3-319-30695-7_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-30694-0

  • Online ISBN: 978-3-319-30695-7

  • eBook Packages: Computer Science, Computer Science (R0)
