Abstract
Shared L1 memories are of interest for tightly-coupled processor clusters in programmable accelerators because they provide a convenient shared-memory abstraction while avoiding cache coherence overheads. The performance of a shared-L1 memory depends critically on the architecture of the low-latency interconnect between processors and memory banks, which needs to provide ultra-fast access to the largest possible L1 working set. The advent of 3D technology provides new opportunities to improve the interconnect delay and the form factor. In this chapter, we propose a network architecture, 3D-LIN, based on 3D integration technology. The network can be configured according to user specifications and technology constraints to provide fast access to L1 memories on multiple stacked dies. The results extracted from the physical synthesis of 3D-LIN make it possible to explore trade-offs between memory size and network latency, from a planar design to multiple memory layers stacked on top of logic, and to evaluate the improvement in both form factor and latency.
Copyright information
© 2013 IFIP International Federation for Information Processing
About this paper
Cite this paper
Beanato, G., Loi, I., De Micheli, G., Leblebici, Y., Benini, L. (2013). Configurable Low-Latency Interconnect for Multi-core Clusters. In: Burg, A., Coṣkun, A., Guthaus, M., Katkoori, S., Reis, R. (eds) VLSI-SoC: From Algorithms to Circuits and System-on-Chip Design. VLSI-SoC 2012. IFIP Advances in Information and Communication Technology, vol 418. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45073-0_6
DOI: https://doi.org/10.1007/978-3-642-45073-0_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45072-3
Online ISBN: 978-3-642-45073-0
eBook Packages: Computer Science (R0)