Abstract
This paper explores allocator design for high-radix switches and implements PWF (Parallel WaveFront), a highly efficient allocator aimed at high throughput. Built on the wavefront allocator, PWF completes allocation within a single cycle to avoid a timing loop. It applies a parallelized matching strategy over cyclic priorities to provide allocation fairness, and a greedy policy to reach the maximal number of matches. Implemented in 32 nm CMOS technology, PWF increases area and power consumption over the wavefront allocator by 32.8% and 36.8%, respectively, and its critical path delay for an 8x8x8 switch is below 0.5 ns, which satisfies the GHz-level frequency requirement of high-radix switch design. In the allocation-efficiency evaluation, PWF reduces request schedule time by 61.2% and 65.7% and increases the number of immediately scheduled requests by 38.9% and 46.7% on average, compared with the wavefront allocator. The improvement also shows in markedly lower average schedule time and average response time than the wavefront and DRRM allocators, indicating clear advantages in allocation performance together with good allocation fairness.
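For readers unfamiliar with the baseline, the following is a minimal software sketch of a classic wavefront allocator with a rotating priority diagonal. It is not the PWF design, whose details the abstract does not give. The function name `wavefront_allocate` and the parameter `start_diagonal` are assumptions made for illustration, and the sequential loop only models what hardware evaluates as a combinational wave.

```python
from typing import List


def wavefront_allocate(requests: List[List[bool]],
                       start_diagonal: int = 0) -> List[List[bool]]:
    """Illustrative wavefront allocation on an n x n request matrix.

    requests[i][j] is True when input i requests output j.
    start_diagonal (hypothetical name) selects the wrapped diagonal
    that gets top priority; rotating it over time gives fairness.
    """
    n = len(requests)
    row_free = [True] * n   # input i has not been granted yet
    col_free = [True] * n   # output j has not been granted yet
    grants = [[False] * n for _ in range(n)]

    # Visit the n wrapped diagonals in priority order.
    for step in range(n):
        d = (start_diagonal + step) % n
        for i in range(n):
            # Cells on wrapped diagonal d satisfy (i + j) % n == d,
            # so they sit in distinct rows and columns and never conflict.
            j = (d - i) % n
            if requests[i][j] and row_free[i] and col_free[j]:
                grants[i][j] = True
                row_free[i] = False
                col_free[j] = False
    return grants


if __name__ == "__main__":
    req = [[True, True, False],
           [True, False, False],
           [False, True, True]]
    for start in range(3):
        g = wavefront_allocate(req, start)
        print(start, [(i, j) for i in range(3) for j in range(3) if g[i][j]])
```

Every request is visited once, and a request is skipped only if its input or output is already granted, so the result is a maximal matching, though not necessarily a maximum one. In the sample matrix, starting at diagonal 0 yields two grants while starting at diagonal 1 yields three, which shows how much the priority diagonal affects the match size.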
This work was supported by NSFC (61103188), the Natural Science Foundation of Hunan Province (12JJ4071), and the Specialized Research Fund for the Doctoral Program of Higher Education of China (20114307120011).
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lai, M., Gao, L. (2015). A Highly-Efficient Crossbar Allocator Architecture for High-Radix Switch. In: Xu, W., Xiao, L., Li, J., Zhang, C., Zhu, Z. (eds) Computer Engineering and Technology. NCCET 2014. Communications in Computer and Information Science, vol 491. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45815-0_5
DOI: https://doi.org/10.1007/978-3-662-45815-0_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45814-3
Online ISBN: 978-3-662-45815-0
eBook Packages: Computer Science (R0)