Comparison of Direct and Indirect Networks for High-Performance FPGA Clusters

Mondigo, Antoniette; Ueno, Tomohiro; Sano, Kentaro; Takizawa, Hiroyuki

doi:10.1007/978-3-030-44534-8_24


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12083)

Included in the following conference series:

International Symposium on Applied Reconfigurable Computing

Abstract

As field-programmable gate arrays (FPGAs) become a favored choice for exploring new computing architectures in the post-Moore era, a flexible network architecture for scalable FPGA clusters is increasingly important in high-performance computing (HPC). In this paper, we introduce a scalable platform of indirectly connected FPGAs whose Ethernet-switching network allows flexibly customized inter-FPGA connectivity. However, certain applications, such as stream computing, require a connection-oriented datapath with backpressure between FPGAs. Because the network lacks a physical backpressure channel, we use our existing credit-based network protocol with flow control to provide awareness of the receiver FPGA's state, and we tailor it to minimize the overall communication overhead of the proposed framework. To characterize its performance, we implemented the necessary data-transfer hardware on Intel Arria 10 FPGAs, modeled and evaluated its communication performance, and compared it with a direct network. Results show that the proposed indirect framework achieves approximately 3% higher effective network bandwidth than our existing direct inter-FPGA network, demonstrating good performance and scalability for large HPC applications.
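
The idea of credit-based flow control mentioned in the abstract can be illustrated with a small software model. The sketch below is a generic illustration only, not the authors' protocol or hardware: the class names, the `simulate` function, its parameters, and the zero-latency credit return are all assumptions made for this example. It shows that a sender limited by credits never overflows the receiver's buffer, even though the link carries no backpressure signal.

```python
from collections import deque


class Receiver:
    """Hypothetical receiver with a fixed number of buffer slots."""

    def __init__(self, buffer_slots):
        self.buffer_slots = buffer_slots
        self.buffer = deque()

    def accept(self, packet):
        # An overflow here would mean the flow control failed.
        if len(self.buffer) >= self.buffer_slots:
            raise OverflowError("receiver buffer overflow")
        self.buffer.append(packet)

    def drain(self, max_packets):
        """Consume up to max_packets; return the number of slots freed."""
        freed = 0
        while self.buffer and freed < max_packets:
            self.buffer.popleft()
            freed += 1
        return freed


class Sender:
    """Hypothetical sender that spends one credit per packet."""

    def __init__(self, initial_credits):
        self.credits = initial_credits

    def try_send(self, receiver, packet):
        if self.credits == 0:
            return False  # stall: no free slot is known at the receiver
        self.credits -= 1
        receiver.accept(packet)
        return True

    def add_credits(self, count):
        self.credits += count


def simulate(num_packets, buffer_slots, send_per_step, drain_per_step):
    """Return (packets sent, stalled send attempts, peak buffer occupancy).

    Assumption: credits freed by the receiver reach the sender at the end
    of the same step (no credit-return latency).
    """
    if buffer_slots < 1 or send_per_step < 1 or drain_per_step < 1:
        raise ValueError("all rates and the buffer size must be at least 1")
    receiver = Receiver(buffer_slots)
    sender = Sender(initial_credits=buffer_slots)
    sent = stalls = max_occupancy = 0
    while sent < num_packets:
        for _ in range(send_per_step):
            if sent == num_packets:
                break
            if sender.try_send(receiver, sent):
                sent += 1
            else:
                stalls += 1
        max_occupancy = max(max_occupancy, len(receiver.buffer))
        # The receiver frees slots and hands the credits back to the sender.
        sender.add_credits(receiver.drain(drain_per_step))
    return sent, stalls, max_occupancy


if __name__ == "__main__":
    # Sender offers 2 packets per step; receiver drains only 1 per step.
    print(simulate(num_packets=100, buffer_slots=4,
                   send_per_step=2, drain_per_step=1))
```

In this model the sender's credits plus the receiver's occupancy always equal the buffer size, so the sender stalls instead of overrunning the receiver. A real network would add credit-return latency, which this sketch deliberately ignores.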

Author information

Authors and Affiliations

Graduate School of Information Sciences, Tohoku University, Sendai, Japan
Antoniette Mondigo
Processor Research Team, RIKEN Center for Computational Science, Kobe, Japan
Tomohiro Ueno & Kentaro Sano
Cyberscience Center, Tohoku University, Sendai, Japan
Hiroyuki Takizawa

Corresponding author

Correspondence to Antoniette Mondigo.

Editor information

Editors and Affiliations

Technology and Information Systems, University of Castilla-La Mancha, Ciudad Real, Spain
Fernando Rincón
Technology and Information Systems, University of Castilla-La Mancha, Ciudad Real, Spain
Jesús Barba
Department of Electrical and Electronic Engineering, University of Hong Kong, Hong Kong, China
Hayden K. H. So
INESC-ID, Lisbon, Portugal
Pedro Diniz
Technology and Information Systems, University of Castilla-La Mancha, Ciudad Real, Spain
Julián Caba

Cite this paper

Mondigo, A., Ueno, T., Sano, K., Takizawa, H. (2020). Comparison of Direct and Indirect Networks for High-Performance FPGA Clusters. In: Rincón, F., Barba, J., So, H., Diniz, P., Caba, J. (eds) Applied Reconfigurable Computing. Architectures, Tools, and Applications. ARC 2020. Lecture Notes in Computer Science, vol 12083. Springer, Cham. https://doi.org/10.1007/978-3-030-44534-8_24

DOI: https://doi.org/10.1007/978-3-030-44534-8_24
Published: 25 March 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44533-1
Online ISBN: 978-3-030-44534-8
eBook Packages: Computer Science, Computer Science (R0)
