Optimizing Big Data Network Transfers in FPGA SoC Clusters: TECBrain Case Study

  • Luis G. León-Vega
  • Kaleb Alfaro-Badilla
  • Alfonso Chacón-Rodríguez
  • Carlos Salazar-García
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1087)

Abstract

Spiking Neural Network (SNN) simulators built on clusters of FPGA-based Systems-on-Chip (SoCs) must transfer large amounts of data (from hundreds of MB to tens of GB per second) to and from a data host, usually a PC or a server. TECBrain is an SNN simulator that currently uses Ethernet to transmit its simulation results, a transfer that can take hours when the effective connection speed is around 100 Mbps. This paper proposes data transfer techniques that optimize transmission by grouping data into packages that make the most of the payload size and by using thread-level parallelism, while trying to minimize the impact of multiple clients transmitting at the same time. The proposed method achieves its highest throughput when inserting simulation results directly into a No-SQL database.
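
The grouping idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pack_records`, the greedy strategy, and the 1500-byte limit (the standard Ethernet payload size) are assumptions made here for illustration, and thread-level parallelism is left out.

```python
from typing import Iterable, Iterator

# Assumed payload limit in bytes (standard Ethernet payload size); hypothetical target.
MAX_PAYLOAD = 1500


def pack_records(records: Iterable[bytes],
                 max_payload: int = MAX_PAYLOAD) -> Iterator[bytes]:
    """Greedily concatenate records into buffers of at most max_payload bytes."""
    buf = bytearray()
    for rec in records:
        if len(rec) > max_payload:
            raise ValueError("record larger than payload limit")
        # Flush the current buffer when the next record would not fit.
        if len(buf) + len(rec) > max_payload:
            yield bytes(buf)
            buf.clear()
        buf += rec
    # Emit whatever is left after the last record.
    if buf:
        yield bytes(buf)


if __name__ == "__main__":
    # 400 hypothetical 8-byte samples are grouped into 3 packages instead of 400.
    samples = [b"\x00" * 8 for _ in range(400)]
    print([len(p) for p in pack_records(samples)])
```

Sending a few nearly full packages instead of many small ones spreads the fixed per-frame cost over more useful bytes, which is the effect the abstract attributes to making the most of the payload size.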

Using the proposed optimization techniques over an Ethernet connection, the minimum overhead reached is 2.93% (compared with the theoretical 2.47%) for five nodes sending data simultaneously from C++, with speeds of up to 95 Mbps on a 100 Mbps network. In addition, the maximum database insertion speed reached is 32.5 MB/s, using large packages and parallelism, which is 26% of the bandwidth of a 1 Gbps link.
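
The two percentages above can be checked with a short calculation. The abstract does not say how the theoretical 2.47% is derived. The sketch below assumes it is the standard Ethernet framing cost (8 bytes of preamble and start delimiter, 14 bytes of MAC header, 4 bytes of frame check sequence, and a 12-byte inter-frame gap) around a full 1500-byte payload, an assumption that reproduces the figure. The function names are hypothetical.

```python
# Assumed per-frame Ethernet cost in bytes:
# preamble + start delimiter (8) + MAC header (14) + FCS (4) + inter-frame gap (12).
ETH_FRAMING_BYTES = 38


def framing_overhead(payload_bytes: int,
                     framing_bytes: int = ETH_FRAMING_BYTES) -> float:
    """Fraction of the bytes on the wire that is framing rather than payload."""
    return framing_bytes / (payload_bytes + framing_bytes)


def link_utilization(mbytes_per_s: float, link_mbps: float) -> float:
    """Fraction of a link (in Mbps) used by a throughput given in MB/s."""
    return mbytes_per_s * 8 / link_mbps


if __name__ == "__main__":
    # Full 1500-byte payload: 38 / 1538, about 2.47 %.
    print(f"{framing_overhead(1500):.2%}")
    # 32.5 MB/s on a 1 Gbps link: 260 Mbps, i.e. 26 %.
    print(f"{link_utilization(32.5, 1000):.0%}")
```

Under the same assumption, a minimum-size 46-byte payload would spend about 45% of the wire time on framing, which shows why filling each frame matters.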

Keywords

  • High performance computing
  • No-SQL
  • High-speed networks
  • Embedded software

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Escuela de Ingeniería Electrónica, Tecnológico de Costa Rica, Cartago, Costa Rica
  2. Área Académica de Ingeniería Mecatrónica, Tecnológico de Costa Rica, Cartago, Costa Rica
