Fast barrier synchronization on shared fast Ethernet

  • Conference paper
Network-Based Parallel Computing Communication, Architecture, and Applications (CANPC 1998)

Abstract

Shared LANs are presently the most widespread networking technology, owing to their extremely low cost and favourable cost/performance ratio. Clusters of Personal Computers (PCs) built on shared 100base-T Ethernet may currently offer the best price/performance in parallel processing. Most numerical parallel algorithms make heavy use of collective communications, and especially of barrier synchronization. A critical issue on PC clusters is therefore to offer efficient implementations of such primitives even when using low-cost, non-switched LAN technology. We implemented and studied some simple barrier synchronization protocols atop the Genoa Active Message MAchine (GAMMA), an efficient Active Messages-like communication layer running on a cluster of Pentium PCs connected by a 100base-TX Ethernet repeater hub. When synchronized or quasi-synchronized processes issue a barrier synchronization, an obvious way to avoid collisions on shared 100base-T Ethernet is to use a barrier protocol that explicitly serializes all the inter-process synchronization communications over the LAN. We propose alternative barrier protocols that avoid Ethernet collisions during the synchronization phase without requiring such full explicit serialization. One of these protocols outperforms both the fully serialized barrier protocol over 100base-T Ethernet and the MPI implementations of barrier synchronization on the IBM SP2 and Intel Paragon.
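To make the idea of explicit serialization concrete, the following is a minimal sketch, not the protocol evaluated in the paper. It assumes a hypothetical slotted model of the shared medium, in which two frames sent in the same slot collide, and a hypothetical chain schedule: each rank forwards an "arrived" notice to the next rank, and the last rank broadcasts the release. The function names and the slot model are illustrative assumptions only.

```python
from typing import List, Tuple

# One transmission: (time slot, sender rank, destination label).
Transmission = Tuple[int, int, str]


def naive_barrier_schedule(n: int) -> List[Transmission]:
    """All non-root ranks notify rank 0 at once (slot 0), then rank 0 broadcasts.

    Hypothetical model: quasi-synchronized processes reach the barrier
    together, so their notifications contend for the shared medium.
    """
    if n < 1:
        raise ValueError("need at least one process")
    schedule = [(0, rank, "0") for rank in range(1, n)]
    if n > 1:
        schedule.append((1, 0, "broadcast"))
    return schedule


def serialized_barrier_schedule(n: int) -> List[Transmission]:
    """Chain schedule: rank i sends to rank i+1 in slot i, last rank broadcasts.

    Exactly one station transmits per slot, so no two frames overlap.
    """
    if n < 1:
        raise ValueError("need at least one process")
    schedule = [(rank, rank, str(rank + 1)) for rank in range(n - 1)]
    if n > 1:
        schedule.append((n - 1, n - 1, "broadcast"))
    return schedule


def has_collision(schedule: List[Transmission]) -> bool:
    """A collision occurs when two transmissions share the same slot."""
    slots = [slot for slot, _, _ in schedule]
    return len(slots) != len(set(slots))


if __name__ == "__main__":
    for name, build in (("naive", naive_barrier_schedule),
                        ("serialized", serialized_barrier_schedule)):
        sched = build(8)
        print(name, "frames:", len(sched), "collision:", has_collision(sched))
```

In this toy model the serialized schedule is collision-free but needs a number of slots that grows linearly with the number of processes, which is the cost the alternative protocols mentioned in the abstract aim to reduce.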

Editor information

Dhabaleswar K. Panda, Craig B. Stunkel

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chiola, G., Ciaccio, G. (1998). Fast barrier synchronization on shared fast Ethernet. In: Panda, D.K., Stunkel, C.B. (eds) Network-Based Parallel Computing Communication, Architecture, and Applications. CANPC 1998. Lecture Notes in Computer Science, vol 1362. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0052213

  • DOI: https://doi.org/10.1007/BFb0052213

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64140-7

  • Online ISBN: 978-3-540-69693-3

  • eBook Packages: Springer Book Archive
