Hardware support for collective communication operations

  • Larry Rudolph
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 678)


We describe the design of a communications board that supports the collective communication operations of parallel programs. The processors in each cluster are interconnected by a bus and connected to a communications board. The communications boards are, in turn, connected to a crossbar network that offers low latency and high bandwidth but has a slow configuration time. The design is geared towards low-end (inexpensive) parallel computer architectures.

Such an architecture is motivated by recent advances in parallel programming models, and in particular by the “loosely synchronous” style of parallel programming, in which subgroups of processors alternate between local work and collective communication operations. This alternation leaves time to reconfigure the large crossbar and enables communication between processors without interrupts. The important broadcast, scatter, gather, and transpose operations can be supported efficiently.
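As an illustration of what the four named operations compute, the following is a minimal software sketch of their data movement. It says nothing about the board or crossbar design itself. The function names, the list-of-buffers representation, and the reading of "transpose" as an all-to-all personalized exchange are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def broadcast(value: T, num_procs: int) -> List[T]:
    """The root's value is delivered to every processor."""
    return [value for _ in range(num_procs)]


def scatter(root_buffer: Sequence[T], num_procs: int) -> List[List[T]]:
    """Split the root's buffer into equal contiguous blocks; processor i gets block i."""
    if num_procs <= 0 or len(root_buffer) % num_procs != 0:
        raise ValueError("buffer length must be a multiple of num_procs")
    size = len(root_buffer) // num_procs
    return [list(root_buffer[i * size:(i + 1) * size]) for i in range(num_procs)]


def gather(local_buffers: Sequence[Sequence[T]]) -> List[T]:
    """Concatenate the processors' buffers, in processor order, at the root."""
    return [item for buf in local_buffers for item in buf]


def transpose(send: Sequence[Sequence[T]]) -> List[List[T]]:
    """Assumed semantics: send[i][j] goes from processor i to processor j,
    so processor j ends up holding [send[0][j], send[1][j], ...]."""
    n = len(send)
    if any(len(row) != n for row in send):
        raise ValueError("each processor must supply one item per processor")
    return [[send[i][j] for i in range(n)] for j in range(n)]


def loosely_synchronous_square(data: Sequence[int], num_procs: int) -> List[int]:
    """One communication / local-work / communication cycle (hypothetical example)."""
    blocks = scatter(data, num_procs)                  # collective phase
    blocks = [[x * x for x in blk] for blk in blocks]  # local work, no communication
    return gather(blocks)                              # collective phase
```

In this model the local-work step between the two collective calls corresponds to the interval in which, according to the abstract, the crossbar can be reconfigured.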


Keywords: Parallel Processor, Memory Buffer, Communication Operation, Communication Board, Parallel Programming Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 1993

Authors and Affiliations

  • Larry Rudolph, Department of Computer Science, The Hebrew University, Jerusalem, Israel
