Hardware support for collective communication operations
We describe the design of a communication board that supports the collective communication operations of parallel programs. The processors in each cluster are interconnected by a bus and connected to a communication board. The communication boards are, in turn, connected to a crossbar network that offers low latency and high bandwidth but has a slow configuration time. The design is geared towards low-end (inexpensive) parallel computer architectures.
Such an architecture is motivated by recent advances in parallel programming models, and in particular by the “loosely synchronous” style of parallel programming. In this style, subgroups of processors alternate between local work and collective communication operations. The alternation allows time to reconfigure the large crossbar and enables communication between processors without interrupts. The important broadcast, scatter, gather, and transpose operations can be supported efficiently.
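To make the four named operations concrete, the following is a minimal software sketch of their data movement, together with one loosely synchronous phase (local work followed by a collective). It is an illustration only: each processor is modeled as a slot in a Python list, and the function names, the equal-sized blocks, and the choice of transpose as the collective are assumptions made for this example, not the interface of the board described above.

```python
# Illustrative model only: "processor i" is index i of a Python list.
# All names and the data layout are assumptions made for this sketch.

def broadcast(root_value, p):
    """Each of the p processors receives a copy of the root's value."""
    return [root_value for _ in range(p)]

def scatter(root_data, p):
    """The root's data is split into p equal blocks; processor i gets block i.
    Assumes len(root_data) is divisible by p."""
    size = len(root_data) // p
    return [root_data[i * size:(i + 1) * size] for i in range(p)]

def gather(local_blocks):
    """The root concatenates one block from each processor, in processor order."""
    return [item for block in local_blocks for item in block]

def transpose(blocks):
    """All-to-all exchange: processor i ends with element i from every processor j.
    Assumes each of the p processors holds exactly p elements."""
    p = len(blocks)
    return [[blocks[j][i] for j in range(p)] for i in range(p)]

def loosely_synchronous_step(local_data, local_work):
    """One phase: purely local work on every processor, then a collective transpose."""
    computed = [local_work(row) for row in local_data]
    return transpose(computed)
```

In this model, gather undoes scatter, and applying transpose twice returns the original layout. In the architecture described above, the local-work half of each phase is the interval during which the crossbar could be reconfigured for the next collective.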
Keywords: Parallel Processor, Memory Buffer, Communication Operation, Communication Board, Parallel Programming Model