Abstract
For distributed-memory multicomputers such as the Intel Paragon, the IBM SP-1/SP-2, the NCUBE/2, and the Thinking Machines CM-5, the quality of the data partitioning for a given application is crucial to obtaining high performance. This task has traditionally been the user's responsibility, but in recent years much effort has been directed toward automating the selection of data partitioning schemes. Several researchers have proposed systems that produce data distributions that remain in effect for the entire execution of an application. For complex programs, however, such static data distributions may be insufficient to obtain acceptable performance. Selecting distributions that change dynamically over the course of a program's execution adds another dimension to the data partitioning problem. In this paper, we present a technique that automatically determines which partitionings are most beneficial over specific sections of a program while taking into account the added overhead of performing redistribution. This system is being built as part of the PARADIGM (PARAllelizing compiler for DIstributed-memory General-purpose Multicomputers) project at the University of Illinois. The complete system will provide a fully automated means of parallelizing programs written in a serial programming model, obtaining high performance on a wide range of distributed-memory multicomputers.
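The trade-off described in the abstract, between the benefit of a better partitioning for one section of a program and the overhead of redistributing data to reach it, can be illustrated with a small sketch. This is not the algorithm used in PARADIGM; it is a generic dynamic-programming illustration under assumed inputs. The function name `select_distributions`, the layout names `"row"` and `"col"`, and all cost figures are hypothetical.

```python
from typing import Dict, List, Tuple


def select_distributions(
    phase_costs: List[Dict[str, float]],
    redistribution_cost: Dict[Tuple[str, str], float],
) -> Tuple[float, List[str]]:
    """Pick one layout per program phase, minimizing execution plus
    redistribution cost. Hypothetical illustration, not the paper's method.

    phase_costs[i][d]          : assumed cost of running phase i under layout d
    redistribution_cost[(a,b)] : assumed cost of remapping data from a to b
    """
    # best[d] = (cheapest total cost that ends the current phase in layout d,
    #            the sequence of layouts that achieves it)
    best = {d: (c, [d]) for d, c in phase_costs[0].items()}

    for costs in phase_costs[1:]:
        new_best = {}
        for d, c in costs.items():
            candidates = []
            for prev, (prev_cost, path) in best.items():
                # Keeping the same layout costs nothing extra;
                # changing it pays the redistribution overhead.
                move = 0.0 if prev == d else redistribution_cost[(prev, d)]
                candidates.append((prev_cost + move + c, path + [d]))
            new_best[d] = min(candidates, key=lambda t: t[0])
        best = new_best

    return min(best.values(), key=lambda t: t[0])


# Two phases that prefer opposite layouts (all numbers are made up).
phases = [{"row": 10.0, "col": 50.0}, {"row": 60.0, "col": 10.0}]

cheap_move = {("row", "col"): 15.0, ("col", "row"): 15.0}
print(select_distributions(phases, cheap_move))

costly_move = {("row", "col"): 100.0, ("col", "row"): 100.0}
print(select_distributions(phases, costly_move))
```

With the cheap remapping cost, the sketch switches layouts between the two phases; with the expensive one, it falls back to a single static layout, which is the situation in which a static distribution is sufficient.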
This research was supported in part by the National Aeronautics and Space Administration under Contract NASA NAG 1-613 and in part by the Advanced Research Projects Agency under Contract DAA-H04-94-G-0273 administered by the Army Research Office. We are also grateful to the National Center for Supercomputing Applications and the San Diego Supercomputing Center for providing access to their machines.
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Palermo, D.J., Banerjee, P. (1996). Automatic selection of dynamic data partitioning schemes for distributed-memory multicomputers. In: Huang, CH., Sadayappan, P., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1995. Lecture Notes in Computer Science, vol 1033. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0014213
DOI: https://doi.org/10.1007/BFb0014213
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60765-6
Online ISBN: 978-3-540-49446-1
eBook Packages: Springer Book Archive