Abstract
We distinguish between two types of communication: alignment and shape changing. We ignore the communication required to align two objects of the same shape. Such communication occurs, for example, when two 2-dimensional arrays must be aligned so that an operation can be performed on corresponding elements. Since all of this communication can occur concurrently, it is less of an efficiency issue than shape changing.
Shape changing arises when an object of some dimensionality must align with multiple parts of a higher-dimensional object. A shape change occurs, for example, when a vector must align with each row of a matrix, as in the expression a(i) + b(j, i). After alignment, each (virtual) processor holding an element of b also holds the associated element of a, so a has become 2-dimensional: its shape has changed.
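The shape change in a(i) + b(j, i) can be made explicit with a small sketch. It is illustrative only and not taken from the chapter: the helper names `spread_along_rows` and `add_aligned` are hypothetical, and nested Python lists stand in for arrays distributed over virtual processors.

```python
def spread_along_rows(a, m):
    """Copy the 1-D vector a into each of m rows (the explicit shape change)."""
    return [list(a) for _ in range(m)]


def add_aligned(x, y):
    """Add two 2-D objects of the same shape element by element."""
    return [[xe + ye for xe, ye in zip(x_row, y_row)]
            for x_row, y_row in zip(x, y)]


a = [1, 2, 3]                        # a(i), 1-dimensional
b = [[10, 20, 30], [40, 50, 60]]     # b(j, i), 2-dimensional

a2 = spread_along_rows(a, len(b))    # a now has one copy per row of b
c = add_aligned(a2, b)               # c(j, i) = a(i) + b(j, i)
```

Once `a2` exists, the addition involves two objects of the same shape, so only alignment remains. The cost that matters is the one hidden inside `spread_along_rows`.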
A shape change that adds a dimension of extent n may require O(1), O(lg n), or O(n) communication, depending on related code within the program. For example, the object may be replicated or privatized along that dimension. Since optimizing among O(1), O(lg n), and O(n) communication is much more significant than optimizing the alignment of objects of the same shape, we have developed an abstraction that exposes shape changes but ignores alignment. We show a number of optimizations based on this abstraction.
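To give a feel for the three cost classes, the sketch below counts communication steps for three ways of filling a new dimension of extent n. The mapping of strategies to costs is an assumption made for illustration, not the chapter's own model: one owner sending copies one at a time, a recursive-doubling broadcast, and a privatized value that each position computes locally. All function names are hypothetical.

```python
def sequential_steps(n):
    """One owner sends a copy to each of the other n - 1 positions in turn: O(n)."""
    return max(n - 1, 0)


def doubling_steps(n):
    """Recursive doubling: the set of holders doubles every round: O(lg n)."""
    steps, holders = 0, 1
    while holders < n:
        holders *= 2
        steps += 1
    return steps


def privatized_steps(n):
    """Assumed case: every position computes its own copy, so no steps: O(1)."""
    return 0


for n in (8, 1024):
    print(n, sequential_steps(n), doubling_steps(n), privatized_steps(n))
```

Under these assumptions the gap between strategies widens quickly with n, which is consistent with the abstract's emphasis on shape changes over same-shape alignment.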
The research described in this paper was supported in part by the Defense Advanced Research Projects Agency under contracts N00014-88K-0738 and N00014-91J-1698, by Air Force Systems under contract F19628-92-C-0045, and by a National Science Foundation Presidential Young Investigator Award, grant MIP-8657531, with matching funds from General Electric Corporation, IBM Corporation, and AT&T.
Copyright information
© 1994 Friedr. Vieweg & Sohn Verlagsgesellschaft mbH, Braunschweig/Wiesbaden
About this chapter
Cite this chapter
Knobe, K., Dally, W.J. (1994). Subspace Optimizations. In: Keßler, C.W. (eds) Automatic Parallelization. Vieweg+Teubner Verlag. https://doi.org/10.1007/978-3-322-87865-6_9
DOI: https://doi.org/10.1007/978-3-322-87865-6_9
Publisher Name: Vieweg+Teubner Verlag
Print ISBN: 978-3-528-05401-4
Online ISBN: 978-3-322-87865-6
eBook Packages: Springer Book Archive