A Data Transformations Based Approach for Optimizing Memory and Cache Locality on Distributed Memory Multiprocessors
Data locality is one of the key factors affecting the performance of parallel programs running on distributed memory multiprocessors. This paper presents an approach for optimizing the memory locality and cache locality of perfect or non-perfect loop nests on distributed memory multiprocessors using linear data transformations. The approach optimizes memory locality with the data space fusion technique and cache locality with the projection-delamination technique, and combines the two techniques to keep the overheads of both remote and local memory accesses as low as possible. Experiments with nine programs show that the approach is effective in optimizing memory locality and cache locality simultaneously.
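As a generic illustration of what a linear data transformation does, the sketch below applies an integer matrix to array index vectors and compares the memory stride of the innermost loop before and after. It is not the data space fusion or projection-delamination technique described in the paper; the row-major layout, the transposition matrix, and all function names (`address`, `transform`, `inner_stride`) are assumptions chosen for the example.

```python
def address(index, shape):
    """Row-major linear offset of a 2-D index (assumed storage layout)."""
    rows, cols = shape
    r, c = index
    return r * cols + c


def transform(index, matrix):
    """Apply a 2x2 integer matrix M to an index vector: a' = M * a."""
    (a, b), (c, d) = matrix
    i, j = index
    return (a * i + b * j, c * i + d * j)


def inner_stride(access, shape, matrix=((1, 0), (0, 1))):
    """Memory distance between two consecutive innermost-loop iterations."""
    first = address(transform(access(0, 0), matrix), shape)
    second = address(transform(access(0, 1), matrix), shape)
    return second - first


N = 8

# Hypothetical loop nest: for i: for j: use A[j][i], with j innermost.
def column_walk(i, j):
    return (j, i)

# Data transformation that swaps the two array dimensions.
TRANSPOSE = ((0, 1), (1, 0))

stride_before = inner_stride(column_walk, (N, N))
stride_after = inner_stride(column_walk, (N, N), TRANSPOSE)
print(stride_before, stride_after)
```

With the original layout the innermost loop jumps a full row between accesses; after the transformation consecutive iterations touch adjacent elements, which is the kind of spatial locality improvement that data transformations aim for.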