Abstract
Based on a thorough study of the relationship between array element accesses and the loop indices of a nested loop, we present a method that determines, at compile time, the staggering relation and the compacting relation between the threads of a nested loop with either a single linear function or multiple linear functions. The nested loop, whether perfectly or imperfectly nested, can then be restructured to avoid the thrashing problem. Because the method is simple, it can be implemented efficiently in any parallel compiler, and the experimental results show a significant performance improvement.
Cite this article
Jin, G., Chen, F. On the problem of optimizing parallel programs for complex memory hierarchies. J. of Compt. Sci. & Technol. 9, 1–26 (1994). https://doi.org/10.1007/BF02939483