On the problem of optimizing parallel programs for complex memory hierarchies

Journal of Computer Science and Technology

Abstract

Based on a thorough study of the relationship between array element accesses and the loop indices of a nested loop, a method is presented for determining, at compile time, the staggering relation and the compacting relation between the threads of the nested loop (with either a single linear function or multiple linear functions). The nested loop, whether perfectly or imperfectly nested, can then be restructured accordingly to avoid the thrashing problem. Because the method is simple, it can be implemented efficiently in any parallel compiler, and the experimental results show a significant performance improvement.
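The thrashing problem named in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method: the access pattern `A[i + j]`, the loop bounds, and the names `count_migrations`, `naive`, and `staggered` are all assumptions chosen for illustration. It counts how often an array element is touched by a different processor than the one that touched it last, first with a fixed assignment of inner iterations to processors, then with the assignment shifted by the outer index so that each element stays on one processor.

```python
def count_migrations(n_outer, n_inner, n_procs, owner_of_iteration):
    """Count accesses where an element moves to a different processor.

    Models a serial outer loop over i and a parallel inner loop over j
    that accesses A[i + j] (an assumed pattern, not taken from the paper).
    """
    last_owner = {}   # element index -> processor that touched it last
    migrations = 0
    for i in range(n_outer):        # serial outer loop
        for j in range(n_inner):    # parallel inner loop, one thread per j
            element = i + j
            proc = owner_of_iteration(i, j) % n_procs
            if element in last_owner and last_owner[element] != proc:
                migrations += 1
            last_owner[element] = proc
    return migrations


def naive(i, j):
    # Hypothetical baseline: inner iteration j always runs on processor j.
    return j


def staggered(i, j):
    # Hypothetical restructuring: shift the assignment by the outer index
    # so that A[i + j] is always handled by processor (i + j) mod n_procs.
    return i + j


N_OUTER, N_INNER, N_PROCS = 4, 8, 4
naive_migrations = count_migrations(N_OUTER, N_INNER, N_PROCS, naive)
staggered_migrations = count_migrations(N_OUTER, N_INNER, N_PROCS, staggered)
print(naive_migrations, staggered_migrations)
```

Under these assumptions the shifted assignment removes every migration, because the processor depends only on the element index. A real compiler would have to derive such a shift from the subscript functions of the loop, which is the analysis the abstract describes.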

Cite this article

Jin, G., Chen, F. On the problem of optimizing parallel programs for complex memory hierarchies. J. of Compt. Sci. & Technol. 9, 1–26 (1994). https://doi.org/10.1007/BF02939483

  • DOI: https://doi.org/10.1007/BF02939483
