Loop restructuring techniques for thrashing problem

  • Guohua Jin 
  • Fujie Chen 
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 605)

Abstract

Parallel loops account for the greatest amount of parallelism in numerical programs. Executing nested loops in parallel with low run-time overhead is thus very important for achieving high performance in parallel processing systems. However, in parallel processing systems with caches or local memories in their memory hierarchies, the “thrashing problem” may arise. Because thrashing severely degrades system performance, there is an urgent need for a simple and effective algorithm to solve it. Based on a thorough study of the relationship between array element accesses and the indices of the enclosing loops in a nested loop, we present in this paper a set of compiler restructuring techniques with which the reduced iteration space is staggered, regularized, and compacted, and the nested loop is restructured accordingly. The result is a nested loop that is free of thrashing and suits different loop scheduling strategies. In addition, further parallelism is exploited.
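The abstract does not spell out the transformation itself, so the following is only a rough illustration of the kind of thrashing it refers to, not the authors' algorithm. It assumes a serial outer loop over i, a parallel inner loop over j, a hypothetical access A[i + j], and a simple modulo mapping of iterations to processors; the function name count_migrations and both mappings are invented for this sketch. It counts how often an array element is touched by a different processor than the one that touched it last.

```python
def count_migrations(n_outer, n_inner, n_procs, owner):
    """Count accesses where an element's processor differs from its previous one.

    owner(i, j) is a hypothetical iteration-to-processor mapping; the access
    pattern A[i + j] is an assumption chosen for illustration only.
    """
    last_proc = {}
    migrations = 0
    for i in range(n_outer):        # serial outer loop
        for j in range(n_inner):    # inner loop, run in parallel on a real machine
            elem = i + j            # index of the element touched by iteration (i, j)
            proc = owner(i, j) % n_procs
            if elem in last_proc and last_proc[elem] != proc:
                migrations += 1     # element must move to another cache or local memory
            last_proc[elem] = proc
    return migrations


# Naive mapping: inner iteration j always runs on processor j mod P.
naive = count_migrations(4, 8, 4, lambda i, j: j)

# Staggered mapping: shift the assignment by i so that iterations touching
# the same element A[i + j] land on the same processor.
staggered = count_migrations(4, 8, 4, lambda i, j: i + j)

print(naive, staggered)
```

Under these assumptions the naive mapping moves an element on every reuse, while the staggered mapping keeps each element on one processor. Real loops have more complex subscripts and dependences, which is presumably where the regularizing and compacting steps named in the abstract come in.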


Copyright information

© Springer-Verlag 1992

Authors and Affiliations

  • Guohua Jin (1)
  • Fujie Chen (1)
  1. Department of Computer Science, Changsha Institute of Technology, Changsha, Hunan, P.R. China
