Optimal and Heuristic Global Code Motion for Minimal Spilling

  • Gergö Barany
  • Andreas Krall
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7791)

Abstract

The interaction of register allocation and instruction scheduling is a well-studied problem: Certain ways of arranging instructions within basic blocks reduce overlaps of live ranges, leading to the insertion of less costly spill code. However, there is little previous research on the extension of this problem to global code motion, i.e., the motion of instructions between blocks. We present an algorithm that models global code motion as an optimization problem with the goal of minimizing overlaps between live ranges in order to minimize spill code.

Our approach analyzes the program to identify the live range overlaps for all possible placements of instructions in basic blocks and all orderings of instructions within blocks. Using this information, we formulate an optimization problem to determine code motions and partial local schedules that minimize the overall cost of live range overlaps. We evaluate solutions of this optimization problem using integer linear programming, where feasible, and a simple greedy heuristic.
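To make the notion of live range overlap concrete, here is a minimal Python sketch for a single basic block. It is not the authors' formulation: it replaces the integer linear program with an exhaustive search over valid orderings, it uses the maximum number of simultaneously live values as a stand-in for overlap cost, and the block `USES` and all function names are hypothetical.

```python
from itertools import permutations

# Hypothetical basic block. Each instruction defines one value (named after
# the instruction) and lists the values it uses.
USES = {
    "a": [],          # a = load ...
    "b": [],          # b = load ...
    "c": [],          # c = load ...
    "d": ["a", "b"],  # d = a + b
    "e": ["d", "c"],  # e = d * c
}


def is_valid(order, uses):
    """True if every instruction is placed after all of its operands."""
    pos = {ins: i for i, ins in enumerate(order)}
    return all(pos[u] < pos[ins] for ins in order for u in uses[ins])


def max_pressure(order, uses):
    """Largest number of values live at once under this ordering.

    A value is live from its definition up to (not including) its last use.
    """
    pos = {ins: i for i, ins in enumerate(order)}
    last_use = dict(pos)  # a value with no use dies at its definition
    for ins in order:
        for u in uses[ins]:
            last_use[u] = max(last_use[u], pos[ins])
    return max(
        sum(1 for v in order if pos[v] <= i < last_use[v])
        for i in range(len(order))
    )


def best_order(uses):
    """Exhaustive search: the valid ordering with the lowest peak pressure."""
    valid = (p for p in permutations(uses) if is_valid(p, uses))
    return min(valid, key=lambda p: max_pressure(p, uses))


if __name__ == "__main__":
    naive = ("a", "b", "c", "d", "e")
    best = best_order(USES)
    print(naive, max_pressure(naive, USES))
    print(best, max_pressure(best, USES))
```

In this toy block, issuing all three loads first keeps three values live at once, while computing `d` before loading `c` needs only two. Exhaustive search is factorial in the block size, so it is only meant to illustrate the objective, not to scale.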

We conclude that global code motion with the sole goal of avoiding spills rarely leads to performance improvements because code is placed too conservatively. On the other hand, purely local optimal instruction scheduling for minimal spilling is effective at improving performance when compared to a heuristic scheduler for minimal register use.

Keywords

Basic Block, Dependence Graph, Register Allocation, Instruction Schedule, Code Motion
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Gergö Barany (1)
  • Andreas Krall (1)
  1. Vienna University of Technology, Austria
