Abstract
High-performance embedded systems can only be developed when efficiency requirements are pursued at different levels of the system design. Compilers play a predominant role here, since they are responsible for generating efficient machine code. To accomplish this goal, compilers have to feature advanced optimizations. Source code optimizations offer a number of benefits over optimizations applied at lower abstraction levels of the code. The most important are portability, early application in the optimization sequence, which enables subsequent optimizations, and the availability of more details about the program structure due to the high level of abstraction. In this chapter, novel WCET-aware source code level optimizations are presented, including procedure cloning, superblock optimizations, loop unrolling, and loop unswitching. Moreover, a technique called invariant path is presented that accelerates WCET-aware optimizations.
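To make one of the named transformations concrete, the following is a minimal C sketch of conventional loop unswitching at the source code level. It is a generic textbook illustration and not the WCET-aware variant presented in the chapter. The function names `sum_original` and `sum_unswitched` and the flag `use_abs` are hypothetical and chosen only for this example.

```c
#include <stddef.h>

/* Original loop: the loop-invariant flag use_abs is tested in every iteration. */
long sum_original(const int *data, size_t n, int use_abs)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (use_abs)
            sum += (data[i] < 0) ? -(long)data[i] : data[i];
        else
            sum += data[i];
    }
    return sum;
}

/* Unswitched loop: the invariant test is hoisted out of the loop and the
 * loop body is duplicated, one copy per branch. Each copy is free of the
 * inner branch, at the cost of larger code. */
long sum_unswitched(const int *data, size_t n, int use_abs)
{
    long sum = 0;
    if (use_abs) {
        for (size_t i = 0; i < n; i++)
            sum += (data[i] < 0) ? -(long)data[i] : data[i];
    } else {
        for (size_t i = 0; i < n; i++)
            sum += data[i];
    }
    return sum;
}
```

Both functions return the same result for every input. The unswitched version trades code size for fewer branches per iteration, which is the kind of trade-off an optimization of this class has to weigh.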
Copyright information
© 2011 Springer Science+Business Media B.V.
About this chapter
Cite this chapter
Lokuciejewski, P., Marwedel, P. (2011). WCET-Aware Source Code Level Optimizations. In: Worst-Case Execution Time Aware Compilation Techniques for Real-Time Systems. Embedded Systems. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9929-7_4
DOI: https://doi.org/10.1007/978-90-481-9929-7_4
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-9928-0
Online ISBN: 978-90-481-9929-7
eBook Packages: Engineering, Engineering (R0)