Hardware-Aware Compilation

Shrivastava, Aviral; Cai, Jian

doi:10.1007/978-94-017-7267-9_26

Abstract

Hardware-aware compilers are in high demand for embedded systems with stringent, multidimensional design constraints on cost, power, performance, and so on. By using microarchitectural information about a processor, a hardware-aware compiler can exploit highly customized microarchitectural features and generate more efficient code than a generic compiler while still meeting the design constraints. In this chapter, we introduce two applications of hardware-aware compilers: as a production compiler, and as a tool to efficiently explore the design space of embedded processors. We demonstrate the first application with a compiler that generates efficient code to reduce branch penalties on embedded processors that do not have any branch predictor. To demonstrate the second application, we show how a hardware-aware compiler can be used to explore the design space of bypass designs in a processor. In both cases, the hardware-aware compiler can generate better code than a hardware-ignorant compiler.

Abbreviations

ADL: Architecture Description Language
BRF: Bypass Register File
BTB: Branch Target Buffer
CFG: Control-Flow Graph
CIL: Compiler-In-the-Loop
DSE: Design Space Exploration
HPC: Horizontally Partitioned Cache
ISA: Instruction-Set Architecture
MAC: Multiply-Accumulator
OT: Operation Table
RT: Response Time
SPU: Synergistic Processor Unit

Author information

Authors and Affiliations

School of Computing, Informatics and Decision Systems Engineering, Arizona State University, 85281, Tempe, AZ, USA
Aviral Shrivastava
Arizona State University, 85281, Tempe, AZ, USA
Jian Cai

Corresponding author

Correspondence to Aviral Shrivastava.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Seoul National University, Seoul, Korea (Republic of)
Soonhoi Ha
Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
Jürgen Teich

About this entry

Cite this entry

Shrivastava, A., Cai, J. (2017). Hardware-Aware Compilation. In: Ha, S., Teich, J. (eds) Handbook of Hardware/Software Codesign. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-7267-9_26

DOI: https://doi.org/10.1007/978-94-017-7267-9_26
Published: 27 September 2017
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-017-7266-2
Online ISBN: 978-94-017-7267-9
eBook Packages: Engineering, Reference Module Computer Science and Engineering
