Application-Specific Processors

  • Reference work entry
Handbook of Hardware/Software Codesign

Abstract

General-Purpose Processors (GPPs) and Application-Specific Integrated Circuits (ASICs) are the two extreme choices for computational engines. GPPs offer complete flexibility but are inefficient in terms of both performance and energy. In contrast, ASICs are highly energy-efficient and provide the best performance, but at the cost of zero flexibility. Application-specific processors, or custom processors, bridge the gap between these two alternatives by offering improved power-performance efficiency within the familiar software programming environment. An application-specific processor architecture augments the base instruction-set architecture with customized instructions that encapsulate the frequently occurring computational patterns within an application. These custom instructions are implemented in hardware, enabling performance acceleration and energy benefits. The challenge lies in inventing automated tools that can design an application-specific processor by identifying and implementing custom instructions from application software specified in high-level programming languages. In this chapter, we present the benefits of application-specific processors, their architecture, the automated design flow, and the renewed interest in this class of architectures from an energy-efficiency perspective.
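To make the idea of a custom instruction concrete, the following is a minimal sketch, not taken from the chapter, that counts the operations issued for a dot product with and without a fused multiply-accumulate instruction. The function names `dot_base` and `dot_custom`, and the assumption that each base or custom instruction counts as one issued operation, are hypothetical simplifications for illustration only.

```python
def dot_base(a, b):
    """Dot product on a base ISA: one multiply and one add per element."""
    acc, ops = 0, 0
    for x, y in zip(a, b):
        prod = x * y      # base instruction: mul
        acc = acc + prod  # base instruction: add
        ops += 2          # two instructions issued per element
    return acc, ops


def dot_custom(a, b):
    """Same computation, assuming a hypothetical fused 'mac' custom instruction."""
    acc, ops = 0, 0
    for x, y in zip(a, b):
        acc = acc + x * y  # modeled as a single custom instruction: mac
        ops += 1           # one instruction issued per element
    return acc, ops


if __name__ == "__main__":
    a, b = [1, 2, 3], [4, 5, 6]
    print(dot_base(a, b))
    print(dot_custom(a, b))
```

Both functions compute the same result; only the modeled instruction count differs. A real design flow would also weigh the hardware area and latency of the custom functional unit, which this sketch ignores.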

Abbreviations

ALU: Arithmetic-Logic Unit
ASIC: Application-Specific Integrated Circuit
BERET: Bundled Execution of REcurring Traces
CAD: Computer-Aided Design
CCA: Configurable Compute Accelerator
CFG: Control-Flow Graph
CFU: Custom Functional Unit
CIS: Custom Instruction-Set
DFG: Data-Flow Graph
DISC: Dynamic Instruction-Set Computer
DSP: Digital Signal Processor
FPGA: Field-Programmable Gate Array
GPP: General-Purpose Processor
GPU: Graphics Processing Unit
ILP: Integer Linear Program
IR: Intermediate Representation
ISA: Instruction-Set Architecture
ISEF: Stretch Instruction-Set Extension Fabric
MAC: Multiply-Accumulator
MIMO: Multiple Input Multiple Output
MISO: Multiple Input Single Output
PFU: Programmable Functional Unit
PRISC: Programmable Reduced Instruction-Set Computer
RAM: Random-Access Memory
RISC: Reduced Instruction-Set Computer
RISPP: Rotating Instruction-Set Processing Platform
SFU: Specialized Functional Unit
VLIW: Very Long Instruction Word

Acknowledgements

This work was partially supported by the Singapore Ministry of Education Academic Research Fund Tier 2, MOE2014-T2-2-129.

Author information

Corresponding author

Correspondence to Tulika Mitra.

Copyright information

© 2017 Springer Science+Business Media Dordrecht

About this entry

Cite this entry

Mitra, T. (2017). Application-Specific Processors. In: Ha, S., Teich, J. (eds) Handbook of Hardware/Software Codesign. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-7267-9_13
