Reconfigurable Acceleration with Binary Compatibility for General Purpose Processors

Part of the IFIP International Federation for Information Processing book series (IFIPAICT, volume 291)


Although transistor scaling continues to follow Moore's law and more area is available to designers, clock frequency and ILP rates no longer grow at the same pace. New architectural alternatives are therefore necessary. Reconfigurable fabric is one emerging possibility: besides exploiting parallelism among instructions, it can also accelerate sequences of data-dependent ones. However, widespread use of reconfiguration is still held back by the need for special tools and compilers, which do not allow legacy code to be reused without modification. Based on these facts, this work proposes a new Binary Translation algorithm, implemented in hardware and working in parallel with the processor, that transforms sequences of instructions at run-time so they can be executed on a dynamic coarse-grain reconfigurable array tightly coupled to a traditional RISC machine. We can therefore take advantage of pure combinational logic to optimize even control-flow-oriented code in a totally transparent process, without any modification to the source code or binary. Using the Simplescalar Toolset together with the MIBench embedded benchmark suite, we show performance improvements and an area evaluation in comparison with a traditional superscalar architecture.


Keywords: Functional Unit; Area Overhead; Reconfigurable System; General Purpose Processor; Reconfigurable Hardware



Copyright information

© Springer-Verlag US 2009

Authors and Affiliations

  1. Instituto de Informática, Universidade Federal do Rio Grande do Sul, Campus do Vale, Porto Alegre, Brazil
