Accelerating Speculative Execution in High-Level Synthesis with Cancel Tokens

  • Hagen Gädke
  • Andreas Koch
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4943)


We present an improved method for scheduling speculative data paths that relies on cancel tokens to undo computations in mis-speculated paths. In terms of performance, this method is considerably faster than lenient execution, and faster than any other known approach applicable to general (including non-pipelined) computation structures. We present experimental evidence obtained by implementing our method as part of the high-level language hardware/software compiler COMRADE.


Keywords: Data Path, Loop Iteration, Speculative Execution, Integrate Circuit Design, Parallel Kernel
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Hagen Gädke (1)
  • Andreas Koch (2)
  1. Integrated Circuit Design (E.I.S.), Technische Universität Braunschweig, Germany
  2. Embedded Systems and Applications Group (ESA), Technische Universität Darmstadt, Germany
