Compiling for Speculative Architectures

  • Seon Wook Kim
  • Rudolf Eigenmann
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1863)


The traditional target machine of a parallelizing compiler can execute code sections either serially or in parallel. In contrast, targeting the generated code to a speculative parallel processor allows the compiler to recognize parallelism to the best of its abilities and leave other optimization decisions up to the processor's runtime detection mechanisms. In this paper we show that simple improvements to the compiler's speculative task selection method can already yield a significant improvement in speedup (up to 55%) over a simple code generator for a Multiscalar architecture. For even closer software/hardware cooperation we propose an interface that allows the compiler to inform the processor about fully parallel, serial, and speculative code sections, as well as attributes of program variables. We have evaluated the degrees of parallelism that such a co-design can realistically exploit.


Keywords (machine-generated): Basic Block, Parallel Execution, Task Selection, Parallel Task, Speculative Architecture




Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Seon Wook Kim (1)
  • Rudolf Eigenmann (1)
  1. School of Electrical and Computer Engineering, Purdue University, West Lafayette
