
Exploiting speculative thread-level parallelism on a SMT processor

  • Track C3: Computational Science
  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1593))

Abstract

In this paper we present a run-time mechanism to simultaneously execute multiple threads from a sequential program on a simultaneous multithreaded (SMT) processor. The threads are speculative in the sense that they are created by predicting the future control flow of the program. Moreover, threads are not necessarily independent: data dependences may exist among simultaneously executed threads. To avoid the serialization that such dependences may cause, inter-thread dependences, as well as the values that flow through them, are predicted. Speculative threads correspond to different iterations of the same loop, which may significantly reduce the fetch bandwidth requirements since many instructions are shared by several threads. The performance evaluation results show a significant improvement over single-threaded execution, which demonstrates the potential of the mechanism to exploit otherwise unused hardware contexts. Moreover, for some programs the new processor architecture can achieve an IPC (instructions per cycle) even higher than the peak fetch bandwidth.
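To make the idea in the abstract concrete, the sketch below is a minimal, purely illustrative Python model, not the paper's hardware mechanism or its simulator: consecutive loop iterations run as speculative "threads", the loop-carried value each speculative iteration needs is guessed by a simple stride predictor, and iterations whose predicted input turns out to be wrong are squashed and re-executed. All names (iteration_body, predict_next, run_speculative_loop) are invented for this example.

    # Toy model only: speculative loop-iteration threads with value prediction.
    # This is NOT the mechanism proposed in the paper, just an illustration of
    # the general idea it describes.

    def iteration_body(i, carried):
        """One loop iteration: consumes the loop-carried value, produces the next."""
        return carried + i          # arbitrary per-iteration work

    def predict_next(history):
        """Stride predictor for the loop-carried value (an assumption of this sketch)."""
        if len(history) < 2:
            return history[-1] if history else 0
        stride = history[-1] - history[-2]
        return history[-1] + stride

    def run_speculative_loop(n_iters, n_threads, init=0):
        """Run iterations in windows of n_threads 'hardware contexts'.

        The first iteration of each window is non-speculative; the others start
        from a predicted loop-carried value and are squashed if that prediction
        turns out to be wrong.
        """
        carried = init
        history = [init]
        squashes = 0
        i = 0
        while i < n_iters:
            window = min(n_threads, n_iters - i)
            # Speculative iterations guess their incoming loop-carried value.
            preds = [carried]
            for _ in range(1, window):
                preds.append(predict_next(history + preds[1:]))
            # "Execute" the window; in hardware these would run concurrently.
            outs = [iteration_body(i + k, preds[k]) for k in range(window)]
            # Validate left to right: an iteration commits only if its predicted
            # input equals the actual output of its predecessor.
            committed = 1
            for k in range(1, window):
                if preds[k] != outs[k - 1]:
                    squashes += 1   # misprediction: squash this and later iterations
                    break
                committed += 1
            carried = outs[committed - 1]
            history.append(carried)
            i += committed
        return carried, squashes

    if __name__ == "__main__":
        seq = 0
        for i in range(16):
            seq = iteration_body(i, seq)
        spec, squashes = run_speculative_loop(16, 4)
        print(seq, spec, squashes)   # speculative result matches sequential execution

In this toy model the speculative result always matches sequential execution, because a speculative iteration commits only after its predicted input has been validated against its predecessor's actual output; the potential gain comes from the iterations that could have executed concurrently instead of waiting for the loop-carried value.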

Editor information

Peter Sloot, Marian Bubak, Alfons Hoekstra, Bob Hertzberger

Copyright information

© 1999 Springer-Verlag

About this paper

Cite this paper

Marcuello, P., González, A. (1999). Exploiting speculative thread-level parallelism on a SMT processor. In: Sloot, P., Bubak, M., Hoekstra, A., Hertzberger, B. (eds) High-Performance Computing and Networking. HPCN-Europe 1999. Lecture Notes in Computer Science, vol 1593. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0100636

  • DOI: https://doi.org/10.1007/BFb0100636

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65821-4

  • Online ISBN: 978-3-540-48933-7

  • eBook Packages: Springer Book Archive
