Do Trace Cache, Value Prediction and Prefetching Improve SMT Throughput?

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3894)

Abstract

While trace cache, value prediction, and prefetching have been shown to be effective in the single-threaded superscalar, there has been no analysis of these techniques in a Simultaneously Multithreaded (SMT) processor. SMT brings new factors both for and against these techniques, and it is not known how these techniques would fare in SMT. We evaluate these techniques in an SMT to provide recommendations for future SMT designs. Our key contributions are: (1) we identify a fundamental interaction between the techniques and SMT’s sharing of resources among multiple threads, and (2) we quantify the impact of this interaction on SMT throughput. SMT’s sharing of the instruction storage (i.e., trace cache or i-cache), physical registers, and issue queue impacts the effectiveness of trace cache, value prediction, and prefetching, respectively.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cher, CY., Park, I., VijayKumar, T.N. (2006). Do Trace Cache, Value Prediction and Prefetching Improve SMT Throughput?. In: Grass, W., Sick, B., Waldschmidt, K. (eds) Architecture of Computing Systems - ARCS 2006. ARCS 2006. Lecture Notes in Computer Science, vol 3894. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11682127_17

  • DOI: https://doi.org/10.1007/11682127_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-32765-3

  • Online ISBN: 978-3-540-32766-0

  • eBook Packages: Computer Science, Computer Science (R0)