Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 3728)

Abstract

The load-store queue (LQ-SQ) of modern superscalar processors is responsible for maintaining the order of memory operations. As the performance gap between processing speed and memory access widens, the capacity requirements for the LQ-SQ increase, and its design becomes challenging because of its CAM (content-addressable memory) structure. In this paper we propose an efficient load-store queue state filtering mechanism that provides a significant energy reduction (on average 35% in the LQ-SQ and 3.5% in the whole processor) while incurring a negligible performance loss of less than 0.6%.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Castro, F., Chaver, D., Pinuel, L., Prieto, M., Huang, M.C., Tirado, F. (2005). A Power-Efficient and Scalable Load-Store Queue Design. In: Paliouras, V., Vounckx, J., Verkest, D. (eds) Integrated Circuit and System Design. Power and Timing Modeling, Optimization and Simulation. PATMOS 2005. Lecture Notes in Computer Science, vol 3728. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11556930_1

  • DOI: https://doi.org/10.1007/11556930_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29013-1

  • Online ISBN: 978-3-540-32080-7

  • eBook Packages: Computer Science, Computer Science (R0)
