Abstract
The load-store queue (LSQ) of modern superscalar processors is responsible for maintaining the order of memory operations. As the gap between processor speed and memory access latency widens, the LSQ needs more capacity, and its design becomes challenging because of its content-addressable memory (CAM) structure. In this paper we propose an efficient load-store queue state filtering mechanism that provides a significant energy reduction (on average 35% in the LSQ and 3.5% in the whole processor) and incurs only a negligible performance loss of less than 0.6%.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Castro, F., Chaver, D., Pinuel, L., Prieto, M., Huang, M.C., Tirado, F. (2005). A Power-Efficient and Scalable Load-Store Queue Design. In: Paliouras, V., Vounckx, J., Verkest, D. (eds) Integrated Circuit and System Design. Power and Timing Modeling, Optimization and Simulation. PATMOS 2005. Lecture Notes in Computer Science, vol 3728. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11556930_1
DOI: https://doi.org/10.1007/11556930_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29013-1
Online ISBN: 978-3-540-32080-7
eBook Packages: Computer Science, Computer Science (R0)