Abstract
We present a combined architectural and circuit technique for reducing the energy dissipation of microprocessor memory structures. The approach exploits the subarray partitioning of high-speed memories, together with varying application requirements, to dynamically disable partitions during appropriate execution periods. When applied to 4-way set-associative caches, accepting a 2% performance degradation yields a combined 40% reduction in L1 Dcache and L2 cache energy dissipation.
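To make the idea of disabling partitions concrete, the following is a minimal toy sketch in Python, not the chapter's circuit design or simulator. It models a set-associative cache in which the number of enabled ways can be changed at run time. The class name `WaySelectableCache`, the LRU replacement policy, the choice to drop the contents of disabled ways, and the use of "ways probed per access" as a stand-in for energy are all assumptions made for illustration.

```python
from collections import OrderedDict


class WaySelectableCache:
    """Toy set-associative cache whose ways can be disabled at run time.

    Hypothetical model for illustration only: LRU replacement, and the
    contents of disabled ways are simply discarded.
    """

    def __init__(self, num_sets=4, num_ways=4, block_size=32):
        self.num_sets = num_sets
        self.num_ways = num_ways
        self.block_size = block_size
        self.enabled_ways = num_ways
        # One LRU-ordered mapping of tags per set (oldest entry first).
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0
        # Assumed energy proxy: every access probes each enabled way once.
        self.way_probes = 0

    def set_enabled_ways(self, n):
        """Enable exactly n ways (1 <= n <= num_ways)."""
        if not 1 <= n <= self.num_ways:
            raise ValueError("enabled ways must be between 1 and num_ways")
        self.enabled_ways = n
        # Simplification: evict least recently used blocks that no longer fit.
        for lines in self.sets:
            while len(lines) > n:
                lines.popitem(last=False)

    def access(self, address):
        """Access one byte address; return True on a hit, False on a miss."""
        block = address // self.block_size
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        self.way_probes += self.enabled_ways
        if tag in lines:
            lines.move_to_end(tag)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(lines) >= self.enabled_ways:
            lines.popitem(last=False)  # evict the least recently used block
        lines[tag] = True
        return False


if __name__ == "__main__":
    # A small working set that fits in two ways of a single set.
    trace = [0, 1] * 6
    for ways in (4, 2):
        cache = WaySelectableCache(num_sets=1, num_ways=4, block_size=1)
        cache.set_enabled_ways(ways)
        for addr in trace:
            cache.access(addr)
        print(ways, cache.hits, cache.misses, cache.way_probes)
```

In this toy model, a working set that fits in fewer ways keeps the same hit count while probing fewer ways per access, which is the kind of trade-off the abstract describes. The 2% and 40% figures reported above come from the chapter's own evaluation and are not reproduced by this sketch.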
This research is supported by the National Science Foundation under CAREER Award CCR-9701915 and grant CCR-9811929.
The original version of this chapter was revised: The copyright line was incorrect. This has been corrected. The Erratum to this chapter is available at DOI: 10.1007/978-0-387-35498-9_57
© 2000 IFIP International Federation for Information Processing
Albonesi, D.H. (2000). An Architectural and Circuit-Level Approach to Improving the Energy Efficiency of Microprocessor Memory Structures. In: Silveira, L.M., Devadas, S., Reis, R. (eds) VLSI: Systems on a Chip. IFIP — The International Federation for Information Processing, vol 34. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-35498-9_18
DOI: https://doi.org/10.1007/978-0-387-35498-9_18
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4757-1014-4
Online ISBN: 978-0-387-35498-9
eBook Packages: Springer Book Archive