Exploring the Potential of Architecture-Level Power Optimizations

  • John S. Seng
  • Dean M. Tullsen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3164)

Abstract

This paper examines the limits of microprocessor energy reduction available via certain classes of architecture-level optimization. It focuses on three sources of wasted energy. The first is the execution of instructions that are unnecessary for correct program execution. The second is speculation waste: energy spent on speculatively executed instructions that do not commit their results. The third is architectural waste, which comes from suboptimal sizing of processor structures. The study shows that eliminating these sources of waste could reduce processor energy by 55% and 52% for the integer and floating-point benchmarks, respectively.

Keywords

Computer Architecture; Point Benchmark; Data Cache; Instruction Cache; Annual International Symposium
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • John S. Seng
    • 1
  • Dean M. Tullsen
    • 2
  1. Dept. of Computer Science, Cal Poly State University, San Luis Obispo, USA
  2. Dept. of Computer Science and Engineering, University of California, San Diego, La Jolla, USA
