A Fine-Grained Runtime Power/Performance Optimization Method for Processors with Adaptive Pipeline Depth

Yao, Jun; Miwa, Shinobu; Shimada, Hajime; Tomita, Shinji

doi:10.1007/s11390-011-9436-3

Regular Paper
Published: 05 March 2011

Journal of Computer Science and Technology, Volume 26, pages 292–301 (2011)

Jun Yao¹, Shinobu Miwa², Hajime Shimada¹ & Shinji Tomita³,⁴

Abstract

Recently, a method known as pipeline stage unification (PSU) has been proposed to alleviate the growing energy consumption of modern microprocessors. PSU achieves high energy efficiency by changing the pipeline depth at runtime, and its working scheme lends itself to fine-grained control. In this paper, we propose a dynamic method that studies fine-grained program interval behavior using runtime processor metrics that are easy to obtain. Using this method to select the proper PSU configuration during program execution, we achieve an average 13.5% energy-delay-product (EDP) reduction for the SPEC CPU2000 integer benchmarks, compared to the baseline processor. This result is within 0.14% of theoretically ideal control. Our hardware synthesis results indicate that the proposed method substantially reduces hardware overhead in both area and delay, compared to a previous program study method based on working set signatures.
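As a rough illustration of the metric named in the abstract, the energy-delay product of an execution interval is its energy multiplied by its execution time, so a configuration that saves energy only pays off if it does not slow the interval down by a larger factor. The sketch below is not the paper's controller. It only shows how a per-interval choice by minimum EDP could be expressed. The configuration names (`full`, `half`, `quarter`) and all energy and delay numbers are hypothetical and chosen purely for illustration.

```python
from typing import Dict, Tuple

# (energy, delay) estimates for one program interval, normalized to the
# baseline configuration. All values are hypothetical, not from the paper.
Estimate = Tuple[float, float]


def edp(energy: float, delay: float) -> float:
    """Energy-delay product: lower is better."""
    return energy * delay


def pick_config(estimates: Dict[str, Estimate]) -> str:
    """Return the name of the configuration with the smallest EDP."""
    return min(estimates, key=lambda name: edp(*estimates[name]))


def edp_reduction(baseline: Estimate, candidate: Estimate) -> float:
    """Fractional EDP reduction of candidate relative to baseline."""
    return 1.0 - edp(*candidate) / edp(*baseline)


# Hypothetical interval: a shallower pipeline uses less energy but runs longer.
estimates = {
    "full": (1.0, 1.0),     # baseline pipeline depth
    "half": (0.6, 1.5),     # assumed: half the stages
    "quarter": (0.4, 2.8),  # assumed: a quarter of the stages
}

best = pick_config(estimates)
reduction = edp_reduction(estimates["full"], estimates[best])
print(best, round(reduction, 3))
```

In this made-up interval the intermediate configuration wins: the shallowest pipeline saves the most energy, but its longer execution time outweighs that saving in the product.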

Author information

Authors and Affiliations

Graduate School of Information Science, Nara Institute of Science and Technology, Ikoma, 630–0192, Japan
Jun Yao (Member, IEEE) & Hajime Shimada
School of Engineering, Tokyo University of Agriculture and Technology, Tokyo, 183–8538, Japan
Shinobu Miwa
Graduate School of Informatics, Kyoto University, Kyoto, 606–8501, Japan
Shinji Tomita (Member, ACM, IEEE)
Institute for Integrated Cell-Material Sciences, Kyoto University, Kyoto, 606–8501, Japan
Shinji Tomita (Member, ACM, IEEE)

Corresponding author

Correspondence to Jun Yao.

Additional information

This work was begun at Kyoto University, Japan, and continued at the Nara Institute of Science and Technology (NAIST), Japan. It is partially supported by the VLSI Design and Education Center (VDEC), University of Tokyo, in collaboration with Synopsys Corporation.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

(PDF 124 kb)

About this article

Cite this article

Yao, J., Miwa, S., Shimada, H. et al. A Fine-Grained Runtime Power/Performance Optimization Method for Processors with Adaptive Pipeline Depth. J. Comput. Sci. Technol. 26, 292–301 (2011). https://doi.org/10.1007/s11390-011-9436-3

Received: 06 August 2009
Revised: 15 November 2010
Published: 05 March 2011
Issue Date: March 2011
DOI: https://doi.org/10.1007/s11390-011-9436-3
