A Fine-Grained Runtime Power/Performance Optimization Method for Processors with Adaptive Pipeline Depth

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Pipeline stage unification (PSU) has recently been proposed to curb the rising energy consumption of modern microprocessors. PSU achieves high energy efficiency by varying the pipeline depth, and its operating scheme lends itself to fine-grained control. In this paper, we propose a dynamic method that characterizes fine-grained program interval behavior from runtime processor metrics that are easy to obtain. Using this method to select the appropriate PSU configuration during program execution, we achieve an average 13.5% reduction in energy-delay product (EDP) for the SPEC CPU2000 integer benchmarks, compared with the baseline processor. This value is only 0.14% larger than that obtained under theoretically ideal control. Our hardware synthesis results indicate that the proposed method substantially reduces hardware overhead, in both area and delay, compared with a previous program analysis method based on working set signatures.
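
As a rough illustration of the EDP metric used above, the following Python sketch compares a fixed baseline configuration against a per-interval choice of pipeline configuration. It is not the paper's control method: the configuration names (`U1`, `U2`, `U4`), the energy and delay figures, and the greedy per-interval selection rule are all assumptions made for this example. The only fact it relies on is that EDP is energy multiplied by execution time.

```python
from typing import Callable, Dict, List, Tuple

# One program interval: config name -> (energy in joules, delay in seconds).
# All names and numbers here are hypothetical, chosen only for illustration.
Interval = Dict[str, Tuple[float, float]]


def edp(energy_j: float, delay_s: float) -> float:
    """Energy-delay product: energy multiplied by execution time."""
    return energy_j * delay_s


def best_config(interval: Interval) -> str:
    """Greedy choice: the configuration with the lowest EDP in this interval."""
    return min(interval, key=lambda name: edp(*interval[name]))


def total_edp(intervals: List[Interval], choose: Callable[[Interval], str]) -> float:
    """Whole-run EDP: total energy times total delay under a selection rule."""
    energy = sum(iv[choose(iv)][0] for iv in intervals)
    delay = sum(iv[choose(iv)][1] for iv in intervals)
    return edp(energy, delay)


def edp_reduction(intervals: List[Interval], baseline: str) -> float:
    """Fractional EDP reduction of per-interval selection over a fixed baseline."""
    fixed = total_edp(intervals, lambda iv: baseline)
    adaptive = total_edp(intervals, best_config)
    return 1.0 - adaptive / fixed


# Two hypothetical intervals: the first favors a shallower pipeline (U2),
# the second favors the full-depth baseline (U1).
intervals: List[Interval] = [
    {"U1": (4.0, 1.0), "U2": (2.5, 1.3), "U4": (1.8, 2.2)},
    {"U1": (4.0, 1.0), "U2": (3.0, 1.6), "U4": (2.4, 3.0)},
]

reduction = edp_reduction(intervals, baseline="U1")
print(f"EDP reduction vs. baseline: {reduction:.1%}")
```

Note that minimizing EDP interval by interval is a heuristic: whole-run EDP is the product of summed energy and summed delay, so the locally best choice is not guaranteed to be globally optimal.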


Author information

Corresponding author

Correspondence to Jun Yao.

Additional information

This work was begun at Kyoto University, Japan, and continued at the Nara Institute of Science and Technology (NAIST), Japan. It is partially supported by the VLSI Design and Education Center (VDEC), University of Tokyo, in collaboration with Synopsys Corporation.

About this article

Cite this article

Yao, J., Miwa, S., Shimada, H. et al. A Fine-Grained Runtime Power/Performance Optimization Method for Processors with Adaptive Pipeline Depth. J. Comput. Sci. Technol. 26, 292–301 (2011). https://doi.org/10.1007/s11390-011-9436-3

  • DOI: https://doi.org/10.1007/s11390-011-9436-3
