Power Improvement Using Block-Based Loop Buffer with Innermost Loop Control

Zhong, Ming-Yuan; Shieh, Jong-Jiann

doi:10.1007/978-3-642-13136-3_38

Ming-Yuan Zhong & Jong-Jiann Shieh

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6082)

Abstract

The on-chip cache consumes a substantial portion of the energy in today’s processors. Because loops exhibit temporal locality, loop buffers have been proposed to exploit it. We attempt to apply the concept of a trace cache to the loop buffer architecture; however, a trace cache is quite bulky and complicated. Using a trace cache as a loop buffer does save energy, but it degrades overall performance because of the long latency at the fetch stage. We therefore propose two methods: (1) detecting innermost loops at the commit stage while filling and activating the loop buffer at the fetch stage; and (2) helping the loop buffer store innermost loops that contain forward branches by packing the instructions captured from the instruction cache into basic blocks. With these modifications, we aim to strengthen the loop buffer so that it improves performance and saves more power. Results with SPEC2000 show reductions in instruction fetch power of up to 45% (integer benchmarks) and 55% (floating-point benchmarks) compared with a design without a loop buffer. Furthermore, we obtain a power improvement of 3% (integer benchmarks) and 2% (floating-point benchmarks) over a loop buffer design that handles loops at the fetch stage.
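
The general idea behind a loop buffer with innermost-loop control can be illustrated with a small behavioral model. The sketch below is not the authors' design: it is a minimal Python model written under stated assumptions, and the names `LoopBufferModel`, `loop_trace`, and the default `capacity` of 32 instructions are hypothetical. It assumes instruction addresses are non-negative instruction indices, treats any committed taken backward branch as a loop-closing branch, and ignores pipeline timing, so fetch and commit of one instruction happen back to back. It only counts how many fetches are served by the instruction cache and how many by the buffer.

```python
from dataclasses import dataclass, field


@dataclass
class LoopBufferModel:
    """Toy model: count cache fetches vs. loop-buffer fetches (assumption-laden)."""
    capacity: int = 32            # hypothetical buffer size, in instructions
    loop_start: int = -1          # target of the tracked backward branch
    loop_end: int = -1            # address of the tracked backward branch
    filled: set = field(default_factory=set)  # addresses already in the buffer
    cache_fetches: int = 0
    buffer_fetches: int = 0

    def _untrack(self):
        # Forget the current loop and invalidate the buffer contents.
        self.loop_start = self.loop_end = -1
        self.filled.clear()

    def fetch(self, pc):
        """Fetch stage: serve from the buffer if possible, otherwise fill it."""
        in_loop = self.loop_start <= pc <= self.loop_end
        if not in_loop:
            self._untrack()               # execution left the tracked loop
        if pc in self.filled:
            self.buffer_fetches += 1      # instruction cache is not accessed
        else:
            self.cache_fetches += 1
            if in_loop:
                self.filled.add(pc)       # capture the instruction for reuse

    def commit(self, pc, taken, target):
        """Commit stage: a taken backward branch identifies a loop."""
        if not taken or target > pc:
            return                        # not a loop-closing branch
        if pc == self.loop_end and target == self.loop_start:
            return                        # same loop, keep the buffer contents
        if pc - target + 1 <= self.capacity:
            # A different backward branch replaces the tracked loop, so an
            # inner loop always displaces an enclosing outer loop.
            self.loop_start, self.loop_end = target, pc
            self.filled = set()
        else:
            self._untrack()               # loop body does not fit

    def run(self, trace):
        for pc, taken, target in trace:
            self.fetch(pc)
            self.commit(pc, taken, target)


def loop_trace(iterations):
    """Trace of (pc, taken, target): one instruction, then a 3-instruction loop."""
    trace = [(0, False, 0)]
    for i in range(iterations):
        last = i == iterations - 1
        trace += [(1, False, 0), (2, False, 0), (3, not last, 1)]
    trace.append((4, False, 0))
    return trace


if __name__ == "__main__":
    model = LoopBufferModel()
    model.run(loop_trace(4))
    print(model.cache_fetches, model.buffer_fetches)
```

Because the model tracks filled addresses individually, instructions skipped by a forward branch inside the loop are simply fetched from the cache the first time they are reached. The counts it produces are illustrative only and say nothing about the power figures reported above.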

Author information

Authors and Affiliations

No Institute Given,
Ming-Yuan Zhong
Department of Computer Science and Engineering, Tatung University, No 40 Chungshan North Road Section 3, Taipei, Taiwan, 104
Jong-Jiann Shieh

Editor information

Editors and Affiliations

Department of Computer Science and Information Engineering, Chung Hua University, 300, Hsinchu, Taiwan, China
Ching-Hsien Hsu
Department of Computer Science, St. Francis Xavier University, B2G 2W5, Antigonish, NS, Canada
Laurence T. Yang
Department of Computer Science and Engineering, Seoul National University of Technology, 172 Gongreund 2-dong, Nowon-gou, 139-742, Seoul, Korea
Jong Hyuk Park
Division of Computer Engineering, Mokwon University, 302-729, Daejeon, Korea
Sang-Soo Yeo

About this paper

Cite this paper

Zhong, MY., Shieh, JJ. (2010). Power Improvement Using Block-Based Loop Buffer with Innermost Loop Control. In: Hsu, CH., Yang, L.T., Park, J.H., Yeo, SS. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2010. Lecture Notes in Computer Science, vol 6082. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13136-3_38

DOI: https://doi.org/10.1007/978-3-642-13136-3_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13135-6
Online ISBN: 978-3-642-13136-3
eBook Packages: Computer Science, Computer Science (R0)
