Abstract
We present the PARROT concept, which targets both higher performance and power awareness. The PARROT microarchitectural framework integrates trace caching, dynamic optimizations and pipeline decoupling. We take a gradual and selective approach, applying complex mechanisms only to the most frequently used traces in order to maximize the performance gain under any given power constraint, and thus attain finer control over the tradeoff between performance and power awareness.
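The gradual, selective idea can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware mechanism described in the paper: the class name, the two thresholds and the stage labels are all hypothetical. It shows how per-trace execution counters could promote a trace first into a trace cache and only later, once it proves very frequent, into the more expensive optimization stage.

```python
from collections import Counter

# Hypothetical thresholds, chosen for illustration only.
HOT_THRESHOLD = 4        # executions before a trace is kept in the trace cache
BLAZING_THRESHOLD = 16   # executions before a trace is worth optimizing


class SelectiveTracePipeline:
    """Toy model of gradual promotion: cold -> cached -> optimized."""

    def __init__(self) -> None:
        self.counts = Counter()   # execution count per trace id
        self.trace_cache = set()  # traces promoted to the trace cache
        self.optimized = set()    # traces that received dynamic optimization

    def execute(self, trace_id: str) -> str:
        """Record one execution and return the stage that serves the trace."""
        self.counts[trace_id] += 1
        n = self.counts[trace_id]
        if n >= HOT_THRESHOLD:
            self.trace_cache.add(trace_id)
        if n >= BLAZING_THRESHOLD:
            self.optimized.add(trace_id)
        if trace_id in self.optimized:
            return "optimized"
        if trace_id in self.trace_cache:
            return "cached"
        return "cold"


pipeline = SelectiveTracePipeline()
stages = [pipeline.execute("A") for _ in range(16)]
print(stages[0], stages[3], stages[15])
```

In this model, rarely executed traces never pay the cost of caching or optimization, which is the intuition behind spending power only where it yields the most performance.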
We show that the PARROT microarchitecture delivers performance increases comparable to those available through conventional doubling of execution resources (average 16% IPC improvement). This improvement comes from better utilization of all available resources through the combination of a trace cache and selective trace optimization. By contrast, the performance advantage of a trace cache alone is limited to wide-machine configurations. No less critical, however, is power awareness. The PARROT microarchitecture delivers the performance increase at a comparable energy level, whereas the conventional path to higher performance consumes an average of 70% more energy. Meanwhile, for designs that can tolerate a higher power budget, PARROT scales up gracefully to use additional execution resources in a uniformly efficient manner. In particular, a PARROT-style doubly-wide machine delivers an average 45% IPC improvement while actually improving the Cubic-MIPS-per-Watt power awareness metric by over 50%.
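To make the last figure concrete, the arithmetic behind it can be sketched as follows. The abstract does not define the metric, so this assumes the usual definition of Cubic-MIPS-per-Watt as MIPS cubed divided by average power, and it further assumes an unchanged clock frequency so that the IPC gain translates directly into a MIPS gain. The function names are illustrative.

```python
def cmpw(mips: float, watts: float) -> float:
    """Cubic-MIPS-per-Watt, assumed here to be MIPS**3 / average power."""
    return mips ** 3 / watts


def max_power_ratio(perf_ratio: float, cmpw_ratio: float) -> float:
    """Largest power ratio that still achieves the given CMPW ratio.

    Follows from cmpw_ratio = perf_ratio**3 / power_ratio.
    """
    return perf_ratio ** 3 / cmpw_ratio


perf_ratio = 1.45  # 45% IPC improvement, assuming the same clock frequency
cmpw_ratio = 1.50  # "over 50%" improvement, taken at its lower bound

print(f"{max_power_ratio(perf_ratio, cmpw_ratio):.2f}")  # prints 2.03
```

Under these assumptions, a 45% performance gain together with a 50% gain in the metric implies that average power grew by no more than roughly a factor of two, since the cubic weighting rewards performance heavily.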
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rosner, R., Almog, Y., Moffie, M., Schwartz, N., Mendelson, A. (2005). PARROT: Power Awareness Through Selective Dynamically Optimized Traces. In: Falsafi, B., VijayKumar, T.N. (eds) Power-Aware Computer Systems. PACS 2003. Lecture Notes in Computer Science, vol 3164. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28641-7_14
DOI: https://doi.org/10.1007/978-3-540-28641-7_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24031-0
Online ISBN: 978-3-540-28641-7
eBook Packages: Computer Science (R0)