RIMP: Runtime Implicit Predication

Tang, YuXing; Deng, Kun; Wang, XiaoDong; Dou, Yong; Zhou, XingMing

doi:10.1007/11573937_10

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3756)

Included in the following conference series:

International Workshop on Advanced Parallel Processing Technologies

Abstract

If-conversion and predicated execution are widely adopted to eliminate the branch misprediction penalty. Previous predicated-execution schemes depend on the compiler to generate explicitly predicated instructions. This paper discusses a trace-based predication mechanism named RIMP (Runtime IMplicit Predication), in which candidates for if-conversion are identified during dynamic execution. The conventional trace cache is modified to store RIMP traces, which include instructions from both the fall-through block and the target block following a conditional branch. A hardware extension adds predication to RIMP traces automatically. With RIMP, legacy applications can benefit from predication without recompiling their source code. The paper presents simulations of a RIMP implementation under diverse microarchitecture configurations, and the results show promising performance improvement. In general, RIMP with 64 kB of trace storage delivers an average IPC improvement of 10.3% while speeding up actual execution by over 7%.
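
As a rough illustration of the if-conversion idea the abstract refers to, the following Python sketch contrasts a branch with its predicated equivalent. It is only a software analogy under stated assumptions, not the paper's hardware mechanism: the function names `branchy` and `predicated` and the list-of-pairs `trace` representation are hypothetical, chosen here for illustration.

```python
def branchy(a, b):
    """Original control flow: only one of the two blocks executes."""
    if a > b:          # conditional branch
        r = a - b      # target block
    else:
        r = b - a      # fall-through block
    return r


def predicated(a, b):
    """If-converted form: both blocks are issued, predicates guard the writes."""
    p = a > b          # predicate derived from the former branch condition
    r = 0
    # Hypothetical trace: (guarding predicate, value the instruction would write).
    trace = [
        (p, a - b),        # instruction taken from the target block
        (not p, b - a),    # instruction taken from the fall-through block
    ]
    for guard, value in trace:
        if guard:      # models nullifying an instruction whose predicate is false
            r = value
    return r
```

Both functions return the same result for any pair of inputs; the second simply has no data-dependent choice of which block to fetch, which is the property that removes the misprediction penalty. The `if guard` test stands in for instruction nullification and is not itself a fetch-redirecting branch in this analogy.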

Author information

Authors and Affiliations

National Lab for Parallel and Distributed Processing, China
YuXing Tang, Kun Deng, XiaoDong Wang, Yong Dou & XingMing Zhou

Editor information

Editors and Affiliations

Department of Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong, China
Jiannong Cao
L3S Research Center, Leibniz Universität Hannover, Appelstrasse 9a, 30167, Hannover, Germany
Wolfgang Nejdl
Department of Network Engineering, School of Computer Science, National University of Defense Technology, 410073, Changsha, China
Ming Xu

About this paper

Cite this paper

Tang, Y., Deng, K., Wang, X., Dou, Y., Zhou, X. (2005). RIMP: Runtime Implicit Predication. In: Cao, J., Nejdl, W., Xu, M. (eds) Advanced Parallel Processing Technologies. APPT 2005. Lecture Notes in Computer Science, vol 3756. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11573937_10

DOI: https://doi.org/10.1007/11573937_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29639-3
Online ISBN: 978-3-540-32107-1
eBook Packages: Computer Science (R0)