Overflow Controlled SIMD Arithmetic

Zhu, Jiahua; Zhang, Hongjiang; Shi, Hui; Zang, Binyu; Zhu, Chuanqi

doi:10.1007/11532378_30


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3602)

Abstract

Although “SIMD within a register” parallel architectures have existed for almost 10 years, automatic optimizations for them are not yet well developed. Because most optimizations for SIMD architectures are transplanted from traditional vectorization techniques, many special features of SIMD architectures, such as packed operations, have not been thoroughly considered. Since operands are tightly packed within a register, there is no spare space to indicate overflow. To maintain the accuracy of automatically SIMDized programs, the operands must be unpacked to reserve enough space for interim overflow, which introduces considerable overhead. Furthermore, the instructions that handle interim overflows can sometimes prevent other optimizations. In this paper, a new technique, OCSA (overflow controlled SIMD arithmetic), is proposed to reduce the negative effects of interim overflow handling and eliminate the interference of interim overflows. We have applied our algorithm to the Berkeley multimedia benchmarks. The experimental results show that the OCSA algorithm can significantly improve the performance of ADPCM-Decoder (110%), MESA-Reflect (113%) and DJVU-Encoder (106%).
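The interim overflow problem described in the abstract can be illustrated with a small sketch. It is not the OCSA algorithm itself, which this page does not describe. It only shows why packed lanes lose interim results and why unpacking costs extra work. The lane layout (four 8-bit lanes in a 32-bit word), the averaging computation, and all function names are assumptions chosen for illustration, and the packed registers are simulated with plain Python integers.

```python
LANES = 4               # assumed: four 8-bit lanes in one 32-bit word
MASK32 = 0xFFFFFFFF

def pack8(values):
    """Pack four 8-bit values into one 32-bit word, lane 0 in the low byte."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0xFF) << (8 * i)
    return word

def unpack8(word):
    """Split a 32-bit word back into its four 8-bit lanes."""
    return [(word >> (8 * i)) & 0xFF for i in range(LANES)]

def packed_add8(x, y):
    """Lane-wise 8-bit add with wraparound; carries never cross lanes."""
    low = (x & 0x7F7F7F7F) + (y & 0x7F7F7F7F)   # add the low 7 bits of each lane
    return (low ^ ((x ^ y) & 0x80808080)) & MASK32  # fold in the top bit per lane

def average_packed(a, b):
    """(a + b) >> 1 in 8-bit lanes: the interim sum wraps modulo 256."""
    s = packed_add8(pack8(a), pack8(b))
    return [v >> 1 for v in unpack8(s)]

def pack16(pair):
    """Pack two values into one 32-bit word as 16-bit lanes."""
    return (pair[0] & 0xFFFF) | ((pair[1] & 0xFFFF) << 16)

def average_unpacked(a, b):
    """Same computation with operands widened to 16 bits.

    The interim sum (at most 510) now fits in its lane, but each word
    holds only two operands, so twice as many adds are needed.
    """
    out = []
    for i in range(0, LANES, 2):
        s = (pack16(a[i:i + 2]) + pack16(b[i:i + 2])) & MASK32
        out += [((s >> (16 * j)) & 0xFFFF) >> 1 for j in range(2)]
    return out

a = [200, 10, 255, 100]
b = [100, 20, 1, 27]
print(average_packed(a, b))    # lanes whose sum exceeds 255 are wrong
print(average_unpacked(a, b))  # correct, at the cost of two word-adds
```

In this sketch the final averages all fit in 8 bits, yet the packed version still loses them because only the interim sum overflows. That is the situation in which unpacking is needed purely to protect an intermediate value.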

Supported by the National Natural Science Foundation of China under Grant No. 60273046; Shanghai Science and Technology Committee of China Key Project Funding (02JC14013).



Author information

Authors and Affiliations

Computer Science Department, Fudan University, Handan Rd. 220, Shanghai, China
Jiahua Zhu, Hongjiang Zhang, Hui Shi, Binyu Zang & Chuanqi Zhu


Editor information

Editors and Affiliations

School of ECE, Purdue University, 47907, West Lafayette, IN
Rudolf Eigenmann
Department of Computer Science, Purdue University, 47906, West Lafayette, IN, USA
Zhiyuan Li
School of Electrical and Computer Engineering, Purdue University, 47907, West Lafayette, IN, USA
Samuel P. Midkiff


About this paper

Cite this paper

Zhu, J., Zhang, H., Shi, H., Zang, B., Zhu, C. (2005). Overflow Controlled SIMD Arithmetic. In: Eigenmann, R., Li, Z., Midkiff, S.P. (eds) Languages and Compilers for High Performance Computing. LCPC 2004. Lecture Notes in Computer Science, vol 3602. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11532378_30


DOI: https://doi.org/10.1007/11532378_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28009-5
Online ISBN: 978-3-540-31813-2
eBook Packages: Computer Science, Computer Science (R0)
