
Overflow Controlled SIMD Arithmetic

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3602)

Abstract

Although “SIMD within a register” parallel architectures have existed for almost 10 years, automatic optimizations for such architectures are not yet well developed. Because most optimizations for SIMD architectures are transplanted from traditional vectorization techniques, many special features of SIMD architectures, such as packed operations, have not been thoroughly considered. As operands are tightly packed within a register, there is no spare space to indicate overflow. To maintain the accuracy of automatically SIMDized programs, the operands must be unpacked to reserve enough space for interim overflow, which introduces considerable overhead. Furthermore, the instructions that handle interim overflows can sometimes prevent other optimizations. In this paper, a new technique, OCSA (overflow controlled SIMD arithmetic), is proposed to reduce the negative effects of interim overflow handling and to eliminate the interference of interim overflows. We have applied our algorithm to the multimedia benchmarks of Berkeley. The experimental results show that the OCSA algorithm can significantly improve the performance of ADPCM-Decoder (110%), MESA-Reflect (113%) and DJVU-Encoder (106%).
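The interim-overflow problem described in the abstract can be illustrated with a small sketch. This is not the OCSA algorithm itself; it only simulates packed lanes inside a Python integer to show why a tightly packed 8-bit add loses information and why unpacking to wider lanes preserves it. The lane widths (8 and 16 bits), the averaging example, and all helper names (`pack`, `unpack`, `packed_add`, `packed_shift_right_1`, `average_lanes`) are assumptions made for illustration.

```python
LANES = 4  # assumed number of lanes per simulated register


def lane_high_mask(width, lanes=LANES):
    """Mask with only the top bit of every lane set."""
    mask = 0
    for i in range(lanes):
        mask |= 1 << (i * width + width - 1)
    return mask


def pack(values, width):
    """Pack unsigned lane values into one integer, lane 0 in the low bits."""
    lane_mask = (1 << width) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & lane_mask) << (i * width)
    return word


def unpack(word, width, lanes=LANES):
    """Split a packed integer back into a list of lane values."""
    lane_mask = (1 << width) - 1
    return [(word >> (i * width)) & lane_mask for i in range(lanes)]


def packed_add(x, y, width, lanes=LANES):
    """Lane-wise wrapping add: a carry out of a lane is simply discarded."""
    high = lane_high_mask(width, lanes)
    low = ((1 << (lanes * width)) - 1) & ~high
    # Add the low bits of each lane, then fold in the top bits with XOR
    # so that no carry crosses a lane boundary.
    return ((x & low) + (y & low)) ^ ((x ^ y) & high)


def packed_shift_right_1(x, width, lanes=LANES):
    """Lane-wise logical shift right by one bit."""
    high = lane_high_mask(width, lanes)
    # Clearing each lane's top bit drops the bit leaked from the next lane.
    return (x >> 1) & ~high


def average_lanes(a, b, width):
    """Compute floor((a[i] + b[i]) / 2) per lane using the given lane width."""
    total = packed_add(pack(a, width), pack(b, width), width)
    return unpack(packed_shift_right_1(total, width), width)


a = [200, 100, 255, 10]
b = [100, 50, 255, 20]
print(average_lanes(a, b, 8))   # packed 8-bit lanes: interim sums wrap
print(average_lanes(a, b, 16))  # unpacked 16-bit lanes: interim sums fit
```

With 8-bit lanes the interim sums 300 and 510 do not fit, so the first and third averages come out wrong even though every final result would fit in 8 bits. With 16-bit lanes the results are correct, but each register then holds half as many operands, which is the overhead the abstract refers to.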

Supported by the National Natural Science Foundation of China under Grant No. 60273046; Shanghai Science and Technology Committee of China Key Project Funding (02JC14013).

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhu, J., Zhang, H., Shi, H., Zang, B., Zhu, C. (2005). Overflow Controlled SIMD Arithmetic. In: Eigenmann, R., Li, Z., Midkiff, S.P. (eds) Languages and Compilers for High Performance Computing. LCPC 2004. Lecture Notes in Computer Science, vol 3602. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11532378_30

  • DOI: https://doi.org/10.1007/11532378_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28009-5

  • Online ISBN: 978-3-540-31813-2

  • eBook Packages: Computer Science, Computer Science (R0)
