Attaining High Performance in General-Purpose Computations on Current Graphics Processors

  • Conference paper
High Performance Computing for Computational Science - VECPAR 2008 (VECPAR 2008)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5336)

Abstract

The increase in performance of recent generations of graphics processors (GPUs) has made this class of hardware a remarkably successful coprocessing platform for certain types of operations. In this paper we evaluate the performance of linear algebra and image processing routines on classical and unified GPU architectures as well as on traditional processors (CPUs). From this study, we gain insight into the properties that make an algorithm likely to deliver high performance on a GPU.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Igual, F.D., Mayo, R., Quintana-Ortí, E.S. (2008). Attaining High Performance in General-Purpose Computations on Current Graphics Processors. In: Palma, J.M.L.M., Amestoy, P.R., Daydé, M., Mattoso, M., Lopes, J.C. (eds) High Performance Computing for Computational Science - VECPAR 2008. VECPAR 2008. Lecture Notes in Computer Science, vol 5336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92859-1_37

  • DOI: https://doi.org/10.1007/978-3-540-92859-1_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-92858-4

  • Online ISBN: 978-3-540-92859-1

  • eBook Packages: Computer Science, Computer Science (R0)
