A Code-Based Analytical Approach for Using Separate Device Coprocessors in Computing Systems

  • Conference paper
Architecture of Computing Systems - ARCS 2011 (ARCS 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6566)

Abstract

Special hardware accelerators such as FPGAs and GPUs are commonly introduced into a computing system as separate devices. Consequently, the accelerator and the host system do not share a common memory, and offloading data to the additional hardware introduces a communication penalty. Based on a combination of a program’s source code and execution profiling, we perform an analysis that uses arithmetic intensity as a cost function to identify the parts of the program best suited for offloading to the accelerator. The basic principles of this analysis are introduced and tested with a sample application. The concrete results are discussed and evaluated based on the performance of an FPGA-based and a GPU-based implementation.
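As an illustration of how arithmetic intensity can act as a cost function, the following minimal Python sketch ranks candidate code regions by operations per transferred byte. It is not the authors' implementation: the abstract does not give the exact definition used in the paper, so the ratio below, the `Region` record, the function names, and all numbers are assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A candidate code region (hypothetical record, not from the paper)."""
    name: str
    operations: int         # arithmetic operations, e.g. from a profiler
    bytes_transferred: int  # data copied to and from the accelerator


def arithmetic_intensity(region: Region) -> float:
    """Assumed definition: operations per transferred byte."""
    if region.bytes_transferred == 0:
        return float("inf")  # no transfer cost at all
    return region.operations / region.bytes_transferred


def rank_for_offloading(regions: list[Region]) -> list[Region]:
    """Sort regions from highest to lowest arithmetic intensity."""
    return sorted(regions, key=arithmetic_intensity, reverse=True)


# Made-up numbers for illustration only.
candidates = [
    Region("vector_add", operations=1_000_000, bytes_transferred=12_000_000),
    Region("filter_pass", operations=80_000_000, bytes_transferred=4_000_000),
    Region("matrix_mul", operations=2_000_000_000, bytes_transferred=12_000_000),
]
ranked = rank_for_offloading(candidates)
for region in ranked:
    print(f"{region.name}: {arithmetic_intensity(region):.2f} ops/byte")
```

A region with high intensity performs many operations for each byte that crosses the host-device boundary, so the communication penalty is amortized more easily. Regions at the bottom of the ranking are more likely to be dominated by transfer time.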

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hampel, V., Goronzy, G., Maehle, E. (2011). A Code-Based Analytical Approach for Using Separate Device Coprocessors in Computing Systems. In: Berekovic, M., Fornaciari, W., Brinkschulte, U., Silvano, C. (eds) Architecture of Computing Systems - ARCS 2011. ARCS 2011. Lecture Notes in Computer Science, vol 6566. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19137-4_1

  • DOI: https://doi.org/10.1007/978-3-642-19137-4_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19136-7

  • Online ISBN: 978-3-642-19137-4

  • eBook Packages: Computer Science, Computer Science (R0)
