On Performance Analysis of a Multithreaded Application Parallelized by Different Programming Models Using Intel VTune

  • Conference paper
Parallel Computing Technologies (PaCT 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6873)

Abstract

Multi-core processors are ubiquitous. Extracting the desired performance from them requires efficient techniques for partitioning a single piece of work into multiple fine-grained units of work in order to process them simultaneously. Understanding the performance behavior of a parallel system requires a close familiarity with the underlying architecture and the hardware counters.

We present a performance analysis study of a multi-core system using a state-of-the-art parallel performance analyzer, the Intel VTune Performance Analyzer. As a test case we chose a classic nested-loop application that exhibits unexpected performance gains under two different programming models on the same multi-core system. We expected to be able to explain the performance results by exploring the application's behavior with the analyzer. We found, however, that it is very difficult to explain high-level performance measurements of multi-core systems through low-level hardware diagnosis.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Marowka, A. (2011). On Performance Analysis of a Multithreaded Application Parallelized by Different Programming Models Using Intel VTune. In: Malyshkin, V. (eds) Parallel Computing Technologies. PaCT 2011. Lecture Notes in Computer Science, vol 6873. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23178-0_28

  • DOI: https://doi.org/10.1007/978-3-642-23178-0_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23177-3

  • Online ISBN: 978-3-642-23178-0

  • eBook Packages: Computer Science, Computer Science (R0)
