Performance Optimization of a CFD Application on Intel Multicore and Manycore Architectures

  • Conference paper
In: Advanced Computer Architecture

Part of the book series: Communications in Computer and Information Science (CCIS, volume 451)

Abstract

This paper reports our experience optimizing the performance of a high-order, high-accuracy Computational Fluid Dynamics (CFD) application (HOSTA) on a state-of-the-art multicore processor and the emerging Intel Many Integrated Core (MIC) coprocessor. We focus on effective loop vectorization and memory access optimization. A series of techniques, including data structure transformations, procedure inlining, compiler SIMDization, OpenMP loop collapsing, and the use of Huge Pages, are explored. Detailed execution times and event counts from Performance Monitoring Units are measured. The results show that our optimizations improve the performance of HOSTA by 1.61× on a compute node with two Intel Sandy Bridge processors and by 1.97× on an Intel Knights Corner coprocessor, the publicly available MIC product. The microarchitecture-level effects of these optimizations are also discussed.
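To make the listed techniques concrete, the following is a minimal sketch in C of how several of them might combine in one kernel. It is not taken from HOSTA: the `FieldsSoA` type, the `momentum` helper, and the `compute_momentum` kernel are hypothetical names chosen for illustration, and the computation (density times velocity) is a stand-in for a real stencil. The sketch assumes a structure-of-arrays layout as the data structure transformation, a small `static inline` routine as the inlining candidate, `collapse(2)` on the two outer loops, and `omp simd` on the unit-stride inner loop. Huge Pages are a runtime and allocator setting and are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Structure-of-arrays layout (hypothetical): each field is one contiguous
   array, so the innermost i loop touches unit-stride memory. */
typedef struct {
    size_t nx, ny, nz;
    double *rho;  /* density, nx*ny*nz values */
    double *u;    /* one velocity component   */
    double *mom;  /* output: rho * u          */
} FieldsSoA;

/* Small per-point routine; marking it static inline lets the compiler
   fold it into the loop body so the call does not block vectorization. */
static inline double momentum(double rho, double u) { return rho * u; }

/* Allocate the three fields; returns 0 on success, -1 on failure. */
int fields_alloc(FieldsSoA *f, size_t nx, size_t ny, size_t nz) {
    assert(nx > 0 && ny > 0 && nz > 0);
    const size_t n = nx * ny * nz;
    f->nx = nx; f->ny = ny; f->nz = nz;
    f->rho = malloc(n * sizeof *f->rho);
    f->u   = malloc(n * sizeof *f->u);
    f->mom = malloc(n * sizeof *f->mom);
    if (!f->rho || !f->u || !f->mom) {
        free(f->rho); free(f->u); free(f->mom);
        return -1;
    }
    return 0;
}

void fields_free(FieldsSoA *f) {
    free(f->rho); free(f->u); free(f->mom);
}

void compute_momentum(FieldsSoA *f) {
    const size_t nx = f->nx, ny = f->ny, nz = f->nz;
    const double *restrict rho = f->rho;
    const double *restrict u   = f->u;
    double *restrict mom       = f->mom;

    /* collapse(2) merges the k and j loops into one iteration space of
       nz*ny iterations, giving the runtime more work items to spread
       over many threads than nz alone would. */
    #pragma omp parallel for collapse(2)
    for (size_t k = 0; k < nz; ++k) {
        for (size_t j = 0; j < ny; ++j) {
            const size_t base = (k * ny + j) * nx;
            /* Unit-stride inner loop: asks the compiler to vectorize it. */
            #pragma omp simd
            for (size_t i = 0; i < nx; ++i) {
                mom[base + i] = momentum(rho[base + i], u[base + i]);
            }
        }
    }
}
```

Without an OpenMP flag the pragmas are ignored and the kernel runs serially with the same result. Collapsing matters most when the outermost loop alone has fewer iterations than the number of hardware threads, which is a plausible situation on a manycore coprocessor.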

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Che, Y., Zhang, L., Wang, Y., Xu, C., Liu, W., Cheng, X. (2014). Performance Optimization of a CFD Application on Intel Multicore and Manycore Architectures. In: Wu, J., Chen, H., Wang, X. (eds) Advanced Computer Architecture. Communications in Computer and Information Science, vol 451. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44491-7_7

  • DOI: https://doi.org/10.1007/978-3-662-44491-7_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-44490-0

  • Online ISBN: 978-3-662-44491-7

  • eBook Packages: Computer Science (R0)