Porting the Princeton Ocean Model to GPUs

Xu, Shizhen; Huang, Xiaomeng; Zhang, Yan; Hu, Yong; Fu, Haohuan; Yang, Guangwen

doi:10.1007/978-3-319-11197-1_1

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8630)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing

Abstract

While GPUs are becoming a compelling acceleration solution for a range of scientific applications, most existing work on climate models has achieved only limited speedup. This is due to partial porting of the large code bases and the inherently memory-bound nature of these models. In this work, we design and implement a customized GPU-based acceleration of the Princeton Ocean Model (gpuPOM). Targeting Nvidia’s state-of-the-art GPU architectures (K20X and K40m), we rewrite the original model completely from Fortran into CUDA-C. We present several acceleration methods, including optimizing memory access on a single GPU and overlapping communication and boundary operations among multiple GPUs. The experimental results show that, compared with different Intel CPUs, the gpuPOM achieves a 6.9-fold to 17.8-fold speedup on one K40m GPU and a 5.8-fold to 15.5-fold speedup on one K20X GPU. Further experiments on multiple GPUs indicate that the performance of the gpuPOM on a super-workstation containing 4 GPUs is equivalent to that of a powerful cluster consisting of 34 pure CPU nodes with over 400 CPU cores.

Author information

Authors and Affiliations

Ministry of Education Key Laboratory for Earth System Modeling, China
Shizhen Xu, Xiaomeng Huang, Yan Zhang, Yong Hu, Haohuan Fu & Guangwen Yang
Center for Earth System Science, Tsinghua University, 100084, China
Xiaomeng Huang, Yan Zhang, Haohuan Fu & Guangwen Yang
Joint Center for Global Change Studies, Beijing, 100875, China
Shizhen Xu, Xiaomeng Huang, Yan Zhang, Yong Hu, Haohuan Fu & Guangwen Yang

Editor information

Editors and Affiliations

Department of Computer Science, Illinois Institute of Technology, 60616-3793, Chicago, IL, USA
Xian-he Sun
School of Computer Science and Technology, Dalian Maritime University, 1 Linghai Road, 116026, Dalian, China
Wenyu Qu
University of Ottawa, SEECS, 8, King Edward Ave, K1N 6N5, Ottawa, ON, Canada
Ivan Stojmenovic
Deakin University, 221 Burwood Highway, 3125, Burwood, VIC, Australia
Wanlei Zhou
Dalian Maritime University, No. 1 Linghai Road, 116026, Dalian, China
Zhiyang Li & Tingting Yang
BeiHang University, XueYuan Road No. 37, HaiDian District, Beijing, China
Hua Guo
University of Bradford, BD7 1DP, Bradford, West Yorkshire, United Kingdom
Geyong Min
Computer Network Information Center, Chinese Academy of Sciences, 100190, Beijing, China
Yulei Wu
27 Shanda Nanlu, 250100, Jinan City, Shandong Province, China
Lei Liu

About this paper

Cite this paper

Xu, S., Huang, X., Zhang, Y., Hu, Y., Fu, H., Yang, G. (2014). Porting the Princeton Ocean Model to GPUs. In: Sun, Xh., et al. Algorithms and Architectures for Parallel Processing. ICA3PP 2014. Lecture Notes in Computer Science, vol 8630. Springer, Cham. https://doi.org/10.1007/978-3-319-11197-1_1

DOI: https://doi.org/10.1007/978-3-319-11197-1_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11196-4
Online ISBN: 978-3-319-11197-1
eBook Packages: Computer Science, Computer Science (R0)
