Abstract
Molecular dynamics (MD) simulation in process engineering has enormous computational demands and therefore requires efficient parallelization techniques. This chapter describes the parallelization approach of ls1 mardyn for shared-memory and distributed-memory architectures. We first characterize today's computing architectures and their governing design principles: heterogeneity, large numbers of cores, and data parallelism. Based on this, we reengineer ls1 mardyn so that it makes optimal use of important hardware features, and we describe our parallelization approach for shared- and distributed-memory systems using the Intel Xeon processor and the Intel Xeon Phi coprocessor, respectively, as examples. We close the chapter by describing load-balancing techniques for distributed-memory parallelization when particles are distributed heterogeneously across the computational domain.
© 2015 The Author(s)
About this chapter
Cite this chapter
Heinecke, A., Eckhardt, W., Horsch, M., Bungartz, H.-J. (2015). Parallelization of MD Algorithms and Load Balancing. In: Supercomputing for Molecular Dynamics Simulations. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-17148-7_3
DOI: https://doi.org/10.1007/978-3-319-17148-7_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17147-0
Online ISBN: 978-3-319-17148-7
eBook Packages: Computer Science, Computer Science (R0)