
Porting Adaptive Ensemble Molecular Dynamics Workflows to the Summit Supercomputer

  • Conference paper
High Performance Computing (ISC High Performance 2019)

Abstract

Molecular dynamics (MD) simulations must take very small (femtosecond) integration steps in simulation time to avoid numerical errors. Efficient use of parallel programming models and accelerators in state-of-the-art MD programs is now pushing Moore’s limit for time per MD step. As a result, directly simulating timescales beyond milliseconds will not be attainable, even at exascale. However, concepts from statistical physics can be used to combine many parallel simulations to provide information about longer timescales and to adequately sample the simulation space, while preserving details about the dynamics of the system. Implementing such an approach requires a workflow program that allows adaptable steering of task assignments based on extensive statistical analysis of intermediate results. Here we report the implementation of such an adaptable workflow program to drive simulations on the Summit IBM Power System AC922, a pre-exascale supercomputer at the Oak Ridge Leadership Computing Facility (OLCF). We compare our experience with that on Titan, Summit’s predecessor, report the performance of the workflow and its components, and describe the porting process. We find that a workflow program managed by a Mongo database can provide the fault tolerance, scalable performance, task dispatch rate, and reconfigurability required for a robust and portable implementation of ensemble simulations such as those used in enhanced-sampling molecular dynamics. This type of workflow generator can also provide adaptive steering of ensemble simulations for applications other than MD.
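The steering loop described in the abstract (run many short simulations, analyze the intermediate results, then reassign tasks) can be illustrated with a toy sketch. This is not the authors' workflow program: the random walk stands in for an MD task, the counts-based restart rule is one simple adaptive-sampling heuristic chosen for illustration, and every name below (`short_trajectory`, `pick_restarts`, `adaptive_ensemble`, `N_STATES`) is hypothetical.

```python
import random
from collections import Counter

N_STATES = 20  # hypothetical discretization of a 1D coordinate


def short_trajectory(start, n_steps, rng):
    """Toy stand-in for one short MD task: an unbiased random walk on a line."""
    state, visited = start, [start]
    for _ in range(n_steps):
        # Step left or right, clamped to the valid state range.
        state = min(N_STATES - 1, max(0, state + rng.choice((-1, 1))))
        visited.append(state)
    return visited


def pick_restarts(counts, n_tasks):
    """Counts-based steering: restart from the least-visited known states."""
    ranked = sorted(counts, key=lambda s: (counts[s], s))
    return [ranked[i % len(ranked)] for i in range(n_tasks)]


def adaptive_ensemble(n_rounds=30, n_tasks=8, n_steps=25, seed=0):
    """Alternate between an ensemble of short tasks and an analysis step."""
    rng = random.Random(seed)
    counts = Counter()
    starts = [0] * n_tasks  # every task begins in the same state
    for _ in range(n_rounds):
        # In a real workflow these tasks would run in parallel on compute nodes.
        for start in starts:
            counts.update(short_trajectory(start, n_steps, rng))
        # Analysis of intermediate results decides the next round's task list.
        starts = pick_restarts(counts, n_tasks)
    return counts


if __name__ == "__main__":
    counts = adaptive_ensemble()
    print(f"states discovered: {len(counts)} of {N_STATES}")
```

Restarting from rarely visited states pushes the ensemble toward unexplored regions without lengthening any single trajectory, which is the motivation for combining many short simulations instead of running one long one.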

This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).



Acknowledgements

The authors would like to acknowledge Micholas Dean Smith for help with MD test systems preparation, and Shantenu Jha and lab for extensive help with incorporation of the Radical Cybertools software stack. An award of computer time was provided by the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program. This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. ORNL is managed by UT-Battelle, LLC for the US Department of Energy. FN acknowledges the European Commission (ERC CoG 772230) and Deutsche Forschungsgemeinschaft (NO 825/3-1). JCS acknowledges DOE contract ERKP752. CC acknowledges support from the National Science Foundation (CHE-1265929, CHE-1740990, CHE-1900374, and PHY-1427654) and the Welch Foundation (C-1570).

Author information

Corresponding author

Correspondence to Ada Sedova.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ossyra, J., Sedova, A., Tharrington, A., Noé, F., Clementi, C., Smith, J.C. (2019). Porting Adaptive Ensemble Molecular Dynamics Workflows to the Summit Supercomputer. In: Weiland, M., Juckeland, G., Alam, S., Jagode, H. (eds) High Performance Computing. ISC High Performance 2019. Lecture Notes in Computer Science, vol 11887. Springer, Cham. https://doi.org/10.1007/978-3-030-34356-9_30

  • DOI: https://doi.org/10.1007/978-3-030-34356-9_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34355-2

  • Online ISBN: 978-3-030-34356-9

  • eBook Packages: Computer Science (R0)
