Porting Adaptive Ensemble Molecular Dynamics Workflows to the Summit Supercomputer

  • John Ossyra
  • Ada Sedova
  • Arnold Tharrington
  • Frank Noé
  • Cecilia Clementi
  • Jeremy C. Smith
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11887)


Molecular dynamics (MD) simulations must take very small (femtosecond) integration steps in simulation time to avoid numerical errors. Efficient use of parallel programming models and accelerators in state-of-the-art MD programs is now pushing Moore’s limit for time per MD step. As a result, directly simulating timescales beyond milliseconds will not be attainable, even at exascale. However, concepts from statistical physics can be used to combine many parallel simulations to provide information about longer timescales and to adequately sample the simulation space, while preserving details about the dynamics of the system. Implementing such an approach requires a workflow program that allows adaptable steering of task assignments based on extensive statistical analysis of intermediate results. Here we report the implementation of such an adaptable workflow program to drive simulations on the Summit IBM Power System AC922, a pre-exascale supercomputer at the Oak Ridge Leadership Computing Facility (OLCF). We compare our results with experiences on Titan, Summit’s predecessor, report the performance of the workflow and its components, and describe the porting process. We find that a workflow program managed by a Mongo database can provide the fault tolerance, scalable performance, task dispatch rate, and reconfigurability required for robust and portable implementation of ensemble simulations such as those used in enhanced-sampling molecular dynamics. This type of workflow generator can also provide adaptive steering of ensemble simulations for applications other than MD.
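The steering loop described in the abstract (run many short simulations, analyze the intermediate results, then reassign tasks) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the "simulation" is a 1D random walk on integer states, the analysis is a simple visit count, and the restart rule is a least-counts selection. The function names `run_short_trajectory`, `pick_restart_states`, and `adaptive_ensemble` are hypothetical, and no database or task dispatcher is modeled.

```python
import random
from collections import Counter


def run_short_trajectory(start, n_steps, rng):
    """Toy stand-in for one MD task: a 1D random walk on integer states."""
    state = start
    visited = [state]
    for _ in range(n_steps):
        state += rng.choice((-1, 1))
        visited.append(state)
    return visited


def pick_restart_states(counts, n_tasks):
    """Least-counts rule (an assumption): restart from the least-visited states.

    Ties are broken by state index so the choice is deterministic.
    """
    ranked = sorted(counts, key=lambda s: (counts[s], s))
    return [ranked[i % len(ranked)] for i in range(n_tasks)]


def adaptive_ensemble(n_rounds, n_tasks, n_steps, seed=0):
    """Alternate between an ensemble of short runs and an analysis step."""
    rng = random.Random(seed)
    counts = Counter()
    starts = [0] * n_tasks  # every task in the first round starts at state 0
    for _ in range(n_rounds):
        # A real workflow would dispatch these tasks in parallel;
        # here they run one after another for simplicity.
        for start in starts:
            counts.update(run_short_trajectory(start, n_steps, rng))
        # The analysis of intermediate results steers the next round.
        starts = pick_restart_states(counts, n_tasks)
    return counts


if __name__ == "__main__":
    visits = adaptive_ensemble(n_rounds=3, n_tasks=4, n_steps=10)
    print(len(visits), "distinct states visited")
```

In this sketch the restart rule is the only piece that would change when swapping in a different adaptive sampling strategy; the surrounding loop stays the same.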


High Performance Computing · Molecular dynamics · Scientific workflows · Adaptive sampling



The authors would like to acknowledge Micholas Dean Smith for help with preparing the MD test systems, and Shantenu Jha and his lab for extensive help with incorporating the RADICAL-Cybertools software stack. An award of computer time was provided by the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program. This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. ORNL is managed by UT-Battelle, LLC for the US Department of Energy. FN acknowledges the European Commission (ERC CoG 772230) and the Deutsche Forschungsgemeinschaft (NO 825/3-1). JCS acknowledges DOE contract ERKP752. CC acknowledges support from the National Science Foundation (CHE-1265929, CHE-1740990, CHE-1900374, and PHY-1427654) and the Welch Foundation (C-1570).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of Tennessee, Knoxville, USA
  2. Oak Ridge National Laboratory, Oak Ridge, USA
  3. Freie Universität Berlin, Berlin, Germany
  4. Rice University, Houston, USA
  5. Oak Ridge National Laboratory, University of Tennessee, Oak Ridge, USA
