Highly Interactive, Steered Scientific Workflows on HPC Systems: Optimizing Design Solutions

  • Conference paper
High Performance Computing (ISC High Performance 2019)

Abstract

Scientific workflows are becoming increasingly important in high performance computing (HPC) settings, as growing hardware capabilities make it both feasible and attractive to run many heterogeneous tasks simultaneously. Currently, no HPC-based workflow platform supports a dynamically adaptable workflow with interactive steering and analysis at run time. Furthermore, most workflow programs fix the compute resources for a given instance, which can waste expensive allocation resources as tasks are spawned and killed. Here we describe the design and testing of a run-time-interactive, adaptable, steered workflow tool that can execute thousands of parallel tasks without an MPI programming model, using a database management system to facilitate task management through multiple live connections. We find that on Summit, the pre-exascale supercomputer at the Oak Ridge Leadership Computing Facility, it is possible to launch and interactively steer workflows with thousands of simultaneous tasks with negligible latency. For particle simulation and analysis tasks that run for minutes to hours, this paradigm offers the prospect of a robust and efficient means of exploring simulation space with on-the-fly analysis and adaptation.
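To make the database-mediated design concrete, the following is a minimal sketch of a task table shared by several live connections: one connection acts as the interactive steering client and another as a worker. It is not the authors' implementation. The abstract does not name the database system, so Python's built-in sqlite3 module is used here purely as a stand-in, and the schema, state names, and function names (connect, add_task, claim_task, cancel_pending, finish_task) are all hypothetical.

```python
import os
import sqlite3
import tempfile

# Hypothetical schema: one row per task, with a state field the steering
# client and the workers both read and update.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tasks (
    id      INTEGER PRIMARY KEY,
    command TEXT NOT NULL,
    state   TEXT NOT NULL DEFAULT 'pending',
    worker  TEXT
)
"""

def connect(path):
    """Open one live connection to the shared task database."""
    conn = sqlite3.connect(path, timeout=30)
    conn.execute(SCHEMA)
    conn.commit()
    return conn

def add_task(conn, command):
    """Steering side: enqueue a new task while the workflow is running."""
    cur = conn.execute("INSERT INTO tasks (command) VALUES (?)", (command,))
    conn.commit()
    return cur.lastrowid

def claim_task(conn, worker):
    """Worker side: claim the oldest pending task, or return None."""
    while True:
        row = conn.execute(
            "SELECT id, command FROM tasks "
            "WHERE state = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        with conn:  # commit the conditional update as one transaction
            cur = conn.execute(
                "UPDATE tasks SET state = 'running', worker = ? "
                "WHERE id = ? AND state = 'pending'",
                (worker, row[0]),
            )
        if cur.rowcount == 1:  # nobody else claimed it first
            return row

def cancel_pending(conn, pattern):
    """Steering side: cancel pending tasks whose command matches a LIKE pattern."""
    with conn:
        cur = conn.execute(
            "UPDATE tasks SET state = 'cancelled' "
            "WHERE state = 'pending' AND command LIKE ?",
            (pattern,),
        )
    return cur.rowcount

def finish_task(conn, task_id):
    """Worker side: mark a claimed task as done."""
    with conn:
        conn.execute("UPDATE tasks SET state = 'done' WHERE id = ?", (task_id,))

def demo():
    """Run a tiny steered workflow over two connections to one database file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tasks.db")
        steering, worker = connect(path), connect(path)
        for i in range(4):
            add_task(steering, f"simulate --replica {i}")
        first = claim_task(worker, "worker-0")
        cancelled = cancel_pending(steering, "%--replica 3")
        finish_task(worker, first[0])
        counts = dict(
            steering.execute("SELECT state, COUNT(*) FROM tasks GROUP BY state")
        )
        steering.close()
        worker.close()
        return first, cancelled, counts

if __name__ == "__main__":
    print(demo())
```

In this sketch the workers never communicate with each other; all coordination goes through the database, which is why no MPI programming model is needed. The conditional UPDATE in claim_task is one simple way to keep two workers from claiming the same task. A production tool on a system such as Summit would likely use a networked database server rather than a local file.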

This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Acknowledgements

An award of computer time was provided by the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program. This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. JCS acknowledges ORNL LDRD funds. The authors would like to thank Oscar Hernandez, Frank Noé and group, Cecilia Clementi and group, and Shantenu Jha and group for valuable insight and discussions.

Author information

Corresponding author

Correspondence to Ada Sedova.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ossyra, J.R., Sedova, A., Baker, M.B., Smith, J.C. (2019). Highly Interactive, Steered Scientific Workflows on HPC Systems: Optimizing Design Solutions. In: Weiland, M., Juckeland, G., Alam, S., Jagode, H. (eds) High Performance Computing. ISC High Performance 2019. Lecture Notes in Computer Science, vol 11887. Springer, Cham. https://doi.org/10.1007/978-3-030-34356-9_39

  • DOI: https://doi.org/10.1007/978-3-030-34356-9_39

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34355-2

  • Online ISBN: 978-3-030-34356-9

  • eBook Packages: Computer Science, Computer Science (R0)