A Scalable Process-Management Environment for Parallel Programs

  • Conference paper
Recent Advances in Parallel Virtual Machine and Message Passing Interface (EuroPVM/MPI 2000)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1908)

Abstract

We present a process management system for parallel programs such as those written using MPI. A primary goal of the system, which we call MPD (for multipurpose daemon), is to be scalable. By this we mean that startup of interactive parallel jobs comprising a thousand processes is quick, that signals can be quickly delivered to processes, and that stdin, stdout, and stderr are managed intuitively. Our primary target is parallel machines made up of clusters of SMPs, but the system is also useful in more tightly integrated environments. We describe how MPD enables much faster startup and better runtime management of MPICH jobs. We show how close control of stdio can support the easy implementation of a number of convenient system utilities, even a parallel debugger. MPD is implemented and freely distributed with MPICH.

This work was supported by the Mathematical, Information, and Computational Sciences Division subprogram of the Office of Advanced Scientific Computing Research, U.S. Department of Energy, under Contract W-31-109-Eng-38.

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Butler, R., Gropp, W., Lusk, E. (2000). A Scalable Process-Management Environment for Parallel Programs. In: Dongarra, J., Kacsuk, P., Podhorszki, N. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2000. Lecture Notes in Computer Science, vol 1908. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45255-9_25

  • DOI: https://doi.org/10.1007/3-540-45255-9_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41010-2

  • Online ISBN: 978-3-540-45255-3

  • eBook Packages: Springer Book Archive
