An Improved Ganglia-Like Clusters Monitoring System

  • Conference paper
Grid and Cooperative Computing (GCC 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3033)

Abstract

Ganglia [1] is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. We propose an improved Ganglia-like cluster monitoring system that is more reliable when a federation node or its associated links fail; restricts access to some monitoring data by permission; adds control functions such as restarting or shutting down confused processes; sends an email or pager message to the cluster administrator when an important event occurs; and optionally selects which data to forward to the federation node, based on user policy, in order to speed up WAN access. We have implemented a prototype system.
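
The policy-based selection and event notification described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Metric` type, the function names, and the metric names and thresholds are all hypothetical, chosen only to show the idea of forwarding a policy-selected subset of data to the federation node and flagging important events for the administrator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One monitoring sample (hypothetical structure, not Ganglia's wire format)."""
    host: str
    name: str
    value: float


def select_for_federation(metrics, allowed_names):
    """Keep only the metrics that the user policy allows to be forwarded.

    Forwarding a subset reduces the volume of data sent over the WAN.
    """
    return [m for m in metrics if m.name in allowed_names]


def events_to_report(metrics, thresholds):
    """Return metrics whose value exceeds the configured threshold.

    A real system would hand these to an email or pager notifier.
    """
    return [
        m for m in metrics
        if m.name in thresholds and m.value > thresholds[m.name]
    ]
```

In this sketch the policy is just a set of metric names and a dictionary of thresholds; how the prototype actually expresses user policy is not stated in the abstract.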

This research was supported by Guangdong Key Laboratory of Computer Network under grant 2002B60113.


References

  1. Massie, M.L., Chun, B.N., Culler, D.E.: The Ganglia Distributed Monitoring System: Design, Implementation, and Experience (February 2003) (submitted for publication)

  2. The TeraGrid Project. Teragrid project web page (2001), http://www.teragrid.org

  3. Foster, I., Kesselman, C.: Globus: A metacomputing infrastructure toolkit. International Journal of Supercomputer Applications 11(2), 115–128 (1997)

  4. Sottile, M., Minnich, R.: Supermon: A high speed cluster monitoring system. In: Proceedings of Cluster (September 2002)

  5. Anderson, E., Patterson, D.: Extensible, scalable monitoring for clusters of computers. In: Proceedings of the 11th Systems Administration Conference (October 1997)

  6. Amir, E., McCanne, S., Katz, R.H.: An active service framework and its application to realtime multimedia transcoding. In: Proceedings of the ACM SIGCOMM 1998 Conference on Communications Architectures and Protocols, pp. 178–189 (1998)

  7. Chun, B.N., Culler, D.E.: Rexec: A decentralized, secure remote execution environment for clusters. In: Proceedings of the 4th Workshop on Communication, Architecture and Applications for Network based Parallel Computing (January 2000)

  8. Harary, F.: Graph Theory. Addison-Wesley, Reading (1969)

  9. Peterson, L., Culler, D., Anderson, T., Roscoe, T.: A blueprint for introducing disruptive technology into the internet. In: Proceedings of the 1st Workshop on Hot Topics in Networks, HotNets-I (October 2002)

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wei, W., Dong, S., Zhang, L., Liang, Z. (2004). An Improved Ganglia-Like Clusters Monitoring System. In: Li, M., Sun, XH., Deng, Q., Ni, J. (eds) Grid and Cooperative Computing. GCC 2003. Lecture Notes in Computer Science, vol 3033. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24680-0_12

  • DOI: https://doi.org/10.1007/978-3-540-24680-0_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21993-4

  • Online ISBN: 978-3-540-24680-0

  • eBook Packages: Springer Book Archive
