Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1734))

Abstract

When planning the purchase of a compute cluster, much thought usually goes into the choice of compute nodes, interconnects, switches, and, to a lesser extent, the operating system and system software. Important software tools for system configuration, user administration, fault tolerance, debugging, and monitoring are often overlooked. In small systems this matters little, but the lack of suitable software tools can become a nightmare when operating compute clusters for a large, diverse user community. The following three chapters deal with such tools.

In Chapter 24, researchers from Technische Universität München (TUM) present a network monitoring tool that has been implemented in the context of their SMiLE project. With the data obtained from a hardware monitor on their own adapter card (see Chapter 4), the TUM researchers have implemented an infrastructure for the evaluation and controlled deterministic execution of hardware-supported distributed shared memory architectures.

Based on the Dolphin PCI adapter cards, researchers from the University of Paderborn have developed a simple but powerful software tool that allows the user to observe the utilization of processors and the network. The software monitor presented in Chapter 25 is intended for administrators to trace the system status and for users to debug and tune their applications. In contrast to the TUM project above, this monitor does not actively influence the application.

Finally, Chapter 26 addresses the important issue of operating large SCI clusters as general-purpose compute servers in a multi-user environment. The authors from Paderborn present the architecture of their Computer Center Software (CCS), which provides mechanisms for system partitioning, job scheduling, and user access management. With CCS, an SCI cluster is no longer seen as a collection of machines, but rather as a dedicated high-performance computer. Hence, the focus of CCS is on supporting parallel high-performance applications rather than throughput computing (which is the prevalent operation mode for LAN clusters).

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Hellwagner, H., Reinefeld, A. (1999). Tools for SCI Clusters. In: Hellwagner, H., Reinefeld, A. (eds) SCI: Scalable Coherent Interface. Lecture Notes in Computer Science, vol 1734. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10704208_31

  • DOI: https://doi.org/10.1007/10704208_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66696-7

  • Online ISBN: 978-3-540-47048-9

  • eBook Packages: Springer Book Archive
