Risk Assessment in Grid Computing

Carlsson, Christer; Fullér, Robert

doi:10.1007/978-3-642-22642-7_7

Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 270)

Abstract

The increasing demand for computing power in scientific and engineering applications has motivated the deployment of high-performance computing (HPC) systems that deliver tera-scale performance. Current and future HPC systems capable of running large-scale parallel applications may span hundreds of thousands of nodes.

In 2006 the highest processor count was 131K nodes according to top500.org [282]. For parallel programs, the failure probability of nodes, and of the computing tasks assigned to them, has been shown to increase significantly as the number of nodes grows. Large-scale computing environments, such as the current grids CERN LCG, NorduGrid, TeraGrid and Grid’5000, gather thousands or tens of thousands of resources for the use of an ever-growing scientific community. Many of these Grids offer computing resources grouped in clusters, whose owners may share them only for limited periods of time. Grids also have the problems of any large-scale computing environment, compounded by middleware that is still relatively immature, which contributes to making Grids relatively unreliable computing platforms. Long et al. [237] collected a dataset on node failures over 11 months from 1139 workstations on the Internet to determine their uptime intervals. Plank and Elwasif [277] collected failure information for a collection of 16 DEC Alpha workstations at Princeton University; this network is smaller and is a typical local cluster of homogeneous processors. The failure data was collected over 7 months and shows characteristics similar to those of the larger clusters.
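The claim that failure probability grows with node count can be illustrated with a minimal sketch. It assumes that nodes fail independently with the same per-node probability, which is a simplification introduced here for illustration and not a model taken from the chapter. The function name `job_failure_probability` and the per-node value `1e-4` are hypothetical; the node counts echo the figures quoted above.

```python
def job_failure_probability(p_node: float, n_nodes: int) -> float:
    """Probability that at least one of n_nodes fails during a run.

    Illustrative assumption (not from the chapter): node failures are
    independent and share the same per-node failure probability p_node.
    """
    if not 0.0 <= p_node <= 1.0:
        raise ValueError("p_node must lie in [0, 1]")
    if n_nodes < 0:
        raise ValueError("n_nodes must be non-negative")
    # The job survives only if every node survives: (1 - p)^n.
    return 1.0 - (1.0 - p_node) ** n_nodes


if __name__ == "__main__":
    # Hypothetical per-node failure probability for one job run.
    p = 1e-4
    # 16 and 1139 are the workstation counts cited above; 131_000
    # approximates the 2006 maximum of 131K nodes.
    for n in (16, 1139, 131_000):
        print(f"{n:>7} nodes -> {job_failure_probability(p, n):.4f}")
```

Under this assumption, the probability that a job touching every node sees at least one failure rises monotonically with the number of nodes, even when each node on its own is very reliable.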

Cite this chapter

Carlsson, C., Fullér, R. (2011). Risk Assessment in Grid Computing. In: Possibility for Decision. Studies in Fuzziness and Soft Computing, vol 270. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22642-7_7

DOI: https://doi.org/10.1007/978-3-642-22642-7_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22641-0
Online ISBN: 978-3-642-22642-7
eBook Packages: Engineering, Engineering (R0)
