XtremWeb: Building an Experimental Platform for Global Computing
Global Computing achieves highly distributed computations by harvesting a very large number of unused computing resources connected to the Internet. Although the basic techniques for Global Computing are well understood, several issues remain unaddressed, such as the ability to run a large variety of applications, economic models for resource management, performance models accounting for WAN and machine components, and new parallel algorithms based on true massive parallelism with very limited, if any, communication capability. The main purpose of XtremWeb is to build a platform to explore the potential of Global Computing. This paper presents the design decisions of the first implementation of XtremWeb. We also present some early performance measurements, mostly to highlight that even some basic performance features are not yet well understood.
Keywords: Java Virtual Machine, Mean Time Between Failure, Remote Method Invocation, Pierre Auger Observatory, Pull Model