Abstract
Desktop Grids are composed of several thousand resources and are characterized by high resource volatility, caused by voluntary disconnections or failures. This volatility can prevent applications from running to completion. PastryGrid is a system that manages desktop grid resources and user applications over a fully decentralized P2P network. In this paper we present PastryGridCP, a checkpoint-based rollback-recovery protocol designed for PastryGrid. It provides fault tolerance for grid applications and ensures that their execution terminates, transparently to users. We conducted experiments on 110 nodes of Grid’5000. The results validate our protocol and show that it improves application performance.
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Abbes, H., Louati, T. (2013). PastryGridCP: A Decentralized Rollback-Recovery Protocol for Desktop Grid Systems. In: Kołodziej, J., Di Martino, B., Talia, D., Xiong, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2013. Lecture Notes in Computer Science, vol 8285. Springer, Cham. https://doi.org/10.1007/978-3-319-03859-9_11
DOI: https://doi.org/10.1007/978-3-319-03859-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03858-2
Online ISBN: 978-3-319-03859-9
eBook Packages: Computer Science (R0)