Abstract
Although cloud computing offers highly dynamic resource provisioning, support for managing dynamic adaptations of replicated services in the cloud is generally lacking, and where such support exists, it focuses mainly on elasticity through horizontal scalability. We analyse the benefits a replicated service may obtain from dynamic adaptations in the cloud and the requirements these adaptations place on the replication system. For example, adaptation can increase or decrease the capacity of a service, move service replicas closer to their clients, introduce diversity into the replication (for resilience), recover compromised replicas, or rejuvenate ageing replicas. We introduce FITCH, a novel infrastructure to support dynamic adaptation of replicated services in cloud environments. Two prototype services validate this architecture: a crash fault-tolerant Web service and a Byzantine fault-tolerant key-value store based on state machine replication.
Copyright information
© 2013 IFIP International Federation for Information Processing
About this paper
Cite this paper
Cogo, V.V., Nogueira, A., Sousa, J., Pasin, M., Reiser, H.P., Bessani, A. (2013). FITCH: Supporting Adaptive Replicated Services in the Cloud. In: Dowling, J., Taïani, F. (eds) Distributed Applications and Interoperable Systems. DAIS 2013. Lecture Notes in Computer Science, vol 7891. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38541-4_2
DOI: https://doi.org/10.1007/978-3-642-38541-4_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38540-7
Online ISBN: 978-3-642-38541-4
eBook Packages: Computer Science, Computer Science (R0)