Online Recovery in Parallel Database Systems

Jiménez-Peris, Ricardo

doi:10.1007/978-1-4614-8265-9_1089

Reference work entry
First Online: 01 January 2018

Synonyms

Continuous availability; High availability; 24×7 operation

Definition

Replication (also known as clustering) is a technique for providing high availability in parallel and distributed databases. High availability aims to provide continuous service operation, and it has two facets. On the one hand, it provides fault tolerance by introducing redundancy in the form of replication, that is, by keeping multiple copies (replicas) of the data at different sites. On the other hand, since the sites holding the replicas may crash or otherwise fail, failed or new replicas must be reintroduced into the system in order to maintain a given degree of availability. Introducing a new replica requires transferring the current state to it in a consistent fashion, a process known as recovery. A simple solution to this problem is offline recovery: to obtain a quiescent state, request processing is suspended, and then the state is transferred from a working replica (termed the recoverer replica) to the new...
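
The offline recovery idea described above (suspend request processing, then copy the state from a recoverer replica) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Replica` and `ReplicatedStore` classes, the in-memory dictionary state, and the use of a single lock to stand in for suspending request processing are hypothetical and do not come from the entry. Resuming processing after the transfer is also an assumption of the sketch, since the preview text is cut off at that point.

```python
import copy
import threading


class Replica:
    """Hypothetical in-memory replica holding a key-value state."""

    def __init__(self, name):
        self.name = name
        self.state = {}

    def apply(self, key, value):
        self.state[key] = value


class ReplicatedStore:
    """Toy replicated store: every write is applied to all active replicas."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        # A single lock stands in for "suspending request processing".
        self._lock = threading.Lock()

    def write(self, key, value):
        with self._lock:
            for replica in self.replicas:
                replica.apply(key, value)

    def offline_recover(self, new_replica):
        # Holding the lock blocks all writes, so the state is quiescent.
        with self._lock:
            recoverer = self.replicas[0]  # any working replica will do
            new_replica.state = copy.deepcopy(recoverer.state)
            self.replicas.append(new_replica)
        # Assumption: request processing resumes once the lock is released.


store = ReplicatedStore([Replica("r1"), Replica("r2")])
store.write("x", 1)
store.write("y", 2)

r3 = Replica("r3")
store.offline_recover(r3)
store.write("z", 3)  # the recovered replica now receives new writes too
```

The cost of this approach is visible in the sketch: while `offline_recover` holds the lock, no write can proceed, so availability is lost for the whole duration of the state transfer.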

Author information

Authors and Affiliations

Distributed Systems Lab, Universidad Politécnica de Madrid, Madrid, Spain
Ricardo Jiménez-Peris

Corresponding author

Correspondence to Ricardo Jiménez-Peris.

Editor information

Editors and Affiliations

Georgia Institute of Technology College of Computing, Atlanta, GA, USA
Ling Liu
University of Waterloo School of Computer Science, Waterloo, ON, Canada
M. Tamer Özsu

Section Editor information

INRIA, LINA, Nantes, France
Patrick Valduriez

About this entry

Cite this entry

Jiménez-Peris, R. (2018). Online Recovery in Parallel Database Systems. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_1089

DOI: https://doi.org/10.1007/978-1-4614-8265-9_1089
Published: 07 December 2018
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering