Proposal on Network-Wide Rollback Scheme for Fast Recovery from Operator Errors

Yoshihara, Kiyohito; Arai, Daisuke; Idoue, Akira; Horiuchi, Hiroki

doi:10.1007/978-3-540-75694-1_20

Part of the book series: Lecture Notes in Computer Science (LNCCN, volume 4785)

Included in the following conference series:

International Workshop on Distributed Systems: Operations and Management

Abstract

This paper proposes a new network-wide rollback scheme for fast recovery from operator errors, with the aim of high availability of networks and services. Operators are the major cause of failure: in a typical management task they manipulate one or more diverse devices and services, because these depend on one another across the network. The lack of systems or tools that fully address this issue motivated us to develop a new scheme. The underlying idea is that any operational device or service shows identical observable behavior whenever the same settings are configured. High availability can thus be achieved by rolling settings that may cause an abnormal state through an operator error back to past settings under which devices and services were stable. We identify policies for the network-wide rollback and present a prototype implementation and preliminary results.
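
The underlying idea in the abstract can be illustrated with a minimal sketch. It assumes that settings for every device and service are recorded together as one snapshot and that each snapshot is marked stable or not. The names `Snapshot`, `RollbackStore`, `record`, `last_stable` and `rollback`, and the device names and setting keys, are hypothetical and do not come from the paper. The sketch also does not model the rollback policies the paper identifies, which the abstract does not describe.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical type for illustration: device name -> {setting key -> value}.
Settings = Dict[str, Dict[str, str]]


@dataclass
class Snapshot:
    settings: Settings
    stable: bool  # True if devices and services were observed to be stable


@dataclass
class RollbackStore:
    history: List[Snapshot] = field(default_factory=list)

    def record(self, settings: Settings, stable: bool) -> None:
        # Copy the per-device dicts so later edits do not alter the history.
        copied = {dev: dict(cfg) for dev, cfg in settings.items()}
        self.history.append(Snapshot(copied, stable))

    def last_stable(self) -> Optional[Snapshot]:
        # Search from the newest snapshot back to the oldest.
        for snap in reversed(self.history):
            if snap.stable:
                return snap
        return None

    def rollback(self, current: Settings) -> Settings:
        # Return the settings of the most recent stable snapshot for all
        # devices at once; if no stable snapshot exists, keep the current ones.
        snap = self.last_stable()
        if snap is None:
            return current
        return {dev: dict(cfg) for dev, cfg in snap.settings.items()}


store = RollbackStore()
store.record(
    {"router1": {"mtu": "1500"}, "dns1": {"forwarder": "10.0.0.1"}},
    stable=True,
)

# Assumed operator error that touches two dependent devices in one task.
broken = {"router1": {"mtu": "9000"}, "dns1": {"forwarder": "10.0.0.99"}}
store.record(broken, stable=False)

restored = store.rollback(broken)
print(restored)
```

Rolling back the whole snapshot, rather than one device at a time, reflects the network-wide dependency the abstract mentions: settings that were stable together are restored together.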

Author information

Authors and Affiliations

KDDI R&D Laboratories Inc., 2-1-15 Ohara Fujimino-shi, Saitama 356-8502, Japan
Kiyohito Yoshihara, Daisuke Arai, Akira Idoue & Hiroki Horiuchi

Editor information

Alexander Clemm, Lisandro Zambenedetti Granville, Rolf Stadler

About this paper

Cite this paper

Yoshihara, K., Arai, D., Idoue, A., Horiuchi, H. (2007). Proposal on Network-Wide Rollback Scheme for Fast Recovery from Operator Errors. In: Clemm, A., Granville, L.Z., Stadler, R. (eds) Managing Virtualization of Networks and Services. DSOM 2007. Lecture Notes in Computer Science, vol 4785. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75694-1_20

DOI: https://doi.org/10.1007/978-3-540-75694-1_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75693-4
Online ISBN: 978-3-540-75694-1
eBook Packages: Computer Science, Computer Science (R0)
