Abstract
Checkpointing support allows program execution to roll back to an earlier program point, discarding any modifications made since that point. Existing software-based checkpointing methods are mainly libraries that snapshot all of working memory, and hence have prohibitive overhead for many potential applications. In this paper we present a lightweight, fine-grain checkpointing framework implemented entirely in software through compiler transformations and optimizations. A programmer can specify arbitrary checkpoint regions via a simple API, and the compiler automatically transforms the code to implement the checkpoint at the granularity of individual stores, optimizing to remove redundancy. We explore two application areas for this support. First, we investigate its application to debugging, in particular by providing the ability to rewind to an arbitrarily placed point in a buggy program’s execution. A study using BugBench applications shows that our compiler-based approach incurs more than 100x less overhead than full-process checkpointing. Second, we demonstrate that compiler-based checkpointing support can free the programmer from manually implementing and maintaining software rollback mechanisms when coding a backtracking algorithm, with a runtime overhead of only 15% compared to the manual implementation.
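To make the idea of checkpointing at the granularity of individual stores concrete, the following is a minimal sketch of an undo log in C. It is not the paper's API or implementation: the names `ckpt_begin`, `ckpt_store`, `ckpt_rollback`, and `LOG_CAPACITY` are hypothetical, the log handles only `int` locations, and the instrumentation that a compiler would insert automatically is written out by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size undo log; all names here are illustrative only. */
#define LOG_CAPACITY 1024

struct undo_entry {
    int *addr; /* location that was overwritten */
    int  old;  /* value it held before the store */
};

static struct undo_entry undo_log[LOG_CAPACITY];
static size_t undo_len = 0;

/* Begin a checkpoint region: forget any previously logged stores. */
void ckpt_begin(void) {
    undo_len = 0;
}

/* Instrumented store: record the old value, then perform the write. */
void ckpt_store(int *addr, int value) {
    assert(undo_len < LOG_CAPACITY);
    undo_log[undo_len].addr = addr;
    undo_log[undo_len].old  = *addr;
    undo_len++;
    *addr = value;
}

/* Roll back to the checkpoint: undo logged stores in reverse order. */
void ckpt_rollback(void) {
    while (undo_len > 0) {
        undo_len--;
        *undo_log[undo_len].addr = undo_log[undo_len].old;
    }
}
```

In this sketch, calling `ckpt_begin()`, then storing to two variables through `ckpt_store`, then calling `ckpt_rollback()` restores both variables to their values at the checkpoint. A second store to an address that is already in the log adds an entry that rollback does not strictly need, since only the first old value matters. That is one kind of redundancy an optimizing approach could try to eliminate.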
Keywords
- Transactional Memory
- Compiler Optimization
- Checkpoint Region
- Software Transactional Memory
- Redundancy Rate
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhao, C., Steffan, J.G., Amza, C., Kielstra, A. (2012). Compiler Support for Fine-Grain Software-Only Checkpointing. In: O’Boyle, M. (eds) Compiler Construction. CC 2012. Lecture Notes in Computer Science, vol 7210. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28652-0_11
DOI: https://doi.org/10.1007/978-3-642-28652-0_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28651-3
Online ISBN: 978-3-642-28652-0
eBook Packages: Computer Science, Computer Science (R0)