Exception Handling in the Choices Operating System

  • Francis M. David
  • Jeffrey C. Carlyle
  • Ellick M. Chan
  • David K. Raila
  • Roy H. Campbell
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4119)


Exception handling is a powerful abstraction that can be used to help manage errors and support the construction of reliable operating systems. Using exceptions to notify system components about exceptional conditions also reduces coupling of error handling code and increases the modularity of the system. We explore the benefits of incorporating exception handling into the Choices operating system in order to improve reliability. We extend the set of exceptional error conditions in the kernel to include critical kernel errors such as invalid memory access and undefined instructions by wrapping them with language-based software exceptions. This allows developers to handle both hardware and software exceptions in a simple and unified manner through the use of an exception hierarchy. We also describe a catch-rethrow approach for exception propagation across protection domains. When an exception is caught by the system, generic recovery techniques like policy-driven micro-reboots and restartable processes are applied, thus increasing the reliability of the system.
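
The two mechanisms named above, an exception hierarchy that covers hardware faults and a catch-rethrow step at a protection domain boundary, can be illustrated with a minimal C++ sketch. It is not the Choices implementation: every name in it (KernelException, HardwareException, Trap, raiseForTrap, ExceptionRecord, runInDomain, rethrowIfAny, describeFault) is hypothetical, the trap is simulated by an ordinary function call, and the "boundary" is simulated by a noexcept function that lets only a plain record cross.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// All names below are hypothetical; this is not the Choices class hierarchy.

// Root of the illustrative hierarchy, shared by software and hardware errors.
struct KernelException : std::runtime_error {
    explicit KernelException(const std::string& what) : std::runtime_error(what) {}
};

// Errors that originate from a processor trap.
struct HardwareException : KernelException {
    explicit HardwareException(const std::string& what) : KernelException(what) {}
};

struct InvalidMemoryAccess : HardwareException {
    explicit InvalidMemoryAccess(unsigned long addr)
        : HardwareException("invalid memory access"), address(addr) {}
    unsigned long address;  // faulting address reported by the trap
};

struct UndefinedInstruction : HardwareException {
    UndefinedInstruction() : HardwareException("undefined instruction") {}
};

// Stand-in for the trap codes a low-level handler would receive.
enum class Trap { DataAbort, UndefinedInsn };

// Wraps a (simulated) hardware trap in a language-level exception.
[[noreturn]] void raiseForTrap(Trap trap, unsigned long faultAddress) {
    switch (trap) {
        case Trap::DataAbort:     throw InvalidMemoryAccess(faultAddress);
        case Trap::UndefinedInsn: throw UndefinedInstruction();
    }
    throw KernelException("unknown trap");
}

// Plain record that can cross a domain boundary without stack unwinding.
struct ExceptionRecord {
    enum Kind { None, MemoryAccess, UndefinedInsn, Other } kind = None;
    unsigned long address = 0;
};

// Callee side of the boundary: catch everything and record what was thrown.
template <typename Op>
ExceptionRecord runInDomain(Op op) noexcept {
    ExceptionRecord rec;
    try {
        op();
    } catch (const InvalidMemoryAccess& e) {
        rec.kind = ExceptionRecord::MemoryAccess;
        rec.address = e.address;
    } catch (const UndefinedInstruction&) {
        rec.kind = ExceptionRecord::UndefinedInsn;
    } catch (...) {
        rec.kind = ExceptionRecord::Other;
    }
    return rec;
}

// Caller side of the boundary: rethrow the recorded exception, if any.
void rethrowIfAny(const ExceptionRecord& rec) {
    switch (rec.kind) {
        case ExceptionRecord::None:          return;
        case ExceptionRecord::MemoryAccess:  throw InvalidMemoryAccess(rec.address);
        case ExceptionRecord::UndefinedInsn: throw UndefinedInstruction();
        case ExceptionRecord::Other:         throw KernelException("unclassified exception");
    }
}

// A simulated fault crosses the boundary and is handled through a base class.
std::string describeFault() {
    ExceptionRecord rec = runInDomain([] { raiseForTrap(Trap::DataAbort, 0xdeadbeefUL); });
    assert(rec.kind != ExceptionRecord::None);  // the fault must have been recorded
    try {
        rethrowIfAny(rec);
    } catch (const HardwareException& e) {
        return std::string("hardware: ") + e.what();
    } catch (const KernelException& e) {
        return std::string("software: ") + e.what();
    }
    return "no fault";
}
```

Because InvalidMemoryAccess derives from HardwareException, the caller can handle it with a single base-class handler, which is the kind of unified treatment the abstract describes. Flattening the exception into a record at the boundary is one possible way to realise catch-rethrow and is assumed here only for illustration.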


Keywords: Kernel Process; Exception Handling; Page Fault; Operating System Principle; Operating System Kernel
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Francis M. David (1)
  • Jeffrey C. Carlyle (1)
  • Ellick M. Chan (1)
  • David K. Raila (1)
  • Roy H. Campbell (1)

  1. University of Illinois at Urbana-Champaign, Urbana, USA
