Using Process Restarts to Improve Dynamic Provisioning

  • Raquel V. Lopes
  • Walfredo Cirne
  • Francisco V. Brasileiro
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3278)


Load variations are unexpected perturbations that can degrade performance or even make a system unavailable. Existing efforts attempt to provision resources dynamically to accommodate load fluctuations while applications execute. However, these efforts do not consider software faults, whose effects can influence application behavior and quality of service and may mislead a dynamic provisioning system. When tackling both problems simultaneously, the fundamental issue is how to differentiate a saturated application from a faulty one. The contributions of this paper are threefold. First, we introduce the idea of taking software faults into account when specifying a dynamic provisioning scheme. Second, we define a simple algorithm that can be used to distinguish saturated from faulty software. Implementing this algorithm makes it possible to realize dynamic provisioning with restarts in a full server infrastructure data center. Finally, we implement this algorithm and experimentally demonstrate its efficacy.


Keywords: dynamic provisioning, software faults, restart, n-tier applications



Copyright information

© IFIP International Federation for Information Processing 2004

Authors and Affiliations

  • Raquel V. Lopes (1)
  • Walfredo Cirne (1)
  • Francisco V. Brasileiro (1)
  1. Departamento de Sistemas e Computação, Universidade Federal de Campina Grande, Coordenação de Pós-graduação em Engenharia Elétrica, Campina Grande, Brazil
