I-Queue: Smart Queues for Service Management

  • Mohamed S. Mansour
  • Karsten Schwan
  • Sameh Abdelaziz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4294)


Modern enterprise applications and systems are characterized by complex underlying software structures, constantly evolving feature sets, and frequent changes in the data on which they operate. The dynamic nature of these applications and systems poses substantial challenges to their use and management, suggesting the need for automated solutions. This paper considers a specific class of dynamic changes: large data updates that reflect changes in the current state of the business, where such updates can occur multiple times per day. The paper then presents techniques, and their middleware implementation, for automatically managing request streams directed at server applications subjected to dynamic data updates, the goal being to improve application reliability in the face of evolving feature sets and business data. These techniques (1) automatically detect input patterns that lead to performance degradation or failures and then (2) use these detections to trigger application-specific methods that control input patterns to avoid, or at least defer, such undesirable phenomena. Lab experiments using actual traces from Worldspan show a 16% decrease in the frequency of server restarts when using these techniques, at negligible additional overhead and within delays suitable for the rates of change experienced by this application.
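The two-step idea in the abstract, detecting a problematic input pattern and then controlling the input stream in response, can be illustrated with a toy sketch. This is not the paper's implementation: the class name `SmartQueue`, the sliding-window share test, and the parameters `window` and `max_share` are all assumptions chosen for illustration, and a simple frequency threshold stands in for whatever detection method the authors actually use.

```python
from collections import Counter, deque


class SmartQueue:
    """Toy queue that defers message kinds over-represented in recent traffic.

    Hypothetical illustration only; names and thresholds are assumptions.
    """

    def __init__(self, window=4, max_share=0.5):
        self.recent = deque(maxlen=window)   # kinds of the last `window` arrivals
        self.ready = deque()                 # messages forwarded in arrival order
        self.deferred = deque()              # messages held back for later
        self.max_share = max_share           # assumed detection threshold

    def is_suspect(self, kind):
        # Step 1 (detection): flag a kind once the window is full and the kind
        # exceeds its allowed share of recent arrivals.
        if len(self.recent) < self.recent.maxlen:
            return False
        return Counter(self.recent)[kind] / len(self.recent) > self.max_share

    def put(self, kind, payload):
        self.recent.append(kind)
        # Step 2 (control): defer suspect messages instead of forwarding them.
        target = self.deferred if self.is_suspect(kind) else self.ready
        target.append((kind, payload))

    def get(self):
        # Serve normal traffic first; fall back to deferred messages.
        if self.ready:
            return self.ready.popleft()
        if self.deferred:
            return self.deferred.popleft()
        return None
```

In this sketch the control action is simple deferral; an application-specific method in the sense of the abstract could instead reorder, throttle, or reroute the flagged messages.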


IEEE Computer Society, Autonomic Computing, Enterprise Application, Input Message, Message Queue
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Mohamed S. Mansour (1)
  • Karsten Schwan (1)
  • Sameh Abdelaziz (2)
  1. The College of Computing at Georgia Tech, Atlanta, USA
  2. Worldspan, L.P., Atlanta, USA
