Abstract
Service-oriented computing has enabled developers to build large, cross-domain service compositions in a more routine manner. These systems inhabit complex, multi-tier operating environments that pose many challenges to their reliable operation. Unanticipated failures at runtime can be time-consuming to diagnose and may propagate across administrative boundaries. It has been argued that measuring readily available data about system operation can significantly increase the failure management capabilities of such systems. We have built an online monitoring system for cross-domain Web service compositions called Monere, which we use in a controlled experiment involving human operators in order to determine the effects of such an approach on diagnosis times for system-level failures. This paper gives an overview of how Monere is able to instrument relevant components across all layers of a service composition and to exploit the structure of BPEL workflows to obtain structural cross-domain dependency graphs. Our experiments reveal a reduction in diagnosis time of more than 20%. However, further analysis reveals this benefit to be dependent on certain conditions, which leads to insights about promising directions for effective support of failure diagnosis in large Web service compositions.
Keywords
- Diagnosis Time
- Service Composition
- Dependency Graph
- Business Process Execution Language
- Collection Interval
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wassermann, B., Emmerich, W. (2011). Monere: Monitoring of Service Compositions for Failure Diagnosis. In: Kappel, G., Maamar, Z., Motahari-Nezhad, H.R. (eds) Service-Oriented Computing. ICSOC 2011. Lecture Notes in Computer Science, vol 7084. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25535-9_23
DOI: https://doi.org/10.1007/978-3-642-25535-9_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25534-2
Online ISBN: 978-3-642-25535-9
eBook Packages: Computer Science, Computer Science (R0)