Mining Hot Clusters of Similar Anomalies for System Management

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6230)

Abstract

Automatic system management has recently drawn much attention to mining system log files for anomaly detection, diagnosis, and prediction. An important problem in this area is mining hot clusters of similar anomalies. A hot anomaly cluster is defined as a largest-sized group of similar anomalies whose similarity satisfies user-specified constraints. Because some major anomalies have common symptoms and are shared by several hot clusters, these clusters need not be disjoint. Consequently, the problem cannot be solved easily by existing clustering algorithms such as k-means and EM. In this paper we propose a novel heuristic clustering algorithm, named Hot Clustering (HC), for mining these patterns. The key idea of HC is to group neighboring anomalies into hot clusters based on heuristic rules. To validate our approach, we run experiments on bug reports from the Bugzilla database using k-means, EM, and HC. The experimental results show that our approach is both efficient and effective for this problem.
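To make the idea of non-disjoint clusters concrete, the following minimal sketch enumerates overlapping groups of similar reports. It is not the HC algorithm, whose heuristic rules are not given in the abstract. It rests on several assumptions made only for illustration: each anomaly is a set of keywords, similarity is the Jaccard index, the user-specified constraint is a pairwise similarity threshold, and a "largest-sized group" is read as a maximal group in which every pair meets that threshold. The names `jaccard`, `hot_groups`, and `reports`, and the sample data, are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two keyword sets (assumed similarity measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def hot_groups(items, threshold=0.5, min_size=2):
    """Return all maximal groups whose members are pairwise similar.

    Groups may overlap: one item can belong to several groups.
    Uses plain Bron-Kerbosch clique enumeration, which is only
    practical for small inputs.
    """
    n = len(items)
    # Similarity graph: an edge joins two items that meet the threshold.
    adj = {
        i: {j for j in range(n)
            if j != i and jaccard(items[i], items[j]) >= threshold}
        for i in range(n)
    }
    groups = []

    def expand(current, candidates, excluded):
        # A group is maximal when nothing can extend it.
        if not candidates and not excluded:
            groups.append(sorted(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(range(n)), set())
    kept = [g for g in groups if len(g) >= min_size]
    # Largest groups first, ties broken by member indices.
    return sorted(kept, key=lambda g: (-len(g), g))


# Hypothetical bug reports, reduced to keyword sets.
reports = [
    {"crash", "startup", "profile"},
    {"crash", "startup", "plugin"},
    {"crash", "plugin", "flash"},
    {"render", "css", "font"},
]
groups = hot_groups(reports, threshold=0.5, min_size=2)
print(groups)
```

In this toy data, report 1 shares symptoms with both report 0 and report 2, while reports 0 and 2 are not similar enough to each other, so report 1 appears in two groups. A partitioning method that assigns each report to exactly one cluster would have to split one of these groups.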



Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, D., Lin, F., Shi, Z., Huang, H. (2010). Mining Hot Clusters of Similar Anomalies for System Management. In: Zhang, BT., Orgun, M.A. (eds) PRICAI 2010: Trends in Artificial Intelligence. PRICAI 2010. Lecture Notes in Computer Science, vol 6230. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15246-7_34

  • DOI: https://doi.org/10.1007/978-3-642-15246-7_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15245-0

  • Online ISBN: 978-3-642-15246-7

  • eBook Packages: Computer Science, Computer Science (R0)
