One Graph Is Worth a Thousand Logs: Uncovering Hidden Structures in Massive System Event Logs

  • Michal Aharon
  • Gilad Barash
  • Ira Cohen
  • Eli Mordechai
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5781)


In this paper we describe our work on pattern discovery in system event logs. To discover these patterns, we developed two novel algorithms. The first is an efficient sequential text-clustering algorithm that automatically discovers the templates generating the messages. The second, the PARIS algorithm (Principle Atom Recognition In Sets), discovers patterns of messages that represent processes occurring in the system. We demonstrate the usefulness of our analysis on real-world logs from various systems for debugging complex systems, efficient search and visualization of logs, and characterization of system behavior.
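
The abstract does not spell out how the template-discovery step works, so the following is only an illustrative sketch of what a sequential, single-pass template clustering of log messages could look like. The similarity measure (fraction of agreeing word positions), the 0.6 threshold, the `*` wildcard, and the names `similarity` and `cluster_messages` are all assumptions made for this example. They are not taken from the paper.

```python
from typing import List, Tuple

WILDCARD = "*"  # hypothetical marker for a variable word position


def similarity(template: List[str], tokens: List[str]) -> float:
    """Fraction of word positions where the message agrees with the template.

    Wildcard positions in the template count as agreement. Messages with a
    different number of words are treated as not similar at all (assumption).
    """
    if len(template) != len(tokens):
        return 0.0
    if not tokens:
        return 1.0
    matches = sum(1 for t, w in zip(template, tokens) if t == WILDCARD or t == w)
    return matches / len(tokens)


def cluster_messages(messages: List[str], threshold: float = 0.6) -> List[Tuple[str, int]]:
    """Assign each message, in one pass, to the best-matching template.

    A message joins the most similar existing template if the score reaches
    the threshold; otherwise it starts a new template. When a message joins,
    positions that disagree are replaced by the wildcard.
    """
    templates: List[List[str]] = []
    counts: List[int] = []
    for message in messages:
        tokens = message.split()
        best_idx, best_score = -1, 0.0
        for i, template in enumerate(templates):
            score = similarity(template, tokens)
            if score > best_score:
                best_idx, best_score = i, score
        if best_idx >= 0 and best_score >= threshold:
            # Generalize the template at positions where the words differ.
            templates[best_idx] = [
                t if t == w else WILDCARD
                for t, w in zip(templates[best_idx], tokens)
            ]
            counts[best_idx] += 1
        else:
            templates.append(tokens)
            counts.append(1)
    return [(" ".join(t), c) for t, c in zip(templates, counts)]
```

Under these assumptions, feeding in messages such as "Connection from 10.0.0.1 closed" and "Connection from 10.0.0.2 closed" yields one template with a wildcard in the address position. Each message is compared only against the current templates, not against all earlier messages, which is what keeps a scheme of this kind sequential.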





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Michal Aharon (1)
  • Gilad Barash (1)
  • Ira Cohen (1)
  • Eli Mordechai (1)

  1. HP-Labs Israel, Technion City, Israel
