SBAD: Sequence Based Attack Detection via Sequence Comparison

  • Conference paper
Privacy and Security Issues in Data Mining and Machine Learning (PSDML 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6549)

Abstract

Given a stream of time-stamped events, such as alerts in a network monitoring setting, how can we isolate a sequence of alerts that forms a network attack? We propose a Sequence Based Attack Detection (SBAD) method, which makes the following contributions: (a) it automatically identifies groups of alerts that are frequent; (b) it summarizes them into a suspicious sequence of activity, representing them with graph structures; and (c) it suggests a novel graph-based dissimilarity measure. As a whole, SBAD is able to group suspicious alerts, visualize them, and spot anomalies at the sequence level. Evaluations on three datasets (two benchmark datasets, DARPA 1999 and PKDD 2007, and a private dataset, Acer 2007, gathered from a Security Operation Center in Taiwan) support our approach. The method performs well even without IP and payload information. Because it requires no privacy-sensitive information as input, the method is easy to plug into existing systems such as intrusion detectors. In terms of efficiency, the proposed method can handle large-scale problems, for example processing 300K alerts within 20 minutes on a regular PC.
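
To make the idea of comparing alert sequences through graphs more concrete, the following is a minimal sketch and not the measure defined in the paper, which the abstract does not specify. It assumes each sequence is summarized as a directed graph whose edges count transitions between consecutive alert types, and it substitutes a weighted Jaccard distance as the dissimilarity. The function names `transition_graph` and `graph_dissimilarity` and the alert labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def transition_graph(alerts: Sequence[Hashable]) -> Counter:
    """Summarize a sequence as edge counts between consecutive alert types.

    Assumption: a first-order transition graph; the paper's graph
    construction may differ.
    """
    return Counter(zip(alerts, alerts[1:]))


def graph_dissimilarity(g1: Counter, g2: Counter) -> float:
    """Weighted Jaccard distance between two edge-count graphs.

    Returns 0.0 for identical graphs and 1.0 for graphs sharing no edge.
    This is a stand-in measure chosen for illustration only.
    """
    edges = set(g1) | set(g2)
    if not edges:
        return 0.0
    overlap = sum(min(g1[e], g2[e]) for e in edges)
    total = sum(max(g1[e], g2[e]) for e in edges)
    return 1.0 - overlap / total


# Hypothetical alert-type sequences (no IP or payload fields are used).
normal = ["login", "read", "read", "logout"]
similar = ["login", "read", "logout"]
suspicious = ["scan", "scan", "exploit", "login"]

g_normal = transition_graph(normal)
d_similar = graph_dissimilarity(g_normal, transition_graph(similar))
d_suspicious = graph_dissimilarity(g_normal, transition_graph(suspicious))
```

Under these assumptions, a sequence that shares most transitions with the reference gets a small distance, while a sequence with entirely different transitions gets the maximum distance, which is the kind of sequence-level contrast an anomaly detector could threshold on.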

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mao, CH., Pao, HK., Faloutsos, C., Lee, HM. (2011). SBAD: Sequence Based Attack Detection via Sequence Comparison. In: Dimitrakakis, C., Gkoulalas-Divanis, A., Mitrokotsa, A., Verykios, V.S., Saygin, Y. (eds) Privacy and Security Issues in Data Mining and Machine Learning. PSDML 2010. Lecture Notes in Computer Science, vol 6549. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19896-0_7

  • DOI: https://doi.org/10.1007/978-3-642-19896-0_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19895-3

  • Online ISBN: 978-3-642-19896-0

  • eBook Packages: Computer Science, Computer Science (R0)
