Leveraging Event Structure for Adaptive Machine Learning on Big Data Landscapes

  • Conference paper
Mobile, Secure, and Programmable Networking (MSPN 2015)

Part of the book series: Lecture Notes in Computer Science (LNCCN, volume 9395)

Abstract

Modern machine learning techniques have been applied to many aspects of network analytics to discover patterns that clarify or better demonstrate the behavior of users and systems within a given network. Often the information to be processed must first be converted to a different type before machine learning algorithms can process it. To accurately process the information generated by systems within a network, the true intention and meaning behind the information must be taken into account. In this paper we propose different approaches for mapping network information, such as IP addresses, to integer values that attempt to keep the relations present in the original format of the information intact. With one exception, all of the proposed mappings produce outputs of at most 64 bits, which allows atomic operations on CPUs with 64-bit registers; the output size is restricted in the interest of performance. Additionally, we demonstrate the benefits of the new mappings for one specific machine learning algorithm (k-means) and compare the algorithm’s results for datasets with and without the proposed transformations.
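The abstract does not spell out the proposed mappings, so the following is only a minimal sketch of the general idea, assuming the most common relation-preserving choice for IPv4: reading the four octets as one big-endian 32-bit integer. The function names `ip_to_int` and `shared_prefix_bits` are hypothetical and are not taken from the paper. Under this assumption, addresses in the same subnet map to nearby integers, and the result fits well within the 64-bit limit mentioned above.

```python
import ipaddress


def ip_to_int(address: str) -> int:
    """Map an IPv4 address to its 32-bit big-endian integer value.

    Illustrative assumption only; not necessarily one of the paper's mappings.
    """
    return int(ipaddress.IPv4Address(address))


def shared_prefix_bits(a: str, b: str) -> int:
    """Count the leading bits that two IPv4 addresses have in common."""
    diff = ip_to_int(a) ^ ip_to_int(b)
    # bit_length() gives the position of the highest differing bit.
    return 32 - diff.bit_length()


if __name__ == "__main__":
    for addr in ("10.0.0.1", "10.0.0.2", "192.168.1.10"):
        print(addr, ip_to_int(addr))
    print(shared_prefix_bits("10.0.0.1", "10.0.0.2"))
```

Because the integer ordering follows the address hierarchy, a distance-based algorithm such as k-means would treat `10.0.0.1` and `10.0.0.2` as close neighbours, which a hash of the address string would not guarantee.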

Author information

Corresponding author

Correspondence to Amir Azodi.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Azodi, A., Gawron, M., Sapegin, A., Cheng, F., Meinel, C. (2015). Leveraging Event Structure for Adaptive Machine Learning on Big Data Landscapes. In: Boumerdassi, S., Bouzefrane, S., Renault, É. (eds) Mobile, Secure, and Programmable Networking. MSPN 2015. Lecture Notes in Computer Science, vol 9395. Springer, Cham. https://doi.org/10.1007/978-3-319-25744-0_3

  • DOI: https://doi.org/10.1007/978-3-319-25744-0_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25743-3

  • Online ISBN: 978-3-319-25744-0

  • eBook Packages: Computer Science, Computer Science (R0)
