One-Against-All Methodology for Features Selection and Classification of Internet Applications

  • Conference paper
IP Operations and Management (IPOM 2009)

Part of the book series: Lecture Notes in Computer Science (LNCCN, volume 5843)

Abstract

Classifying traffic by Internet application, even in off-line mode, is useful for many purposes, such as attack identification, QoS prioritization, network capacity planning, and computer forensics. In classification problems, it is well known that a larger number of discriminators does not necessarily increase discrimination power. This work investigates a methodology for feature selection and Internet traffic classification in which the problem of classifying one among M classes is split into M one-against-all binary classification problems, with each binary problem possibly adopting a different set of discriminators. Different combinations of discriminator selection methods, classification methods, and decision algorithms can be embedded in the methodology. To investigate its performance, we used the Naïve Bayes classifier both to select the set of discriminators and to perform the classification. The proposed method aims to reduce the total number of distinct discriminators used in the classification problem. The methodology was tested on the classification of traffic flows, and the experimental results showed that the number of discriminators per class can be reduced significantly while sustaining the same accuracy level.
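
The one-against-all decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the selection procedure, the likelihood model, or the decision algorithm, so the Gaussian likelihoods, the greedy forward selection scored by training accuracy, the largest-log-odds decision rule, and the synthetic four-feature "flows" are all assumptions made for illustration. The names `BinaryNB`, `select_features`, `train_one_against_all`, and `predict` are hypothetical.

```python
import math
import random


def fit_gaussian(rows, feats):
    """Estimate (mean, variance) per selected feature; Gaussian likelihoods are an assumption."""
    stats = {}
    for f in feats:
        vals = [r[f] for r in rows]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals) + 1e-6  # floor avoids zero variance
        stats[f] = (mean, var)
    return stats


def log_likelihood(x, stats):
    """Naive Bayes log-likelihood: features are treated as independent."""
    return sum(-0.5 * math.log(2 * math.pi * var) - (x[f] - mean) ** 2 / (2 * var)
               for f, (mean, var) in stats.items())


class BinaryNB:
    """One-against-all naive Bayes for a single target class on a feature subset."""

    def __init__(self, X, y, target, feats):
        pos = [x for x, c in zip(X, y) if c == target]
        neg = [x for x, c in zip(X, y) if c != target]
        self.pos = fit_gaussian(pos, feats)
        self.neg = fit_gaussian(neg, feats)
        self.log_prior_odds = math.log(len(pos) / len(neg))

    def score(self, x):
        """Log-odds that flow x belongs to the target class rather than the rest."""
        return self.log_prior_odds + log_likelihood(x, self.pos) - log_likelihood(x, self.neg)


def binary_accuracy(model, X, y, target):
    """Fraction of flows on which the binary target-vs-rest decision is correct."""
    hits = sum((model.score(x) > 0) == (c == target) for x, c in zip(X, y))
    return hits / len(X)


def select_features(X, y, target, max_feats):
    """Greedy forward selection scored by training accuracy (hypothetical strategy)."""
    n_feats = len(X[0])
    chosen, best = [], 0.0
    while len(chosen) < min(max_feats, n_feats):
        acc, f = max((binary_accuracy(BinaryNB(X, y, target, chosen + [f]), X, y, target), f)
                     for f in range(n_feats) if f not in chosen)
        if acc <= best:
            break  # no remaining discriminator improves this binary problem
        chosen.append(f)
        best = acc
    return chosen


def train_one_against_all(X, y, max_feats=2):
    """Fit one binary model per class, each with its own discriminator subset."""
    models = {}
    for target in sorted(set(y)):
        feats = select_features(X, y, target, max_feats)
        models[target] = (feats, BinaryNB(X, y, target, feats))
    return models


def predict(models, x):
    """Decision rule (assumed): pick the class whose binary model gives the largest log-odds."""
    return max(models, key=lambda c: models[c][1].score(x))


# Synthetic "flows": class k is shifted on feature k; feature 3 is pure noise.
rng = random.Random(0)
X, y = [], []
for k in range(3):
    for _ in range(50):
        X.append([rng.gauss(5.0 if f == k else 0.0, 1.0) for f in range(4)])
        y.append(k)

models = train_one_against_all(X, y)
accuracy = sum(predict(models, x) == c for x, c in zip(X, y)) / len(X)
for k, (feats, _) in models.items():
    print(f"class {k}: discriminators {feats}")
print(f"training accuracy: {accuracy:.2f}")
```

Because each binary problem keeps only the discriminators that help separate its own class from the rest, the per-class subsets can be much smaller than the full feature set, which is the effect the abstract reports. A real evaluation would score selection and accuracy on held-out flows rather than on the training data used here.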

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bessa Maia, J.E., Holanda Filho, R. (2009). One-Against-All Methodology for Features Selection and Classification of Internet Applications. In: Nunzi, G., Scoglio, C., Li, X. (eds) IP Operations and Management. IPOM 2009. Lecture Notes in Computer Science, vol 5843. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04968-2_3

  • DOI: https://doi.org/10.1007/978-3-642-04968-2_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04967-5

  • Online ISBN: 978-3-642-04968-2

  • eBook Packages: Computer Science, Computer Science (R0)
