One-Against-All Methodology for Features Selection and Classification of Internet Applications

  • José Everardo Bessa Maia
  • Raimir Holanda Filho
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5843)

Abstract

Classifying traffic by Internet application, even in off-line mode, is useful for many tasks, such as attack identification, QoS prioritization, network capacity planning, and computer forensics. In classification problems, it is well known that a larger number of discriminators does not necessarily increase discrimination power. This work investigates a methodology for feature selection and Internet traffic classification in which the problem of classifying one among M classes is split into M one-against-all binary classification problems, with each binary problem possibly adopting a different set of discriminators. Different combinations of discriminator selection methods, classification methods, and decision algorithms can be embedded in the methodology. To investigate its performance, we used the Naïve Bayes classifier both to select the set of discriminators and to perform classification. The proposed method aims to reduce the total number of distinct discriminators used in the classification problem. The methodology was tested on the classification of traffic flows, and the experimental results showed that the number of discriminators per class can be reduced significantly while sustaining the same accuracy level.
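
The one-against-all decomposition described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' procedure: the synthetic data, the class names, the Gaussian form of Naïve Bayes, the greedy forward selection scored on training accuracy, the cap of two discriminators per class, and the argmax-of-log-odds decision rule. A real study would score candidate discriminators on held-out data.

```python
import math
import random

CLASSES = ["web", "mail", "p2p"]  # hypothetical class names for the synthetic data

def fit_gnb(X, y, feats):
    """Fit a binary Gaussian Naive Bayes model using only the feature indices in feats."""
    model = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        stats = []
        for f in feats:
            vals = [r[f] for r in rows]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals) + 1e-6  # avoid zero variance
            stats.append((mean, var))
        model[label] = (len(rows) / len(X), stats)  # (class prior, per-feature Gaussians)
    return model

def log_odds(model, x, feats):
    """Log posterior odds of 'this class' (1) against 'all others' (0)."""
    total = 0.0
    for label, sign in ((1, 1.0), (0, -1.0)):
        prior, stats = model[label]
        lp = math.log(prior)
        for f, (mean, var) in zip(feats, stats):
            lp += -0.5 * math.log(2 * math.pi * var) - (x[f] - mean) ** 2 / (2 * var)
        total += sign * lp
    return total

def select_features(X, y, max_feats):
    """Greedy forward selection (an assumed method): add the feature that most
    improves training accuracy of the binary model, stop when nothing improves."""
    chosen, best_acc = [], 0.0
    while len(chosen) < max_feats:
        best_f = None
        for f in range(len(X[0])):
            if f in chosen:
                continue
            feats = chosen + [f]
            model = fit_gnb(X, y, feats)
            hits = sum((log_odds(model, x, feats) > 0) == (t == 1) for x, t in zip(X, y))
            if hits / len(X) > best_acc:
                best_acc, best_f = hits / len(X), f
        if best_f is None:
            break
        chosen.append(best_f)
    return chosen

def train_one_vs_all(X, labels, max_feats=2):
    """Train one binary problem per class, each with its own discriminator subset."""
    models = {}
    for c in sorted(set(labels)):
        y = [1 if l == c else 0 for l in labels]
        feats = select_features(X, y, max_feats)
        models[c] = (feats, fit_gnb(X, y, feats))
    return models

def predict(models, x):
    """Assumed decision rule: pick the class whose binary model gives the highest log-odds."""
    return max(models, key=lambda c: log_odds(models[c][1], x, models[c][0]))

def make_data(n_per_class=60, seed=0):
    """Synthetic flows: feature i separates class i; feature 3 is pure noise."""
    rng = random.Random(seed)
    X, labels = [], []
    for i, c in enumerate(CLASSES):
        for _ in range(n_per_class):
            x = [rng.gauss(0, 1) for _ in range(4)]
            x[i] += 5.0
            X.append(x)
            labels.append(c)
    return X, labels

X, labels = make_data()
models = train_one_vs_all(X, labels)
accuracy = sum(predict(models, x) == l for x, l in zip(X, labels)) / len(X)
for c in CLASSES:
    print(c, "uses discriminators", models[c][0])
print("training accuracy:", round(accuracy, 3))
```

In this toy setting each binary problem can settle on a small subset of the four available discriminators, which mirrors the abstract's point that the number of discriminators per class can shrink without giving up accuracy.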

Keywords

Traffic classification; features selection; statistical discriminators

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • José Everardo Bessa Maia (1)
  • Raimir Holanda Filho (1)

  1. Department of Statistics and Computing, State Univ. of Ceará – UECE, Master’s Course in Applied Computer Sciences, Univ. of Fortaleza – UNIFOR
