Classification of Encrypted Internet Traffic Using Kullback-Leibler Divergence and Euclidean Distance
Abstract
Traditional classification methods based on port numbers and payload inspection struggle with encrypted or obfuscated Internet traffic, which has led to significant research on Machine Learning classifiers that use Transport Layer statistical features. These approaches have limitations of their own, however, motivating the investigation of alternatives, including statistics-based approaches. Statistical approaches are attractive because they can operate in real time and do not need to be retrained each time a new type of traffic appears. In this article, we propose two statistical classifiers for encrypted Internet traffic, one based on Kullback-Leibler divergence and one based on Euclidean distance, both computed using the flow and packet size obtained from some of the protocols used by applications. In our experiments, we evaluate the two proposed classifiers and compare them with a classifier based on Support Vector Machine (SVM). We were able to classify the traffic using only a few features without compromising the performance of the classifier. The experimental results illustrate the effectiveness of the proposed models for traffic classification.
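To make the idea of a distance-based statistical classifier concrete, the following is a minimal sketch and not the implementation evaluated in the article. It assumes that each traffic class is summarized by a reference packet-size histogram (a "signature") and that an observed flow is assigned to the class whose signature is closest under either Kullback-Leibler divergence or Euclidean distance. The bin layout, the smoothing constant, the class labels, the sample packet sizes, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

# Assumption: a small additive constant keeps every bin non-zero so that the
# logarithm in the Kullback-Leibler divergence is always defined.
SMOOTHING = 1e-6


def packet_size_histogram(sizes, bin_edges):
    """Return a smoothed, normalized histogram of packet sizes."""
    counts, _ = np.histogram(sizes, bins=bin_edges)
    probs = counts.astype(float) + SMOOTHING
    return probs / probs.sum()


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    return float(np.sum(p * np.log(p / q)))


def euclidean_distance(p, q):
    """Euclidean distance between two distributions seen as vectors."""
    return float(np.sqrt(np.sum((p - q) ** 2)))


def classify(sizes, signatures, bin_edges, distance):
    """Return the label of the signature closest to the observed flow."""
    observed = packet_size_histogram(sizes, bin_edges)
    return min(signatures, key=lambda label: distance(observed, signatures[label]))


# Hypothetical bins: 100-byte buckets covering 0 to 1500 bytes.
bin_edges = np.arange(0, 1600, 100)

# Hypothetical reference signatures built from made-up packet sizes.
signatures = {
    "voip": packet_size_histogram([120, 140, 160, 150, 130, 170], bin_edges),
    "bulk": packet_size_histogram([1400, 1450, 1460, 1380, 1420, 60], bin_edges),
}

# Made-up observed flow dominated by small packets.
flow = [125, 155, 145, 165, 135]
label_kl = classify(flow, signatures, bin_edges, kl_divergence)
label_eu = classify(flow, signatures, bin_edges, euclidean_distance)
print(label_kl, label_eu)
```

A classifier of this kind needs no retraining step: supporting a new traffic type only requires adding one more entry to `signatures`, which matches the motivation given above for statistical approaches.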
Keywords
Traffic classification, Encrypted Internet traffic, Kullback-Leibler divergence, Euclidean distance