Generation of Network Traffic Using WGAN-GP and a DFT Filter for Resolving Data Imbalance

  • WooHo Lee
  • BongNam Noh
  • YeonSu Kim
  • KiMoon Jeong
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11874)

Abstract

The intrinsic features of Internet networks lead to imbalanced class distributions when datasets are assembled, a phenomenon called class imbalance that is attracting increasing attention in many research fields. Despite the performance losses it causes, class imbalance has not been thoroughly studied in network traffic classification; some previous works consider only a few solutions and/or adopt misleading methodological approaches. In this study, we propose a method for generating network attack traffic to address data imbalance in training datasets. Traffic data were analyzed using deep packet inspection, and features were extracted based on common traffic characteristics. Similar malicious traffic was then generated for classes with low data counts using a Wasserstein generative adversarial network with gradient penalty (WGAN-GP). In the experiments, the accuracy on each dataset improved by approximately 5% and the false detection rate fell by approximately 8%. These results show that learning and classification in deep-learning-based intrusion detection systems can be improved by resolving the performance degradation that data imbalance causes in their training datasets.
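To make the gradient-penalty idea concrete, the following is a minimal sketch of a WGAN-GP critic loss in plain Python. It is not the authors' implementation: the toy critic `tanh(w . x)`, the function names, and the penalty weight `lam=10.0` are all assumptions chosen for illustration, and a real system would use a neural-network critic with automatic differentiation.

```python
import math
import random

def critic(w, x):
    """Toy critic f(x) = tanh(w . x); a hypothetical stand-in for a neural network."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))

def critic_grad_norm(w, x):
    """L2 norm of df/dx for the toy critic, derived analytically:
    df/dx = (1 - tanh(w . x)^2) * w."""
    s = sum(wi * xi for wi, xi in zip(w, x))
    scale = 1.0 - math.tanh(s) ** 2
    return abs(scale) * math.sqrt(sum(wi * wi for wi in w))

def wgan_gp_critic_loss(w, real, fake, lam=10.0, rng=None):
    """Critic loss: mean f(fake) - mean f(real) + lam * mean (||grad f(x_hat)|| - 1)^2.

    `real` and `fake` are equal-length lists of feature vectors.
    `lam` is an assumed penalty weight, not a value taken from the paper.
    """
    rng = rng or random.Random(0)
    n = len(real)
    wasserstein = (sum(critic(w, f) for f in fake) - sum(critic(w, r) for r in real)) / n
    penalty = 0.0
    for r, f in zip(real, fake):
        eps = rng.random()  # random point on the line between a real and a fake sample
        x_hat = [eps * ri + (1.0 - eps) * fi for ri, fi in zip(r, f)]
        penalty += (critic_grad_norm(w, x_hat) - 1.0) ** 2
    return wasserstein + lam * penalty / n

# Example: two real and two generated feature vectors (made-up numbers).
real_batch = [[1.0, 0.5], [0.8, 0.7]]
fake_batch = [[0.1, 0.2], [0.3, 0.0]]
loss = wgan_gp_critic_loss([0.5, -0.5], real_batch, fake_batch)
```

The penalty term pushes the critic's gradient norm toward 1 at points interpolated between real and generated samples, which is the mechanism that distinguishes WGAN-GP from the weight clipping used in the original WGAN.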

Keywords

Deep learning, Intrusion detection, Security, Generative adversarial network

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Chonnam National University, Gwangju, Republic of Korea
  2. HPC Cloud Team in Korea Institute of Science and Technology Information, Daejeon, South Korea
