Class Balanced Similarity-Based Instance Transfer Learning for Botnet Family Classification

  • Basil Alothman (corresponding author)
  • Helge Janicke
  • Suleiman Y. Yerima
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11198)

Abstract

Transfer learning algorithms have gained attention over the last decade as a way to improve the performance of machine learning algorithms. In this paper we extend and evaluate our novel approach, Similarity Based Instance Transfer Learning (SBIT). The extended version is denoted Class Balanced SBIT (CB-SBIT for short) because it ensures that the dataset produced by instance transfer is class balanced. Using network traffic data, we compare CB-SBIT against the original SBIT algorithm and against the classical Synthetic Minority Over-sampling Technique (SMOTE). Using text data, we compare it against the open source transfer learning algorithm TransferBoost. Our results show that CB-SBIT outperforms the original SBIT and SMOTE across varying sizes of network traffic data but falls short of TransferBoost on text data.

Keywords

Similarity-based transfer learning; Botnet detection; SMOTE; TransferBoost

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Basil Alothman (1, 2), corresponding author
  • Helge Janicke (1, 2)
  • Suleiman Y. Yerima (1, 2)

  1. De Montfort University, Leicester, UK
  2. Faculty of Technology, De Montfort University, Leicester, UK
