Encrypted Traffic Classification Based on an Improved Clustering Algorithm

  • Meng Zhang
  • Hongli Zhang
  • Bo Zhang
  • Gang Lu
Part of the Communications in Computer and Information Science book series (CCIS, volume 320)


Classifying network traffic by port number or payload is becoming increasingly difficult in applications ranging from security to quality-of-service measurement, because dynamic port numbers, masquerading, and various cryptographic techniques are used to avoid detection. Research has therefore turned to analyzing statistical flow features with machine learning techniques. Clustering approaches do not require a complex training procedure or a large memory cost. However, clustering algorithms such as k-Means still have disadvantages of their own. We propose a novel approach that uses the harmonic mean as the distance metric, and evaluate it in terms of three metrics on real-world encrypted traffic. The results show better classification performance than previous approaches.
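
The abstract does not spell out how the harmonic mean enters the clustering step. As a rough illustration only, the sketch below follows the well-known K-Harmonic Means formulation, in which each point contributes the harmonic mean of its distances to all centers and the centers are updated as weighted means. This is an assumed stand-in, not necessarily the authors' method; the function name `khm_cluster`, the exponent `p`, the `eps` floor, and the toy flow features are all hypothetical.

```python
import math
import random


def khm_cluster(points, k, p=3.5, iters=50, init=None, seed=0, eps=1e-9):
    """K-Harmonic Means sketch (assumed formulation). Returns (centers, labels)."""
    if init is None:
        # Hypothetical default: pick k distinct points as starting centers.
        init = random.Random(seed).sample(points, k)
    centers = [list(c) for c in init]
    dim = len(points[0])
    for _ in range(iters):
        num = [[0.0] * dim for _ in range(k)]
        den = [0.0] * k
        for x in points:
            # Distances to every center, floored at eps to avoid dividing by zero.
            d = [max(math.dist(x, c), eps) for c in centers]
            s = sum(dj ** -p for dj in d)
            for j, dj in enumerate(d):
                # Weight from the gradient of sum_i k / sum_j d_ij^(-p).
                w = dj ** (-p - 2) / (s * s)
                den[j] += w
                for t in range(dim):
                    num[j][t] += w * x[t]
        # Each center becomes the weighted mean of all points.
        centers = [[num[j][t] / den[j] for t in range(dim)] for j in range(k)]
    # Hard labels: index of the nearest center for each point.
    labels = [min(range(k), key=lambda j: math.dist(x, centers[j])) for x in points]
    return centers, labels


# Toy two-feature "flows" (made-up values, e.g. scaled packet size and inter-arrival time).
flows = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centers, labels = khm_cluster(flows, 2, init=[(0.2, 0.3), (9.0, 9.5)])
print(labels)
```

Unlike plain k-Means, every point influences every center through a soft weight, which is the usual argument for this family being less sensitive to initialization. Mapping clusters to application classes and computing the evaluation metrics are left out, since the abstract does not name them.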


traffic classification machine learning k-means clustering 





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Meng Zhang (1)
  • Hongli Zhang (1)
  • Bo Zhang (1)
  • Gang Lu (1)
  1. School of Computer and Technology, Harbin Institute of Technology, Harbin, China
