BotTokenizer: Exploring Network Tokens of HTTP-Based Botnet Using Malicious Network Traces

Qi, Biao; Shi, Zhixin; Wang, Yan; Wang, Jizhi; Wang, Qiwen; Jiang, Jianguo

doi:10.1007/978-3-319-75160-3_23

Conference paper
First Online: 04 February 2018

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 10726)

Abstract

Nowadays, malicious software, and botnets in particular, use the HTTP protocol as their command and control (C&C) channel to connect to attackers and control compromised clients. Because HTTP is so widely used and passes easily through firewalls, malicious traffic can blend with legitimate traffic and remain undetected. While network signature-based detection systems and models offer notable advantages, such as high detection efficiency and accuracy, their scalability and automation still need to be improved.

In this work, we present BotTokenizer, a novel network signature-based detection system that aims to detect malicious HTTP C&C traffic. BotTokenizer automatically learns recognizable network tokens from known HTTP C&C communications of different botnet families by using word segmentation techniques. In essence, BotTokenizer implements a coarse-grained network signature generation prototype that relies only on the Uniform Resource Locators (URLs) in HTTP requests. Our evaluation results demonstrate that BotTokenizer identifies HTTP-based botnets well, with acceptable classification error.
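To make the idea of URL-derived network tokens concrete, here is a minimal Python sketch. It is not the paper's method: the abstract does not specify the word segmentation technique, so plain delimiter splitting stands in for it, and the function names `tokenize_url` and `token_counts`, the delimiter set, and the example URLs are all hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed delimiter set; the paper's actual segmentation rules are not given here.
_DELIMS = re.compile(r"[/?&=.;:_\-+%]+")


def tokenize_url(url):
    """Split the path and query of a URL into lowercase tokens.

    The host is ignored, so tokens describe the request itself
    rather than the server it was sent to.
    """
    parts = urlsplit(url)
    text = parts.path
    if parts.query:
        text += "?" + parts.query
    return [tok.lower() for tok in _DELIMS.split(text) if tok]


def token_counts(urls):
    """Count, for each token, how many of the given URLs contain it."""
    counts = Counter()
    for url in urls:
        # Use a set so a token repeated within one URL is counted once.
        counts.update(set(tokenize_url(url)))
    return counts


if __name__ == "__main__":
    # Hypothetical C&C-style requests, made up for illustration only.
    sample = [
        "http://a.example/gate.php?id=1",
        "http://b.example/gate.php?id=2",
    ]
    print(tokenize_url(sample[0]))
    print(token_counts(sample).most_common(3))
```

Tokens that recur across many requests of one family (here `gate`, `php`, and `id`) are the kind of candidate a coarse-grained signature could be built from; how BotTokenizer actually selects and weights them is described in the full paper, not in this abstract.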

Notes

1. Snort. http://www.snort.org/.
2. Suricata. http://suricata-ids.org/.
3. ANUBIS. http://www.assiste.com/Anubis.html/.
4. CWSandbox. http://www.cwsandbox.org/.
5. Alexa. http://www.alexa.com/topsites/.
6. MCFP. https://stratosphereips.org/category/dataset.html.
7. VirusTotal. https://www.virustotal.com/.
8. scikit-learn. http://scikit-learn.org/stable/.

Author information

Authors and Affiliations

Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Biao Qi, Zhixin Shi, Yan Wang, Jizhi Wang, Qiwen Wang & Jianguo Jiang
School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Biao Qi, Yan Wang, Jizhi Wang & Qiwen Wang

Corresponding author

Correspondence to Zhixin Shi.

Editor information

Editors and Affiliations

Xidian University, Xi’an, China
Xiaofeng Chen
SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Dongdai Lin
Columbia University, New York, New York, USA
Moti Yung

About this paper

Cite this paper

Qi, B., Shi, Z., Wang, Y., Wang, J., Wang, Q., Jiang, J. (2018). BotTokenizer: Exploring Network Tokens of HTTP-Based Botnet Using Malicious Network Traces. In: Chen, X., Lin, D., Yung, M. (eds) Information Security and Cryptology. Inscrypt 2017. Lecture Notes in Computer Science, vol 10726. Springer, Cham. https://doi.org/10.1007/978-3-319-75160-3_23

DOI: https://doi.org/10.1007/978-3-319-75160-3_23
Published: 04 February 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75159-7
Online ISBN: 978-3-319-75160-3
eBook Packages: Computer Science, Computer Science (R0)
