FIRMA: Malware Clustering and Network Signature Generation with Mixed Network Behaviors

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 8145)

Abstract

The ever-increasing number of malware families and polymorphic variants creates a pressing need for automatic tools that cluster the collected malware into families and generate behavioral signatures for their detection. Among these, network traffic is a powerful behavioral signature, and network signatures are widely used by network administrators. In this paper we present FIRMA, a tool that, given a large pool of network traffic obtained by executing unlabeled malware binaries, generates a clustering of the malware binaries into families and a set of network signatures for each family. Compared with prior tools, FIRMA produces network signatures for each of the network behaviors of a family, regardless of the type of traffic the malware uses (e.g., HTTP, IRC, SMTP, TCP, UDP). We have implemented FIRMA and evaluated it on two recent datasets comprising nearly 16,000 unique malware binaries. Our results show that FIRMA’s clustering has very high precision (100% on a labeled dataset) and recall (97.7%). We compare FIRMA’s signatures with manually generated ones, showing that they are as good (often better) while being generated in a fraction of the time.
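To make the precision and recall figures above concrete, the following is a minimal sketch of one common way to score a clustering against labeled families. It assumes the definition often used in the malware clustering literature (each produced cluster is credited with its dominant family, each family with its dominant cluster); the abstract does not state which definition FIRMA's evaluation uses. The function name, sample identifiers, and family labels are hypothetical.

```python
from collections import Counter, defaultdict


def clustering_precision_recall(predicted, reference):
    """Return (precision, recall) for two labelings of the same samples.

    predicted: dict mapping sample id -> cluster id produced by a tool
    reference: dict mapping sample id -> ground-truth family label
    Assumed definition (not taken from the abstract):
      precision = (1/n) * sum over clusters of the largest family overlap
      recall    = (1/n) * sum over families of the largest cluster overlap
    """
    samples = predicted.keys() & reference.keys()
    if not samples:
        raise ValueError("no samples in common")

    by_cluster = defaultdict(Counter)  # cluster id -> family label counts
    by_family = defaultdict(Counter)   # family label -> cluster id counts
    for s in samples:
        by_cluster[predicted[s]][reference[s]] += 1
        by_family[reference[s]][predicted[s]] += 1

    n = len(samples)
    precision = sum(max(c.values()) for c in by_cluster.values()) / n
    recall = sum(max(c.values()) for c in by_family.values()) / n
    return precision, recall


# Hypothetical toy data: family "famA" is split across clusters 0 and 1.
predicted = {"a": 0, "b": 0, "c": 1, "d": 2}
reference = {"a": "famA", "b": "famA", "c": "famA", "d": "famB"}
precision, recall = clustering_precision_recall(predicted, reference)
print(precision, recall)
```

Under this definition, splitting one family across several clusters leaves precision untouched but lowers recall, which matches the pattern of perfect precision with slightly lower recall reported in the abstract.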




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rafique, M.Z., Caballero, J. (2013). FIRMA: Malware Clustering and Network Signature Generation with Mixed Network Behaviors. In: Stolfo, S.J., Stavrou, A., Wright, C.V. (eds) Research in Attacks, Intrusions, and Defenses. RAID 2013. Lecture Notes in Computer Science, vol 8145. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41284-4_8

  • DOI: https://doi.org/10.1007/978-3-642-41284-4_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41283-7

  • Online ISBN: 978-3-642-41284-4

  • eBook Packages: Computer Science, Computer Science (R0)
