FIRMA: Malware Clustering and Network Signature Generation with Mixed Network Behaviors

Rafique, M. Zubair; Caballero, Juan

doi:10.1007/978-3-642-41284-4_8

M. Zubair Rafique & Juan Caballero

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 8145)

Abstract

The ever-increasing number of malware families and polymorphic variants creates a pressing need for automatic tools to cluster the collected malware into families and generate behavioral signatures for their detection. Among these, network traffic is a powerful behavioral signature, and network signatures are widely used by network administrators. In this paper we present FIRMA, a tool that, given a large pool of network traffic obtained by executing unlabeled malware binaries, generates a clustering of the malware binaries into families and a set of network signatures for each family. Compared with prior tools, FIRMA produces network signatures for each of the network behaviors of a family, regardless of the type of traffic the malware uses (e.g., HTTP, IRC, SMTP, TCP, UDP). We have implemented FIRMA and evaluated it on two recent datasets comprising nearly 16,000 unique malware binaries. Our results show that FIRMA’s clustering has very high precision (100% on a labeled dataset) and recall (97.7%). We compare FIRMA’s signatures with manually generated ones, showing that they are as good (often better), while being generated in a fraction of the time.
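
To make the precision and recall figures above concrete, the following is a minimal sketch of how clustering precision and recall are commonly computed in the malware clustering literature. The abstract does not state which definition FIRMA's evaluation uses, so the formulas here are an assumption, and the function name, sample ids, and family labels are hypothetical.

```python
from collections import Counter, defaultdict


def clustering_precision_recall(clusters, families):
    """Score a clustering against reference family labels.

    clusters: dict mapping sample id -> cluster id produced by the tool
    families: dict mapping sample id -> reference family label
    Assumed definitions (not taken from the abstract):
      precision = (1/n) * sum over clusters of the largest single-family count
      recall    = (1/n) * sum over families of the largest single-cluster count
    """
    samples = clusters.keys() & families.keys()
    n = len(samples)
    if n == 0:
        raise ValueError("no samples in common")

    by_cluster = defaultdict(Counter)  # cluster id -> family label counts
    by_family = defaultdict(Counter)   # family label -> cluster id counts
    for s in samples:
        by_cluster[clusters[s]][families[s]] += 1
        by_family[families[s]][clusters[s]] += 1

    precision = sum(max(c.values()) for c in by_cluster.values()) / n
    recall = sum(max(c.values()) for c in by_family.values()) / n
    return precision, recall


# Hypothetical toy data: four samples, two reference families.
toy_clusters = {"s1": 0, "s2": 0, "s3": 1, "s4": 2}
toy_families = {"s1": "family_a", "s2": "family_a",
                "s3": "family_a", "s4": "family_b"}
precision, recall = clustering_precision_recall(toy_clusters, toy_families)
print(precision, recall)  # 1.0 0.75
```

In the toy data no cluster mixes families, so precision is 1.0, while family_a is split across two clusters, which lowers recall to 0.75. Under this definition, high precision with slightly lower recall means a tool rarely merges different families but occasionally splits one.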

Author information

Authors and Affiliations

IMDEA Software Institute, Spain
M. Zubair Rafique & Juan Caballero

Editor information

Editors and Affiliations

Department of Computer Science, Columbia University, 1214 Amsterdam Avenue, 10027, NY, USA
Salvatore J. Stolfo
Department of Computer Science, George Mason University, 22030, Fairfax, VA, USA
Angelos Stavrou
Department of Computer Science, Portland State University, 97201, Portland, OR, USA
Charles V. Wright

About this paper

Cite this paper

Rafique, M.Z., Caballero, J. (2013). FIRMA: Malware Clustering and Network Signature Generation with Mixed Network Behaviors. In: Stolfo, S.J., Stavrou, A., Wright, C.V. (eds) Research in Attacks, Intrusions, and Defenses. RAID 2013. Lecture Notes in Computer Science, vol 8145. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41284-4_8

DOI: https://doi.org/10.1007/978-3-642-41284-4_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41283-7
Online ISBN: 978-3-642-41284-4
eBook Packages: Computer Science, Computer Science (R0)
