AMAL: High-Fidelity, Behavior-Based Automated Malware Analysis and Classification

Mohaisen, Aziz; Alrawi, Omar

doi:10.1007/978-3-319-15087-1_9

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 8909)

Included in the following conference series:

International Workshop on Information Security Applications

Abstract

This paper introduces AMAL, an operational, automated, behavior-based malware analysis and labeling (classification and clustering) system that addresses many limitations of existing academic and industrial systems. AMAL consists of two subsystems, AutoMal and MaLabel. AutoMal provides tools to collect low-granularity behavioral artifacts that characterize malware usage of the file system, memory, network, and registry, and does so by running malware samples in virtualized environments. MaLabel uses those artifacts to create representative features, uses them to build classifiers trained on manually vetted samples, and uses those classifiers to classify malware samples into families with similar behavior. MaLabel also enables unsupervised learning by implementing multiple clustering algorithms for grouping samples. An evaluation of both AutoMal and MaLabel on medium-scale (4,000 samples) and large-scale (more than 115,000 samples) datasets, collected and analyzed by AutoMal over 13 months, shows AMAL's effectiveness in accurately characterizing, classifying, and grouping malware samples. MaLabel achieves a precision of 99.5% and a recall of 99.6% when classifying certain families, and more than 98% precision and recall for unsupervised clustering. Several benchmarks, cost estimates, and measurements support the merits and features of AMAL.
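
To make the reported metrics concrete, the following minimal sketch shows one way behavioral artifacts could be turned into a feature vector and how per-family precision and recall are computed from predicted labels. It is an illustration under stated assumptions, not AMAL's implementation: the artifact record layout, the category names, the function names `featurize` and `precision_recall`, and the family label `family_a` are all hypothetical.

```python
from collections import Counter

# Hypothetical artifact categories, loosely following the four areas named in
# the abstract (file system, memory, network, registry). Not AutoMal's schema.
CATEGORIES = ("file", "memory", "network", "registry")


def featurize(artifacts):
    """Map a list of (category, value) artifact records to per-category counts."""
    counts = Counter(category for category, _ in artifacts)
    return [counts[c] for c in CATEGORIES]


def precision_recall(true_labels, predicted_labels, family):
    """Compute precision and recall for one family, treated as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == family and p == family)
    fp = sum(1 for t, p in pairs if t != family and p == family)
    fn = sum(1 for t, p in pairs if t == family and p != family)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy sample: two file-system artifacts and one network artifact.
    sample = [("file", "a.tmp"), ("network", "192.0.2.1"), ("file", "b.dll")]
    print(featurize(sample))

    # Toy labels: manually vetted ground truth vs. classifier output.
    truth = ["family_a", "family_a", "family_a", "other", "other"]
    predicted = ["family_a", "family_a", "other", "family_a", "other"]
    print(precision_recall(truth, predicted, "family_a"))
```

Precision here is the fraction of samples labeled as the family that truly belong to it, and recall is the fraction of the family's samples that the classifier recovers; the abstract's 99.5% and 99.6% figures are of this kind, measured on the authors' vetted datasets.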

Author information

Authors and Affiliations

Verisign Labs, Reston, VA, USA
Aziz Mohaisen
Qatar Computing Research Institute, Doha, Qatar
Omar Alrawi

Corresponding author

Correspondence to Aziz Mohaisen.

Editor information

Editors and Affiliations

Pukyong National University, Busan, Korea, Republic of (South Korea)
Kyung-Hyune Rhee
School of Computer Science and Engineering, Soongsil University, Seoul, Korea, Republic of (South Korea)
Jeong Hyun Yi

About this paper

Cite this paper

Mohaisen, A., Alrawi, O. (2015). AMAL: High-Fidelity, Behavior-Based Automated Malware Analysis and Classification. In: Rhee, KH., Yi, J. (eds) Information Security Applications. WISA 2014. Lecture Notes in Computer Science, vol 8909. Springer, Cham. https://doi.org/10.1007/978-3-319-15087-1_9

DOI: https://doi.org/10.1007/978-3-319-15087-1_9
Published: 22 January 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-15086-4
Online ISBN: 978-3-319-15087-1
eBook Packages: Computer Science, Computer Science (R0)
