Analyzing File-to-File Relation Network in Malware Detection

Chen, Lingwei; Hardy, William; Ye, Yanfang; Li, Tao

doi:10.1007/978-3-319-26190-4_28

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9418)

Included in the following conference series:

International Conference on Web Information Systems Engineering

Abstract

Because malware poses a major threat to Internet security, malware detection is of great interest to both the anti-malware industry and researchers. Features beyond file content, such as file-to-file relations, are starting to be leveraged for malware detection and provide valuable insight into the properties of file samples. However, much remains to be understood about the relationships between malware and benign files. In this paper, based on the file-to-file relation network, we design several new and robust graph-based features for malware detection and reveal the network's relationship characteristics. Based on the designed features and two findings, we first apply the Malicious Score Inference Algorithm (MSIA) to select representative samples from the large unknown file collection for labeling, and then use the Belief Propagation (BP) algorithm to detect malware. To the best of our knowledge, this is the first investigation of the relationship characteristics of the file-to-file relation network in malware detection using social network analysis. A comprehensive experimental study on a large collection of file sample relations, obtained from the clients of the anti-malware software of Comodo Security Solutions, Inc., compares various malware detection approaches. The promising experimental results demonstrate that our proposed methods outperform alternative data-mining-based detection techniques in both accuracy and efficiency.
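
To make the detection step more concrete, the following is a minimal sketch of loopy belief propagation on a toy file-to-file relation graph. It is not the authors' implementation: the homophily strength `eps`, the prior values, the file names, and the function name `belief_propagation` are all assumptions made for this example, since the abstract does not specify the potentials or the graph construction.

```python
from collections import defaultdict


def belief_propagation(edges, priors, eps=0.1, default_prior=0.5, iterations=20):
    """Return an estimated P(malicious) for every file in an undirected graph.

    edges: iterable of (file_a, file_b) relations (hypothetical input format).
    priors: dict mapping a labeled file to its prior P(malicious).
    eps: assumed homophily strength; related files share a label
         with probability 0.5 + eps (requires 0 <= eps < 0.5).
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Node potential: (P(benign), P(malicious)); unlabeled files get default_prior.
    phi = {}
    for node in neighbors:
        p = priors.get(node, default_prior)
        phi[node] = (1.0 - p, p)

    # Edge potential psi[x_i][x_j]: higher when both endpoints share a state.
    psi = ((0.5 + eps, 0.5 - eps), (0.5 - eps, 0.5 + eps))

    # messages[(i, j)] is the message from i to j over the two states of j.
    messages = {(i, j): (0.5, 0.5) for i in neighbors for j in neighbors[i]}

    for _ in range(iterations):
        updated = {}
        for (i, j) in messages:
            # Combine i's prior with all incoming messages except the one from j.
            partial = list(phi[i])
            for k in neighbors[i]:
                if k != j:
                    incoming = messages[(k, i)]
                    partial[0] *= incoming[0]
                    partial[1] *= incoming[1]
            # Marginalize over the states of i to get a message over states of j.
            out = [sum(partial[xi] * psi[xi][xj] for xi in (0, 1)) for xj in (0, 1)]
            total = out[0] + out[1]
            updated[(i, j)] = (out[0] / total, out[1] / total)
        messages = updated

    # Final belief: prior times all incoming messages, normalized.
    beliefs = {}
    for node in neighbors:
        b = list(phi[node])
        for k in neighbors[node]:
            incoming = messages[(k, node)]
            b[0] *= incoming[0]
            b[1] *= incoming[1]
        beliefs[node] = b[1] / (b[0] + b[1])
    return beliefs


# Toy chain: known malware - unknown_1 - unknown_2 - known benign file.
toy_edges = [("malware.exe", "unknown_1"),
             ("unknown_1", "unknown_2"),
             ("unknown_2", "benign.exe")]
toy_priors = {"malware.exe": 0.99, "benign.exe": 0.01}
scores = belief_propagation(toy_edges, toy_priors)
for name in sorted(scores):
    print(name, round(scores[name], 3))
```

In this toy chain, the unknown file adjacent to the labeled malware ends up with a malicious score above 0.5 and the one adjacent to the labeled benign file ends up below 0.5. This is the "guilt by association" intuition behind propagating labels over a relation network.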

Acknowledgments

The authors would like to thank the anti-malware experts of Comodo Security Lab for the data collection, as well as for the helpful discussions and support.

Author information

Authors and Affiliations

Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV, 26506, USA
Lingwei Chen, William Hardy & Yanfang Ye
School of Computer Science, Florida International University, Miami, FL, 33199, USA
Tao Li

Corresponding author

Correspondence to Yanfang Ye.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Jianyong Wang
Poznan University of Economics, Poznan, Poland
Wojciech Cellary
Florida Atlantic University, Boca Raton, Florida, USA
Dingding Wang
Victoria University, Melbourne, Australia
Hua Wang
School of Computing & Information, Florida International University, Miami, Florida, USA
Shu-Ching Chen
Florida International University, Miami, Florida, USA
Tao Li
Victoria University, Melbourne, Victoria, Australia
Yanchun Zhang

About this paper

Cite this paper

Chen, L., Hardy, W., Ye, Y., Li, T. (2015). Analyzing File-to-File Relation Network in Malware Detection. In: Wang, J., et al. Web Information Systems Engineering – WISE 2015. WISE 2015. Lecture Notes in Computer Science, vol 9418. Springer, Cham. https://doi.org/10.1007/978-3-319-26190-4_28

DOI: https://doi.org/10.1007/978-3-319-26190-4_28
Published: 25 December 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26189-8
Online ISBN: 978-3-319-26190-4
eBook Packages: Computer Science, Computer Science (R0)
