Abstract
Program binaries typically contain a significant number of library functions taken from standard libraries or free open-source software packages. Automatically identifying such library functions not only enhances the quality and efficiency of threat analysis and reverse engineering tasks, but also improves their accuracy by avoiding false correlations between unrelated code bases. Furthermore, such automation has a strong positive impact on other applications such as clone detection, function fingerprinting, authorship attribution, vulnerability analysis, and malware analysis.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Alrabaee, S. et al. (2020). Library Function Identification. In: Binary Code Fingerprinting for Cybersecurity. Advances in Information Security, vol 78. Springer, Cham. https://doi.org/10.1007/978-3-030-34238-8_4
DOI: https://doi.org/10.1007/978-3-030-34238-8_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34237-1
Online ISBN: 978-3-030-34238-8
eBook Packages: Computer Science (R0)