MobileFindr: Function Similarity Identification for Reversing Mobile Binaries

  • Yibin Liao
  • Ruoyan Cai
  • Guodong Zhu
  • Yue Yin
  • Kang Li
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11098)

Abstract

Function-level binary code identification has been applied to a broad range of software security and reverse engineering tasks, including patch analysis, vulnerability assessment, code plagiarism detection, and malware analysis. However, the anti-reverse-engineering techniques employed by mobile apps (e.g., obfuscation and anti-emulation) make existing approaches ineffective at function identification. In this paper, we propose MobileFindr, an on-device, trace-based function similarity identification framework for the mobile platform. MobileFindr runs on real mobile devices and mitigates many prevalent anti-reversing techniques by extracting function execution behaviors via dynamic instrumentation, characterizing functions by the collected behaviors, and matching functions via distance calculation. We evaluated MobileFindr on real-world, top-ranked mobile frameworks and applications. The experimental results show that MobileFindr outperforms existing state-of-the-art tools in both obfuscation resilience and accuracy.
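
The matching step named in the abstract (functions characterized by collected behaviors, then compared by distance) can be illustrated with a minimal sketch. The abstract does not specify the feature set or the distance metric, so everything here is an assumption for illustration: the four-element behavior vectors, the function names, and the choice of Euclidean distance are hypothetical and are not taken from MobileFindr itself.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors.

    Assumption: the metric actually used by the paper is not given in
    the abstract; Euclidean distance is only a stand-in.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_matches(query, candidates):
    """Return candidate function names sorted from closest to farthest."""
    return sorted(candidates, key=lambda name: distance(query, candidates[name]))


# Hypothetical behavior vectors gathered from execution traces:
# (system calls, library calls, memory reads, memory writes)
reference = {
    "md5_update": (0, 2, 64, 16),
    "parse_header": (1, 5, 20, 4),
    "send_packet": (3, 1, 8, 2),
}

# Hypothetical trace of an unknown (e.g. obfuscated) function.
unknown_trace = (0, 2, 66, 17)

ranking = rank_matches(unknown_trace, reference)
print(ranking[0])  # closest reference function by behavior
```

The idea behind the sketch is that runtime behavior tends to survive transformations that change the instruction sequence, so a function whose code looks different can still land near its reference counterpart in feature space.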

Keywords

  • Reverse engineering
  • Similarity identification
  • Dynamic instrumentation

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. University of Georgia, Athens, USA
