Journal of Computer Science and Technology, Volume 34, Issue 2, pp 437–455

On Identifying and Explaining Similarities in Android Apps

  • Li Li (Email author)
  • Tegawendé F. Bissyandé
  • Hao-Yu Wang
  • Jacques Klein
Regular Paper

Abstract

App updates and repackaging are recurrent in the Android ecosystem, filling markets with similar apps that must be identified. Although several approaches improve the scalability of detecting repackaged or cloned apps, researchers and practitioners eventually need a comprehensive pairwise comparison (or a simultaneous comparison of multiple apps) to understand and validate the similarities among apps. In this work, we present the design and implementation of SimiDroid, our research prototype tool for multi-level similarity comparison of Android apps. SimiDroid is built to support the comprehension of similarities and changes among app versions and among repackaged apps. In particular, we demonstrate the need for and usefulness of such a framework through case studies that implement different dissection scenarios, revealing various insights into how repackaged apps are built. We further show that the similarity comparison plugins implemented in SimiDroid yield more accurate results than the state of the art.
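
To make the idea of a pairwise comparison concrete, the following is a minimal sketch and not SimiDroid's actual implementation. It assumes each app has already been reduced to a map from a feature key (for example, a method signature) to a digest of that feature's content. The function name `compare_apps`, the four categories, and the Jaccard-style score are all illustrative assumptions.

```python
from typing import Dict


def compare_apps(app_a: Dict[str, str], app_b: Dict[str, str]) -> Dict[str, object]:
    """Compare two apps given as {feature_key: content_digest} maps.

    Hypothetical helper: the categories and the score below are
    assumptions made for illustration, not the tool's definitions.
    """
    keys_a, keys_b = set(app_a), set(app_b)
    shared = keys_a & keys_b

    # Same key and same content in both apps.
    identical = {k for k in shared if app_a[k] == app_b[k]}
    # Same key but different content (a modified feature).
    similar = shared - identical
    # Present only in the first app.
    deleted = keys_a - keys_b
    # Present only in the second app.
    new = keys_b - keys_a

    total = len(keys_a | keys_b)
    # Jaccard-style ratio of unchanged features; two empty apps count as equal.
    score = len(identical) / total if total else 1.0

    return {
        "identical": identical,
        "similar": similar,
        "deleted": deleted,
        "new": new,
        "score": score,
    }


# Usage: two toy "apps" with made-up signatures and digests.
original = {"m1": "h1", "m2": "h2", "m3": "h3"}
repackaged = {"m1": "h1", "m2": "hX", "m4": "h4"}
report = compare_apps(original, repackaged)
```

Beyond the single score, the `similar`, `deleted`, and `new` sets are what would let an analyst explain a similarity, for instance by inspecting which features a repackaged app changed or added.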

Keywords

Android, similarity analysis, app clone

Supplementary material

ESM 1: 11390_2019_1918_MOESM1_ESM.pdf (PDF, 708 kb)

Copyright information

© Springer Science+Business Media, LLC & Science Press, China 2019

Authors and Affiliations

  • Li Li (1), Email author
  • Tegawendé F. Bissyandé (2)
  • Hao-Yu Wang (3)
  • Jacques Klein (2)
  1. Faculty of Information Technology, Monash University, Melbourne, Australia
  2. Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg, Luxembourg
  3. School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China