Code Between the Lines: Semantic Analysis of Android Applications

  • Conference paper
ICT Systems Security and Privacy Protection (SEC 2020)

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 580)

Abstract

Static and dynamic program analysis are the key concepts researchers apply to uncover security-critical implementation weaknesses in Android applications. As it is often not obvious in which context problematic statements occur, it is challenging to assess their practical impact. While some flaws may turn out to be bad practice but not undermine the overall security level, others could have a serious impact. Distinguishing them requires knowledge of the designated app purpose.

In this paper, we introduce a machine learning-based system that is capable of generating natural language text describing the purpose and core functionality of Android apps based on their actual code. We design a dense neural network that captures the semantic relationships of resource identifiers, string constants, and API calls contained in apps to derive a high-level picture of implemented program behavior. For arbitrary applications, our system can predict precise, human-readable keywords and short phrases that indicate the main use-cases apps are designed for.
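The idea of mapping code-derived features to keywords can be illustrated with a minimal sketch. It is not the authors' model: the feature vocabulary, the keyword list, the layer sizes, and the hand-picked weights below are all hypothetical, chosen only to show how a multi-hot vector of resource identifiers, string constants, and API calls could pass through fully connected layers to yield per-keyword scores.

```python
import math

# Hypothetical toy vocabulary of code-derived features (not taken from the paper).
FEATURES = [
    "api:LocationManager.requestLocationUpdates",
    "res:map_view",
    "str:route",
    "api:Camera.takePicture",
    "res:btn_shutter",
]
# Hypothetical output keywords, one score per keyword.
KEYWORDS = ["navigation", "photo"]


def encode(tokens):
    """Multi-hot vector: 1.0 where the app contains the feature, else 0.0."""
    present = set(tokens)
    return [1.0 if feature in present else 0.0 for feature in FEATURES]


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(x, weights, bias, activation):
    """One fully connected layer: activation(W x + b), one row of W per unit."""
    return [
        activation(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


# Hand-picked weights for illustration only; a real model would learn them.
W1 = [[1.0, 1.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 0.0, 1.0, 1.0]]
B1 = [0.0, 0.0]
W2 = [[2.0, -1.0],
      [-1.0, 2.0]]
B2 = [-2.0, -2.0]


def predict_keywords(tokens, threshold=0.5):
    """Return the keywords whose sigmoid score exceeds the threshold."""
    hidden = dense(encode(tokens), W1, B1, relu)
    scores = dense(hidden, W2, B2, sigmoid)
    return [kw for kw, score in zip(KEYWORDS, scores) if score > threshold]
```

Calling `predict_keywords(["res:map_view", "str:route", "api:LocationManager.requestLocationUpdates"])` selects only the `navigation` keyword under these toy weights. A sigmoid output per keyword is used here because an app may match several keywords at once.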

We evaluate our solution on 67,040 real-world apps and find that, with a precision between 69% and 84%, we can identify keywords that also occur in the developer-provided description in Google Play. To avoid incomprehensible black-box predictions, we apply a model-explaining algorithm and demonstrate that our technique can substantially augment inspections of Android apps by contributing contextual information.
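As a rough illustration of the precision figure, the sketch below computes the share of predicted keywords that also appear in an app description. The tokenization and the helper names `description_words` and `keyword_precision` are assumptions made for this example; the paper's actual evaluation procedure may differ, for instance in how it handles phrases or word stems.

```python
import re


def description_words(text):
    """Lowercased alphabetic words of a description (a simplifying assumption)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_precision(predicted, description):
    """Fraction of predicted single-word keywords found in the description."""
    if not predicted:
        return 0.0  # convention chosen for this sketch when nothing is predicted
    words = description_words(description)
    hits = sum(1 for keyword in predicted if keyword.lower() in words)
    return hits / len(predicted)


# Hypothetical example data, not taken from the evaluated app set.
predicted = ["navigation", "maps", "camera", "route"]
description = "Offline maps and turn-by-turn navigation with live route updates."
precision = keyword_precision(predicted, description)
```

Here three of the four predicted keywords occur in the description, so `precision` is 0.75. Multi-word phrases would need a different matching rule than this per-word lookup.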

Notes

  1. Our implementation is available at: https://github.com/sg10/apk-verbalizer.

Author information

Corresponding author

Correspondence to Johannes Feichtner.

Copyright information

© 2020 IFIP International Federation for Information Processing

About this paper

Cite this paper

Feichtner, J., Gruber, S. (2020). Code Between the Lines: Semantic Analysis of Android Applications. In: Hölbl, M., Rannenberg, K., Welzer, T. (eds) ICT Systems Security and Privacy Protection. SEC 2020. IFIP Advances in Information and Communication Technology, vol 580. Springer, Cham. https://doi.org/10.1007/978-3-030-58201-2_12

  • DOI: https://doi.org/10.1007/978-3-030-58201-2_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58200-5

  • Online ISBN: 978-3-030-58201-2

  • eBook Packages: Computer Science, Computer Science (R0)
