Code Between the Lines: Semantic Analysis of Android Applications

Feichtner, Johannes; Gruber, Stefan

doi:10.1007/978-3-030-58201-2_12


Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 580)

Included in the following conference series:

IFIP International Conference on ICT Systems Security and Privacy Protection


Abstract

Static and dynamic program analysis are the key concepts researchers apply to uncover security-critical implementation weaknesses in Android applications. As it is often not obvious in which context problematic statements occur, it is challenging to assess their practical impact. While some flaws may turn out to be bad practice but not undermine the overall security level, others could have a serious impact. Distinguishing them requires knowledge of the designated app purpose.

In this paper, we introduce a machine learning-based system that is capable of generating natural language text describing the purpose and core functionality of Android apps based on their actual code. We design a dense neural network that captures the semantic relationships of resource identifiers, string constants, and API calls contained in apps to derive a high-level picture of implemented program behavior. For arbitrary applications, our system can predict precise, human-readable keywords and short phrases that indicate the main use-cases apps are designed for.
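The mapping from code-derived tokens to keywords can be illustrated with a toy forward pass through a small dense network. This is only a minimal sketch and not the authors' model: the vocabulary, the keyword list, the layer sizes, and the hand-picked weights below are all hypothetical and chosen purely for illustration, and the function names (`encode`, `dense`, `predict_keywords`) are assumptions.

```python
import math

# Hypothetical vocabulary of tokens extracted from an app:
# resource identifiers, string constants, and API calls.
VOCAB = [
    "btn_take_photo",
    "android.hardware.Camera.open",
    "label_send_message",
    "android.telephony.SmsManager.sendTextMessage",
]

# Hypothetical output keywords describing app purpose.
KEYWORDS = ["camera", "messaging"]


def encode(tokens):
    """Turn a collection of tokens into a multi-hot vector over VOCAB."""
    present = set(tokens)
    return [1.0 if t in present else 0.0 for t in VOCAB]


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(x, weights, bias, activation):
    """One fully connected layer: activation(W x + b)."""
    return [
        activation(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


# Illustrative, hand-picked weights (NOT trained values).
W_HIDDEN = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
B_HIDDEN = [0.0, 0.0]
W_OUT = [[2.0, -2.0], [-2.0, 2.0]]
B_OUT = [-1.0, -1.0]


def predict_keywords(tokens, threshold=0.5):
    """Return the keywords whose sigmoid score reaches the threshold."""
    hidden = dense(encode(tokens), W_HIDDEN, B_HIDDEN, relu)
    scores = dense(hidden, W_OUT, B_OUT, sigmoid)
    return [kw for kw, s in zip(KEYWORDS, scores) if s >= threshold]
```

In a real system the weights would be learned from many apps and their descriptions; here they only demonstrate how co-occurring tokens can push one keyword score up and another down.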

We evaluate our solution on 67,040 real-world apps and find that with a precision between 69% and 84% we can identify keywords that also occur in the developer-provided description in Google Play. To avoid incomprehensible black box predictions, we apply a model explaining algorithm and demonstrate that our technique can substantially augment inspections of Android apps by contributing contextual information.
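The precision figure above can be read as the share of predicted keywords that also appear in the app's store description. The sketch below shows one plausible way to compute that per app; the exact matching procedure used in the paper (for example stemming or phrase handling) is not stated in the abstract, so the simple lowercase word match and the function names here are assumptions.

```python
import re


def tokenize(text):
    """Lowercase a description and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_precision(predicted, description):
    """Fraction of predicted keywords that occur in the description.

    Returns 0.0 when no keywords were predicted (an assumed convention).
    """
    if not predicted:
        return 0.0
    description_words = tokenize(description)
    hits = sum(1 for kw in predicted if kw.lower() in description_words)
    return hits / len(predicted)


# Hypothetical example app: three of four predictions appear in the text.
example_precision = keyword_precision(
    ["camera", "photo", "weather", "filter"],
    "Take a photo with the camera and apply a filter.",
)
```

Averaging this value over all evaluated apps would give a dataset-level precision; whether the paper averages per app or pools all keywords is not specified here.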


Notes

1. Our implementation is available at: https://github.com/sg10/apk-verbalizer.


Author information

Authors and Affiliations

Institute of Applied Information Processing and Communications (IAIK), Graz University of Technology, Inffeldgasse 16a, 8010, Graz, Austria
Johannes Feichtner & Stefan Gruber
Secure Information Technology Center – Austria (A-SIT), Seidlgasse 22, 1030, Vienna, Austria
Johannes Feichtner


Corresponding author

Correspondence to Johannes Feichtner.

Editor information

Editors and Affiliations

University of Maribor, Maribor, Slovenia
Marko Hölbl
Goethe University Frankfurt, Frankfurt, Germany
Kai Rannenberg
University of Maribor, Maribor, Slovenia
Tatjana Welzer


Cite this paper

Feichtner, J., Gruber, S. (2020). Code Between the Lines: Semantic Analysis of Android Applications. In: Hölbl, M., Rannenberg, K., Welzer, T. (eds) ICT Systems Security and Privacy Protection. SEC 2020. IFIP Advances in Information and Communication Technology, vol 580. Springer, Cham. https://doi.org/10.1007/978-3-030-58201-2_12


DOI: https://doi.org/10.1007/978-3-030-58201-2_12
Published: 14 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58200-5
Online ISBN: 978-3-030-58201-2
eBook Packages: Computer Science, Computer Science (R0)
