Annotation-Based Static Analysis for Personal Data Protection

  • Kalle HjerppeEmail author
  • Jukka Ruohonen
  • Ville Leppänen
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 576)


This paper elaborates the use of static source code analysis in the context of data protection. The topic is important for software engineering in order for software developers to improve the protection of personal data during software development. To this end, the paper proposes a design of annotating classes and functions that process personal data. The design serves two primary purposes: on one hand, it provides means for software developers to document their intent; on the other hand, it furnishes tools for automatic detection of potential violations. This dual rationale facilitates compliance with the General Data Protection Regulation (GDPR) and other emerging data protection and privacy regulations. In addition to a brief review of the state-of-the-art of static analysis in the data protection context and the design of the proposed analysis method, a concrete tool is presented to demonstrate a practical implementation for the Java programming language.


Data protection Privacy Static analysis Java GDPR 


  1. 1.
    Alzaidi, A., Alshehri, S., Buhari, S.M.: DroidRista: a highly precise static data flow analysis framework for android applications. Int. J. Inf. Secur. 1, 1–14 (2019)Google Scholar
  2. 2.
    Arzt, S., et al.: Flowdroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps. ACM SIGPLAN Not. 49(6), 259–269 (2014)CrossRefGoogle Scholar
  3. 3.
    Ayewah, N., Pugh, W., Hovemeyer, D., Morgenthaler, J.D., Penix, J.: Using static analysis to find bugs. IEEE Softw. 25(5), 22–29 (2008). Scholar
  4. 4.
    Berghel, H.: Equifax and the latest round of identity theft roulette. Computer 50(12), 72–76 (2017). Scholar
  5. 5.
    Boehm, B.W., Brown, J.R., Lipow, M.: Quantitative evaluation of software quality. In: Proceedings of the 2nd International Conference on Software Engineering, pp. 592–605. IEEE Computer Society Press (1976)Google Scholar
  6. 6.
    Calcagno, C., et al.: Moving fast with software verification. In: Havelund, K., Holzmann, G., Joshi, R. (eds.) NFM 2015. LNCS, vol. 9058, pp. 3–11. Springer, Cham (2015). Scholar
  7. 7.
    Calcagno, C., Distefano, D., O’Hearn, P., Yang, H.: Compositional shape analysis by means of bi-abduction. In: ACM SIGPLAN Notices, vol. 44, pp. 289–300. ACM (2009)CrossRefGoogle Scholar
  8. 8.
    Chatzieleftheriou, G., Katsaros, P.: Test-driving static analysis tools in search of C code vulnerabilities. In: 2011 IEEE 35th Annual Computer Software and Applications Conference Workshops, pp. 96–103. IEEE (2011)Google Scholar
  9. 9.
    Chess, B., McGraw, G.: Static analysis for security. IEEE Secur. Priv. 2(6), 76–79 (2004). Scholar
  10. 10.
    Chowdhury, I., Chan, B., Zulkernine, M.: Security metrics for source code structures. In: Proceedings of the Fourth International Workshop on Software Engineering for Secure Systems, pp. 57–64. ACM (2008)Google Scholar
  11. 11.
    Cousot, P., Cousot, R.: Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: Proceedings of the 4th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, pp. 238–252. ACM (1977)Google Scholar
  12. 12.
    Cousot, P., Cousot, R.: Modular static program analysis. In: Horspool, R.N. (ed.) CC 2002. LNCS, vol. 2304, pp. 159–179. Springer, Heidelberg (2002). Scholar
  13. 13.
    Danezis, G., et al.: Privacy and data protection by design-from policy to engineering. arXiv preprint arXiv:1501.03726 (2015)
  14. 14.
    Distefano, D., Fähndrich, M., Logozzo, F., O’Hearn, P.W.: Scaling static analyses at Facebook. Commun. ACM 62(8), 62–70 (2019)CrossRefGoogle Scholar
  15. 15.
    Enck, W., et al.: TaintDroid: an information-flow tracking system for realtime privacy monitoring on smartphones. ACM Trans. Comput. Syst. (TOCS) 32(2), 5 (2014)CrossRefGoogle Scholar
  16. 16.
    Evans, D., Larochelle, D.: Improving security using extensible lightweight static analysis. IEEE Softw. 19(1), 42–51 (2002)CrossRefGoogle Scholar
  17. 17.
    Ferrara, P., Olivieri, L., Spoto, F.: Tailoring taint analysis to GDPR. In: Medina, M., Mitrakas, A., Rannenberg, K., Schweighofer, E., Tsouroulas, N. (eds.) APF 2018. LNCS, vol. 11079, pp. 63–76. Springer, Cham (2018). Scholar
  18. 18.
    Ferrara, P., Spoto, F.: Static analysis for GDPR compliance. In: ITASEC 2018 (2018).
  19. 19.
    Ferrara, P., Tripp, O., Pistoia, M.: MorphDroid: fine-grained privacy verification. In: Proceedings of the 31st Annual Computer Security Applications Conference, pp. 371–380. ACM (2015)Google Scholar
  20. 20.
    Fuchs, A.P., Chaudhuri, A., Foster, J.S.: Scandroid: automated security certification of android. ACM, Technical report (2009)Google Scholar
  21. 21.
    Gerl, A., Bennani, N., Kosch, H., Brunie, L.: LPL, towards a GDPR-compliant privacy language: formal definition and usage. In: Hameurlain, A., Wagner, R. (eds.) Transactions on Large-Scale Data- and Knowledge-Centered Systems XXXVII. LNCS, vol. 10940, pp. 41–80. Springer, Heidelberg (2018). Scholar
  22. 22.
    Hjerppe, K., Ruohonen, J., Leppänen, V.: The general data protection regulation: requirements, architectures, and constraints. In: 2019 IEEE 27th International Requirements Engineering Conference (RE), p. (to appear). IEEE (2019).
  23. 23.
    Holvitie, J., Leppänen, V.: DebtFlag: technical debt management with a development environment integrated tool. In: Proceedings of the 4th International Workshop on Managing Technical Debt, pp. 20–27. IEEE Press (2013)Google Scholar
  24. 24.
    Lin, J., Liu, B., Sadeh, N., Hong, J.I.: Modeling users’ mobile app privacy preferences: restoring usability in a sea of permission settings. In: 10th Symposium On Usable Privacy and Security (SOUPS 2014), pp. 199–212. USENIX Association, Menlo Park (2014)Google Scholar
  25. 25.
    Louridas, P.: Static code analysis. IEEE Softw. 23(4), 58–61 (2006). Scholar
  26. 26.
    Mäkelä, S., Leppänen, V.: Client-based cohesion metrics for Java programs. Sci. Comput. Program. 74(5–6), 355–378 (2009)MathSciNetCrossRefGoogle Scholar
  27. 27.
    Myers, A.C., Liskov, B.: Protecting privacy using the decentralized label model. ACM Trans. Softw. Eng. Methodol. (TOSEM) 9(4), 410–442 (2000)CrossRefGoogle Scholar
  28. 28.
    Nielson, F., Nielson, H.R., Hankin, C.: Principles of program analysis. Springer, Heidelberg (2015). Scholar
  29. 29.
    Palsberg, J., Jay, C.B.: The essence of the visitor pattern. In: Proceedings of The Twenty-Second Annual International Computer Software and Applications Conference (Compsac 1998) (Cat. No.98CB 36241), pp. 9–15, August 1998Google Scholar
  30. 30.
    Pfretzschner, B., ben Othmane, L.: Identification of dependency-based attacks on Node.js. In: Proceedings of the 12th International Conference on Availability, Reliability and Security, ARES 2017, pp. 68:1–68:6. ACM (2017)Google Scholar
  31. 31.
    Ramezanifarkhani, T., Owe, O., Tokas, S.: A secrecy-preserving language for distributed and object-oriented systems. J. Log. Algebr. Methods Program. 99, 1–25 (2018)MathSciNetCrossRefGoogle Scholar
  32. 32.
    Rocha, H., Valente, M.T.: How annotations are used in Java: an empirical study. In: SEKE, pp. 426–431 (2011)Google Scholar
  33. 33.
    Romano, D., Pinzger, M.: Using source code metrics to predict change-prone Java interfaces. In: 2011 27th IEEE International Conference on Software Maintenance (ICSM), pp. 303–312. IEEE (2011)Google Scholar
  34. 34.
    Ruohonen, J.: An empirical analysis of vulnerabilities in Python packages for web applications. In: Proceedings of the 9th International Workshop on Empirical Software Engineering in Practice (IWESEP 2018), pp. 25–30. IEEE, Nara (2018)Google Scholar
  35. 35.
    Sadeghi, A., Bagheri, H., Garcia, J., Malek, S.: A taxonomy and qualitative comparison of program analysis techniques for security assessment of android software. IEEE Trans. Softw. Eng. 43(6), 492–530 (2016)CrossRefGoogle Scholar
  36. 36.
    Schneider, G.: Is privacy by construction possible? In: Margaria, T., Steffen, B. (eds.) ISoLA 2018. LNCS, vol. 11244, pp. 471–485. Springer, Cham (2018). Scholar
  37. 37.
    The European Union: Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the Protection of Natural Persons with Regard to the Processing of Personal Data and on the Free Movement of Such Data, and Repealing Directive 95/46/EC (General Data Protection Regulation) (2016)Google Scholar
  38. 38.
    Wang, X., Continella, A., Yang, Y., He, Y., Zhu, S.: LeakDoctor: toward automatically diagnosing privacy leaks in mobile applications. Proc. ACM Interact. Mob. Wearable Ubiquit. Technol. 3(1), 28 (2019)Google Scholar
  39. 39.
    Yang, Z., Yang, M., Zhang, Y., Gu, G., Ning, P., Wang, X.S.: AppIntent: analyzing sensitive data transmission in android for privacy leakage detection. In: Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications Security, pp. 1043–1054. ACM (2013)Google Scholar
  40. 40.
    Zimmermann, M., Staicu, C., Tenny, C., Pradel, M.: Small world with high risks: a study of security threats in the NPM ecosystem. In: Proceedings of the 28th USENIX Security Symposium. USENIX, Santa Clara (2019)Google Scholar

Copyright information

© IFIP International Federation for Information Processing 2020

Authors and Affiliations

  1. 1.Geniem OyTampereFinland
  2. 2.University of TurkuTurkuFinland

Personalised recommendations