Skip to main content

\(\mathsf {BackFlow}\): Backward Context-Sensitive Flow Reconstruction of Taint Analysis Results

  • Conference paper
  • First Online:
Book cover Verification, Model Checking, and Abstract Interpretation (VMCAI 2020)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 11990))

Abstract

Taint analysis detects if data coming from a source, such as user input, flows into a sink, such as an SQL query, unsanitized (not properly escaped). Both static and dynamic taint analyses have been widely applied to detect injection vulnerabilities in real world software. A main drawback of static analysis is that it could produce false alarms. In addition, it is extremely time-consuming to manually explain the flow of tainted data from the results of the analysis, to understand why a specific warning was raised. This paper formalizes \(\mathsf {BackFlow}\), a context-sensitive taint flow reconstructor that, starting from the results of a taint-analysis engine, reconstructs how tainted data flows inside the program and builds paths connecting sources to sinks. \(\mathsf {BackFlow}\) has been implemented on Julia’s static taint analysis. Experimental results on a set of standard benchmarks show that, when \(\mathsf {BackFlow}\) produces a taint graph for an injection warning, then there is empirical evidence that such warning is a true alarm. Moreover \(\mathsf {BackFlow}\) scales to real world programs.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 69.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 89.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    https://en.wikipedia.org/wiki/Equifax#May%E2%80%93July_2017_data_breach.

  2. 2.

    https://techcrunch.com/2016/10/10/hackers-release-source-code-for-a-powerful-ddos-app-called-mirai/.

  3. 3.

    Note that this is an empirical result, since theoretically \(\mathsf {BackFlow}\) might produce taint graphs for false alarms, and fail to produce taint graphs for true alarms.

  4. 4.

    https://jgrapht.org/.

  5. 5.

    The user manual can be retrieved at https://static.juliasoft.com/docs/latest/pdf/EclipsePluginUserGuide.pdf.

  6. 6.

    https://www.owasp.org/index.php/Category:OWASP_WebGoat_Project.

  7. 7.

    We were not able to find a distribution of \(\mathtt {jugjobs}\) with a version number.

References

  1. Andersen, L.: Program analysis and specialization for the C programming language. Ph.D. thesis, University of Copenhagen (1994)

    Google Scholar 

  2. Arkin, B., Stender, S., McGraw, G.: Software penetration testing. IEEE Secur. Priv. 3(1), 84–87 (2005)

    Article  Google Scholar 

  3. Arzt, S., et al.: FlowDroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps. In: Proceedings of PLDI 2014. ACM (2014)

    Google Scholar 

  4. Balzarotti, D., et al.: Saner: composing static and dynamic analysis to validate sanitization in web applications. In: Proceedings of S&P 2008. IEEE (2008)

    Google Scholar 

  5. Barbon, G., Cortesi, A., Ferrara, P., Pistoia, M., Tripp, O.: Privacy analysis of android apps: implicit flows and quantitative analysis. In: Saeed, K., Homenda, W. (eds.) CISIM 2015. LNCS, vol. 9339, pp. 3–23. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24369-6_1

    Chapter  Google Scholar 

  6. Bryant, R.: Symbolic Boolean manipulation with ordered binary-decision diagrams. ACM Comput. Surv. 24(3), 293–318 (1992)

    Article  MathSciNet  Google Scholar 

  7. Buro, S., Mastroeni, I.: Abstract code injection. In: Dillig, I., Palsberg, J. (eds.) Verification, Model Checking, and Abstract Interpretation. LNCS, vol. 10747, pp. 116–137. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-73721-8_6

    Chapter  Google Scholar 

  8. Clause, J., Li, W., Orso, A.: Dytan: a generic dynamic taint analysis framework. In: Proceedings of ISSTA 2007. ACM (2007)

    Google Scholar 

  9. Cortesi, A., Ferrara, P., Pistoia, M., Tripp, O.: Datacentric semantics for verification of privacy policy compliance by mobile applications. In: D’Souza, D., Lal, A., Larsen, K.G. (eds.) VMCAI 2015. LNCS, vol. 8931, pp. 61–79. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46081-8_4

    Chapter  MATH  Google Scholar 

  10. Costantini, G., Ferrara, P., Cortesi, A.: A suite of abstract domains for static analysis of string values. Softw.: Pract. Exp. 45(1), 245–287 (2015)

    Google Scholar 

  11. Cousot, P., Cousot, R.: Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: Proceedings of POPL 1977. ACM (1977)

    Google Scholar 

  12. Cousot, P., Cousot, R.: Systematic design of program analysis frameworks. In: Proceedings of POPL 1979. ACM (1979)

    Google Scholar 

  13. Csallner, C., Smaragdakis, Y.: Check ‘n’ crash: combining static checking and testing. In: Proceedings of ICSE 2005. ACM (2005)

    Google Scholar 

  14. Denning, D.E.: A lattice model of secure information flow. Commun. ACM 19(5), 236–243 (1976)

    Article  MathSciNet  Google Scholar 

  15. Deutsch, A.: Interprocedural may-alias analysis for pointers: beyond k-limiting. In: Proceedings of PLDI 1994. ACM (1994)

    Google Scholar 

  16. Enck, W., et al.: TaintDroid: an information-flow tracking system for realtime privacy monitoring on smartphones. ACM Trans. Comput. Syst. 32(2), 5:1–5:29 (2014)

    Article  Google Scholar 

  17. Ferrara, P.: Generic combination of heap and value analyses in abstract interpretation. In: McMillan, K.L., Rival, X. (eds.) VMCAI 2014. LNCS, vol. 8318, pp. 302–321. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54013-4_17

    Chapter  MATH  Google Scholar 

  18. Ferrara, P.: A generic framework for heap and value analyses of object-oriented programming languages. Theor. Comput. Sci. 631, 43–72 (2016)

    Article  MathSciNet  Google Scholar 

  19. Ferrara, P., Olivieri, L., Spoto, F.: BackFlow: backward context-sensitive flow reconstruction of taint analysis results (2019). https://doi.org/10.5281/zenodo.3539240

  20. Ferrara, P., Tripp, O., Pistoia, M.: MorphDroid: fine-grained privacy verification. In: Proceedings of ACSAC 2015. ACM (2015)

    Google Scholar 

  21. Grove, D., DeFouw, G., Dean, J., Chambers, C.: Call graph construction in object-oriented languages. In: Proceedings of OOPSLA 1997. ACM (1997)

    Google Scholar 

  22. Hind, M.: Pointer analysis: haven’t we solved this problem yet? In: Proceedings of PASTE 2001. ACM (2001)

    Google Scholar 

  23. Huang, W., Dong, Y., Milanova, A.: Type-based taint analysis for Java web applications. In: Gnesi, S., Rensink, A. (eds.) FASE 2014. LNCS, vol. 8411, pp. 140–154. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54804-8_10

    Chapter  Google Scholar 

  24. Jovanovic, N., Kruegel, C., Kirda, E.: Pixy: a static analysis tool for detecting web application vulnerabilities. In: Proceeding of S&P 2006. IEEE (2006)

    Google Scholar 

  25. Leek, T.R., Brown, R.E., Zhivich, M.A., Leek, T.R., Brown, R.E.: Coverage maximization using dynamic taint tracing. Technical report, MIT Lincoln Laboratory (2007)

    Google Scholar 

  26. Livshits, V.B., Lam, M.S.: Finding security vulnerabilities in Java applications with static analysis. In: Proceedings of USENIX Security 2005. USENIX Association (2005)

    Google Scholar 

  27. Myers, A.C.: JFlow: practical mostly-static information flow control. In: Proceedings of POPL 1999. ACM (1999)

    Google Scholar 

  28. Newsome, J., Song, D.: Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software. In: Proceedings of NDSS 2005. The Internet Society (2005)

    Google Scholar 

  29. Nikolić, Đ., Spoto, F.: Definite expression aliasing analysis for Java bytecode. In: Roychoudhury, A., D’Souza, M. (eds.) ICTAC 2012. LNCS, vol. 7521, pp. 74–89. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32943-2_6

    Chapter  Google Scholar 

  30. Nikolic, D., Spoto, F.: Reachability analysis of program variables. ACM Trans. Program. Lang. Syst. 35(4), 14:1–14:68 (2014)

    MATH  Google Scholar 

  31. Palsberg, J., Schwartzbach, M.I.: Object-oriented type inference. In: Proceedings of OOPSLA 1991. ACM (1991)

    Google Scholar 

  32. Reps, T., Horwitz, S., Sagiv, M.: Precise interprocedural dataflow analysis via graph reachability. In: Proceedings of POPL 1995. ACM (1995)

    Google Scholar 

  33. Sabelfeld, A., Myers, A.C.: Language-based information-flow security. IEEE J. Sel. A. Commun. 21(1), 5–19 (2006)

    Article  Google Scholar 

  34. Secci, S., Spoto, F.: Pair-sharing analysis of object-oriented programs. In: Hankin, C., Siveroni, I. (eds.) SAS 2005. LNCS, vol. 3672, pp. 320–335. Springer, Heidelberg (2005). https://doi.org/10.1007/11547662_22

    Chapter  MATH  Google Scholar 

  35. Spoto, F.: The Julia static analyzer for Java. In: Rival, X. (ed.) SAS 2016. LNCS, vol. 9837, pp. 39–57. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53413-7_3

    Chapter  Google Scholar 

  36. Spoto, F., et al.: Static identification of injection attacks in Java. ACM Trans. Program. Lang. Syst. (TOPLAS) 41, 18 (2019)

    Article  Google Scholar 

  37. Sridharan, M., Artzi, S., Pistoia, M., Guarnieri, S., Tripp, O., Berg, R.: F4F: taint analysis of framework-based web applications. In: Proceedings of OOPSLA 2011. ACM (2011)

    Google Scholar 

  38. Tip, F., Palsberg, J.: Scalable propagation-based call graph construction algorithms. In: Proceedings of OOPSLA 2000. ACM (2000)

    Google Scholar 

  39. Tripp, O., Ferrara, P., Pistoia, M.: Hybrid security analysis of web JavaScript code via dynamic partial evaluation. In: Proceedings of ISSTA 2014. ACM (2014)

    Google Scholar 

  40. Tripp, O., Guarnieri, S., Pistoia, M., Aravkin, A.: ALETHEIA: improving the usability of static security analysis. In: Proceedings of CCS 2014. ACM (2014)

    Google Scholar 

  41. Tripp, O., Pistoia, M., Fink, S.J., Sridharan, M., Weisman, O.: TAJ: effective taint analysis of web applications. In: Proceedings of PLDI 2009. ACM (2009)

    Google Scholar 

  42. Vogt, P., Nentwich, F., Jovanovic, N., Kirda, E., Kruegel, C., Vigna, G.: Cross-site scripting prevention with dynamic data tainting and static analysis. In: Proceedings of NDSS 2005. The Internet Society (2007)

    Google Scholar 

  43. Wassermann, G., Su, Z.: Sound and precise analysis of web applications for injection vulnerabilities. In: Proceedings of PLDI 2007. ACM (2007)

    Google Scholar 

  44. Yin, H., Song, D., Egele, M., Kruegel, C., Kirda, E.: Panorama: capturing system-wide information flow for malware detection and analysis. In: Proceedings of CCS 2007. ACM (2007)

    Google Scholar 

  45. Zanioli, M., Ferrara, P., Cortesi, A.: SAILS: static analysis of information leakage with sample. In: Proceedings of SAC 2012. ACM (2012)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Pietro Ferrara .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Ferrara, P., Olivieri, L., Spoto, F. (2020). \(\mathsf {BackFlow}\): Backward Context-Sensitive Flow Reconstruction of Taint Analysis Results. In: Beyer, D., Zufferey, D. (eds) Verification, Model Checking, and Abstract Interpretation. VMCAI 2020. Lecture Notes in Computer Science(), vol 11990. Springer, Cham. https://doi.org/10.1007/978-3-030-39322-9_2

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-39322-9_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-39321-2

  • Online ISBN: 978-3-030-39322-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics