Skip to main content

Learning Shape Analysis

  • Conference paper
  • First Online:
Static Analysis (SAS 2017)

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 10422))

Included in the following conference series:

Abstract

We present a data-driven verification framework to automatically prove memory safety of heap-manipulating programs. Our core contribution is a novel statistical machine learning technique that maps observed program states to (possibly disjunctive) separation logic formulas describing the invariant shape of (possibly nested) data structures at relevant program locations. We then attempt to verify these predictions using a program verifier, where counterexamples to a predicted invariant are used as additional input to the shape predictor in a refinement loop. We have implemented our techniques in Locust, an extension of the GRASShopper verification tool. Locust is able to automatically prove memory safety of implementations of classical heap-manipulating programs such as insertionsort, quicksort and traversals of nested data structures.

P. Kohli—Now at Google DeepMind.

D. Tarlow—Now at Google Brain.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    We will discuss the role of \(\ldots \) in Sect. 3.2.

  2. 2.

    While we have experimented with avoiding this simplification to side-step the need for feature engineering by operating directly on input graphs [28], the resulting system was substantially harder to train and slightly less precise, as it had to learn domain knowledge from the training data.

  3. 3.

    Here, we discard non-pointer values.

  4. 4.

    Note that the softmax function used in the \(E\) case is the generalization of the sigmoid function to many values.

  5. 5.

    Conceivably, these could be provided by users in a pre-formal language and translated to separation logic using an interactive elaboration procedure. Alternatively, given a test suite, Platypus could predict the initial precondition as well.

References

  1. Abdulla, P.A., Holík, L., Jonsson, B., Lengál, O., Trinh, C.Q., Vojnar, T.: Verification of heap manipulating programs with ordered data by extended forest automata. Acta Inf. 53(4), 357–385 (2016)

    Article  MathSciNet  MATH  Google Scholar 

  2. Allamanis, M., Tarlow, D., Gordon, A.D., Wei, Y.: Bimodal modelling of source code and natural language. In: ICML 2015, pp. 2123–2132 (2015)

    Google Scholar 

  3. Alur, R., Cerný, P.: Streaming transducers for algorithmic verification of single-pass list-processing programs. In: POPL 2011, pp. 599–610 (2011)

    Google Scholar 

  4. Bakir, G.H., Hofmann, T., Schölkopf, B., Smola, A.J., Taskar, B., Vishwanathan, S.V.N.: Predicting Structured Data. MIT Press (2007)

    Google Scholar 

  5. Berdine, J., Calcagno, C., Cook, B., Distefano, D., O’Hearn, P.W., Wies, T., Yang, H.: Shape analysis for composite data structures. In: Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 178–192. Springer, Heidelberg (2007). doi:10.1007/978-3-540-73368-3_22

    Chapter  Google Scholar 

  6. Bhargavan, K., Fournet, C., Kohlweiss, M., Pironti, A., Strub, P.: Implementing TLS with verified cryptographic security. In: SP 2013, pp. 445–459 (2013)

    Google Scholar 

  7. Bouajjani, A., Drăgoi, C., Enea, C., Sighireanu, M.: Abstract domains for automated reasoning about list-manipulating programs with infinite data. In: Kuncak, V., Rybalchenko, A. (eds.) VMCAI 2012. LNCS, vol. 7148, pp. 1–22. Springer, Heidelberg (2012). doi:10.1007/978-3-642-27940-9_1

    Chapter  Google Scholar 

  8. Bradley, A.R.: SAT-based model checking without unrolling. In: Jhala, R., Schmidt, D. (eds.) VMCAI 2011. LNCS, vol. 6538, pp. 70–87. Springer, Heidelberg (2011). doi:10.1007/978-3-642-18275-4_7

    Chapter  Google Scholar 

  9. Brotherston, J., Distefano, D., Petersen, R.L.: Automated cyclic entailment proofs in separation logic. In: Bjørner, N., Sofronie-Stokkermans, V. (eds.) CADE 2011. LNCS (LNAI), vol. 6803, pp. 131–146. Springer, Heidelberg (2011). doi:10.1007/978-3-642-22438-6_12

    Chapter  Google Scholar 

  10. Brotherston, J., Gorogiannis, N.: Cyclic abduction of inductively defined safety and termination preconditions. In: Müller-Olm, M., Seidl, H. (eds.) SAS 2014. LNCS, vol. 8723, pp. 68–84. Springer, Cham (2014). doi:10.1007/978-3-319-10936-7_5

    Google Scholar 

  11. Brotherston, J., Gorogiannis, N., Petersen, R.L.: A generic cyclic theorem prover. In: Jhala, R., Igarashi, A. (eds.) APLAS 2012. LNCS, vol. 7705, pp. 350–367. Springer, Heidelberg (2012). doi:10.1007/978-3-642-35182-2_25

    Chapter  Google Scholar 

  12. Calcagno, C., Distefano, D., O’Hearn, P.W., Yang, H.: Compositional shape analysis by means of bi-abduction. J. ACM 58(6), 26 (2011)

    Article  MathSciNet  MATH  Google Scholar 

  13. Dudka, K., Peringer, P., Vojnar, T.: Predator: a practical tool for checking manipulation of dynamic data structures using separation logic. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 372–378. Springer, Heidelberg (2011). doi:10.1007/978-3-642-22110-1_29

    Chapter  Google Scholar 

  14. Ernst, M.D., Perkins, J.H., Guo, P.J., McCamant, S., Pacheco, C., Tschantz, M.S., Xiao, C.: The Daikon system for dynamic detection of likely invariants. Sci. Comput. Program. 69(1–3), 35–45 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  15. Garg, P., Löding, C., Madhusudan, P., Neider, D.: ICE: a robust framework for learning invariants. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 69–87. Springer, Cham (2014). doi:10.1007/978-3-319-08867-9_5

    Google Scholar 

  16. Garg, P., Neider, D., Madhusudan, P., Roth, D.: Learning invariants using decision trees and implication counterexamples. In: POPL 2016, pp. 499–512 (2016)

    Google Scholar 

  17. Gotsman, A., Berdine, J., Cook, B.: Interprocedural shape analysis with separated heap abstractions. In: Yi, K. (ed.) SAS 2006. LNCS, vol. 4134, pp. 240–260. Springer, Heidelberg (2006). doi:10.1007/11823230_16

    Chapter  Google Scholar 

  18. Haase, C., Ishtiaq, S., Ouaknine, J., Parkinson, M.J.: SeLoger: a tool for graph-based reasoning in separation logic. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 790–795. Springer, Heidelberg (2013). doi:10.1007/978-3-642-39799-8_55

    Chapter  Google Scholar 

  19. Hawblitzel, C., Howell, J., Kapritsos, M., Lorch, J.R., Parno, B., Roberts, M.L., Setty, S.T.V., Zill, B.: IronFleet: proving practical distributed systems correct. In: SOSP 2015, pp. 1–17 (2015)

    Google Scholar 

  20. Itzhaky, S., Banerjee, A., Immerman, N., Nanevski, A., Sagiv, M.: Effectively-propositional reasoning about reachability in linked data structures. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 756–772. Springer, Heidelberg (2013). doi:10.1007/978-3-642-39799-8_53

    Chapter  Google Scholar 

  21. Itzhaky, S., Bjørner, N., Reps, T., Sagiv, M., Thakur, A.: Property-directed shape analysis. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 35–51. Springer, Cham (2014). doi:10.1007/978-3-319-08867-9_3

    Google Scholar 

  22. Jung, Y., Kong, S., David, C., Wang, B., Yi, K.: Automatically inferring loop invariants via algorithmic learning. MSCS 25(4), 892–915 (2015)

    MathSciNet  MATH  Google Scholar 

  23. Kennedy, A.J., Vytiniotis, D.: Every bit counts: the binary representation of typed data and programs. JFP 22(4–5), 529–573 (2012)

    MATH  Google Scholar 

  24. Klein, G., Andronick, J., Elphinstone, K., Murray, T.C., Sewell, T., Kolanski, R., Heiser, G.: Comprehensive formal verification of an OS microkernel. TOCS 32(1), 2 (2014)

    Article  Google Scholar 

  25. Le, Q.L., Gherghina, C., Qin, S., Chin, W.-N.: Shape analysis via second-order bi-abduction. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 52–68. Springer, Cham (2014). doi:10.1007/978-3-319-08867-9_4

    Google Scholar 

  26. Le, Q.L., Sun, J., Chin, W.-N.: Satisfiability modulo heap-based programs. In: Chaudhuri, S., Farzan, A. (eds.) CAV 2016. LNCS, vol. 9779, pp. 382–404. Springer, Cham (2016). doi:10.1007/978-3-319-41528-4_21

    Google Scholar 

  27. Leroy, X.: Formal verification of a realistic compiler. CACM 52(7), 107–115 (2009)

    Article  Google Scholar 

  28. Li, Y., Tarlow, D., Brockschmidt, M., Zemel, R.: Gated graph sequence neural networks. In: ICLR 2016 (2016)

    Google Scholar 

  29. Magill, S., Tsai, M., Lee, P., Tsay, Y.: Automatic numeric abstractions for heap-manipulating programs. In: POPL 2010, pp. 211–222 (2010)

    Google Scholar 

  30. McMillan, K.L.: Lazy abstraction with Interpolants. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 123–136. Springer, Heidelberg (2006). doi:10.1007/11817963_14

    Chapter  Google Scholar 

  31. Moy, Y., Marché, C.: Modular inference of subprogram contracts for safety checking. J. Symb. Comput. 45(11), 1184–1211 (2010)

    Article  MathSciNet  MATH  Google Scholar 

  32. Mühlberg, J.T., White, D.H., Dodds, M., Lüttgen, G., Piessens, F.: Learning assertions to verify linked-list programs. In: Calinescu, R., Rumpe, B. (eds.) SEFM 2015. LNCS, vol. 9276, pp. 37–52. Springer, Cham (2015). doi:10.1007/978-3-319-22969-0_3

    Chapter  Google Scholar 

  33. O’Hearn, P., Reynolds, J., Yang, H.: Local reasoning about programs that alter data structures. In: Fribourg, L. (ed.) CSL 2001. LNCS, vol. 2142, pp. 1–19. Springer, Heidelberg (2001). doi:10.1007/3-540-44802-0_1

    Chapter  Google Scholar 

  34. Navarro Pérez, J.A., Rybalchenko, A.: Separation logic modulo theories. In: Shan, C. (ed.) APLAS 2013. LNCS, vol. 8301, pp. 90–106. Springer, Cham (2013). doi:10.1007/978-3-319-03542-0_7

    Chapter  Google Scholar 

  35. Piskac, R., Wies, T., Zufferey, D.: GRASShopper. In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 124–139. Springer, Heidelberg (2014). doi:10.1007/978-3-642-54862-8_9

  36. Reynolds, J.C.: Separation logic: a logic for shared mutable data structures. In: LICS 2002, pp. 55–74 (2002)

    Google Scholar 

  37. Sagiv, S., Reps, T.W., Wilhelm, R.: Parametric shape analysis via 3-valued logic. TOPLAS 24(3), 217–298 (2002)

    Article  Google Scholar 

  38. Sharma, R., Aiken, A.: From invariant checking to invariant inference using randomized search. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 88–105. Springer, Cham (2014). doi:10.1007/978-3-319-08867-9_6

    Google Scholar 

  39. Sharma, R., Gupta, S., Hariharan, B., Aiken, A., Liang, P., Nori, A.V.: A data driven approach for algebraic loop invariants. In: Felleisen, M., Gardner, P. (eds.) ESOP 2013. LNCS, vol. 7792, pp. 574–592. Springer, Heidelberg (2013). doi:10.1007/978-3-642-37036-6_31

    Chapter  Google Scholar 

  40. Sharma, R., Gupta, S., Hariharan, B., Aiken, A., Nori, A.V.: Verification as learning geometric concepts. In: Logozzo, F., Fähndrich, M. (eds.) SAS 2013. LNCS, vol. 7935, pp. 388–411. Springer, Heidelberg (2013). doi:10.1007/978-3-642-38856-9_21

    Chapter  Google Scholar 

  41. Sharma, R., Nori, A.V., Aiken, A.: Interpolants as classifiers. In: Madhusudan, P., Seshia, S.A. (eds.) CAV 2012. LNCS, vol. 7358, pp. 71–87. Springer, Heidelberg (2012). doi:10.1007/978-3-642-31424-7_11

    Chapter  Google Scholar 

  42. Yorsh, G., Ball, T., Sagiv, M.: Testing, abstraction, theorem proving: better together! In: ISSTA 2006, pp. 145–156 (2006)

    Google Scholar 

  43. Zhu, H., Petri, G., Jagannathan, S.: Automatically learning shape specifications. In: PLDI 2016 (2016)

    Google Scholar 

Download references

Acknowledgements

We thank Martin Hruška and Quang Loc Le for help with running their tools S2/HIP, Forester, and Predator for the experiments. We also thank Thomas Wies for valuable feedback on drafts of this paper.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Marc Brockschmidt .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Brockschmidt, M., Chen, Y., Kohli, P., Krishna, S., Tarlow, D. (2017). Learning Shape Analysis. In: Ranzato, F. (eds) Static Analysis. SAS 2017. Lecture Notes in Computer Science(), vol 10422. Springer, Cham. https://doi.org/10.1007/978-3-319-66706-5_4

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-66706-5_4

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66705-8

  • Online ISBN: 978-3-319-66706-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics