Learning Shape Analysis

Brockschmidt, Marc; Chen, Yuxin; Kohli, Pushmeet; Krishna, Siddharth; Tarlow, Daniel

doi:10.1007/978-3-319-66706-5_4

Marc Brockschmidt¹⁴,
Yuxin Chen¹⁵,
Pushmeet Kohli¹⁷,
Siddharth Krishna¹⁶ &
…
Daniel Tarlow¹⁴

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 10422))

Included in the following conference series:

International Static Analysis Symposium

1497 Accesses
6 Citations

Abstract

We present a data-driven verification framework to automatically prove memory safety of heap-manipulating programs. Our core contribution is a novel statistical machine learning technique that maps observed program states to (possibly disjunctive) separation logic formulas describing the invariant shape of (possibly nested) data structures at relevant program locations. We then attempt to verify these predictions using a program verifier, where counterexamples to a predicted invariant are used as additional input to the shape predictor in a refinement loop. We have implemented our techniques in Locust, an extension of the GRASShopper verification tool. Locust is able to automatically prove memory safety of implementations of classical heap-manipulating programs such as insertionsort, quicksort and traversals of nested data structures.

P. Kohli—Now at Google DeepMind.

D. Tarlow—Now at Google Brain.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
We will discuss the role of \(\ldots \) in Sect. 3.2.
2.
While we have experimented with avoiding this simplification to side-step the need for feature engineering by operating directly on input graphs [28], the resulting system was substantially harder to train and slightly less precise, as it had to learn domain knowledge from the training data.
3.
Here, we discard non-pointer values.
4.
Note that the softmax function used in the \(E\) case is the generalization of the sigmoid function to many values.
5.
Conceivably, these could be provided by users in a pre-formal language and translated to separation logic using an interactive elaboration procedure. Alternatively, given a test suite, Platypus could predict the initial precondition as well.

References

Abdulla, P.A., Holík, L., Jonsson, B., Lengál, O., Trinh, C.Q., Vojnar, T.: Verification of heap manipulating programs with ordered data by extended forest automata. Acta Inf. 53(4), 357–385 (2016)
Article MathSciNet MATH Google Scholar
Allamanis, M., Tarlow, D., Gordon, A.D., Wei, Y.: Bimodal modelling of source code and natural language. In: ICML 2015, pp. 2123–2132 (2015)
Google Scholar
Alur, R., Cerný, P.: Streaming transducers for algorithmic verification of single-pass list-processing programs. In: POPL 2011, pp. 599–610 (2011)
Google Scholar
Bakir, G.H., Hofmann, T., Schölkopf, B., Smola, A.J., Taskar, B., Vishwanathan, S.V.N.: Predicting Structured Data. MIT Press (2007)
Google Scholar
Berdine, J., Calcagno, C., Cook, B., Distefano, D., O’Hearn, P.W., Wies, T., Yang, H.: Shape analysis for composite data structures. In: Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 178–192. Springer, Heidelberg (2007). doi:10.1007/978-3-540-73368-3_22
Chapter Google Scholar
Bhargavan, K., Fournet, C., Kohlweiss, M., Pironti, A., Strub, P.: Implementing TLS with verified cryptographic security. In: SP 2013, pp. 445–459 (2013)
Google Scholar
Bouajjani, A., Drăgoi, C., Enea, C., Sighireanu, M.: Abstract domains for automated reasoning about list-manipulating programs with infinite data. In: Kuncak, V., Rybalchenko, A. (eds.) VMCAI 2012. LNCS, vol. 7148, pp. 1–22. Springer, Heidelberg (2012). doi:10.1007/978-3-642-27940-9_1
Chapter Google Scholar
Bradley, A.R.: SAT-based model checking without unrolling. In: Jhala, R., Schmidt, D. (eds.) VMCAI 2011. LNCS, vol. 6538, pp. 70–87. Springer, Heidelberg (2011). doi:10.1007/978-3-642-18275-4_7
Chapter Google Scholar
Brotherston, J., Distefano, D., Petersen, R.L.: Automated cyclic entailment proofs in separation logic. In: Bjørner, N., Sofronie-Stokkermans, V. (eds.) CADE 2011. LNCS (LNAI), vol. 6803, pp. 131–146. Springer, Heidelberg (2011). doi:10.1007/978-3-642-22438-6_12
Chapter Google Scholar
Brotherston, J., Gorogiannis, N.: Cyclic abduction of inductively defined safety and termination preconditions. In: Müller-Olm, M., Seidl, H. (eds.) SAS 2014. LNCS, vol. 8723, pp. 68–84. Springer, Cham (2014). doi:10.1007/978-3-319-10936-7_5
Google Scholar
Brotherston, J., Gorogiannis, N., Petersen, R.L.: A generic cyclic theorem prover. In: Jhala, R., Igarashi, A. (eds.) APLAS 2012. LNCS, vol. 7705, pp. 350–367. Springer, Heidelberg (2012). doi:10.1007/978-3-642-35182-2_25
Chapter Google Scholar
Calcagno, C., Distefano, D., O’Hearn, P.W., Yang, H.: Compositional shape analysis by means of bi-abduction. J. ACM 58(6), 26 (2011)
Article MathSciNet MATH Google Scholar
Dudka, K., Peringer, P., Vojnar, T.: Predator: a practical tool for checking manipulation of dynamic data structures using separation logic. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 372–378. Springer, Heidelberg (2011). doi:10.1007/978-3-642-22110-1_29
Chapter Google Scholar
Ernst, M.D., Perkins, J.H., Guo, P.J., McCamant, S., Pacheco, C., Tschantz, M.S., Xiao, C.: The Daikon system for dynamic detection of likely invariants. Sci. Comput. Program. 69(1–3), 35–45 (2007)
Article MathSciNet MATH Google Scholar
Garg, P., Löding, C., Madhusudan, P., Neider, D.: ICE: a robust framework for learning invariants. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 69–87. Springer, Cham (2014). doi:10.1007/978-3-319-08867-9_5
Google Scholar
Garg, P., Neider, D., Madhusudan, P., Roth, D.: Learning invariants using decision trees and implication counterexamples. In: POPL 2016, pp. 499–512 (2016)
Google Scholar
Gotsman, A., Berdine, J., Cook, B.: Interprocedural shape analysis with separated heap abstractions. In: Yi, K. (ed.) SAS 2006. LNCS, vol. 4134, pp. 240–260. Springer, Heidelberg (2006). doi:10.1007/11823230_16
Chapter Google Scholar
Haase, C., Ishtiaq, S., Ouaknine, J., Parkinson, M.J.: SeLoger: a tool for graph-based reasoning in separation logic. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 790–795. Springer, Heidelberg (2013). doi:10.1007/978-3-642-39799-8_55
Chapter Google Scholar
Hawblitzel, C., Howell, J., Kapritsos, M., Lorch, J.R., Parno, B., Roberts, M.L., Setty, S.T.V., Zill, B.: IronFleet: proving practical distributed systems correct. In: SOSP 2015, pp. 1–17 (2015)
Google Scholar
Itzhaky, S., Banerjee, A., Immerman, N., Nanevski, A., Sagiv, M.: Effectively-propositional reasoning about reachability in linked data structures. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 756–772. Springer, Heidelberg (2013). doi:10.1007/978-3-642-39799-8_53
Chapter Google Scholar
Itzhaky, S., Bjørner, N., Reps, T., Sagiv, M., Thakur, A.: Property-directed shape analysis. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 35–51. Springer, Cham (2014). doi:10.1007/978-3-319-08867-9_3
Google Scholar
Jung, Y., Kong, S., David, C., Wang, B., Yi, K.: Automatically inferring loop invariants via algorithmic learning. MSCS 25(4), 892–915 (2015)
MathSciNet MATH Google Scholar
Kennedy, A.J., Vytiniotis, D.: Every bit counts: the binary representation of typed data and programs. JFP 22(4–5), 529–573 (2012)
MATH Google Scholar
Klein, G., Andronick, J., Elphinstone, K., Murray, T.C., Sewell, T., Kolanski, R., Heiser, G.: Comprehensive formal verification of an OS microkernel. TOCS 32(1), 2 (2014)
Article Google Scholar
Le, Q.L., Gherghina, C., Qin, S., Chin, W.-N.: Shape analysis via second-order bi-abduction. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 52–68. Springer, Cham (2014). doi:10.1007/978-3-319-08867-9_4
Google Scholar
Le, Q.L., Sun, J., Chin, W.-N.: Satisfiability modulo heap-based programs. In: Chaudhuri, S., Farzan, A. (eds.) CAV 2016. LNCS, vol. 9779, pp. 382–404. Springer, Cham (2016). doi:10.1007/978-3-319-41528-4_21
Google Scholar
Leroy, X.: Formal verification of a realistic compiler. CACM 52(7), 107–115 (2009)
Article Google Scholar
Li, Y., Tarlow, D., Brockschmidt, M., Zemel, R.: Gated graph sequence neural networks. In: ICLR 2016 (2016)
Google Scholar
Magill, S., Tsai, M., Lee, P., Tsay, Y.: Automatic numeric abstractions for heap-manipulating programs. In: POPL 2010, pp. 211–222 (2010)
Google Scholar
McMillan, K.L.: Lazy abstraction with Interpolants. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 123–136. Springer, Heidelberg (2006). doi:10.1007/11817963_14
Chapter Google Scholar
Moy, Y., Marché, C.: Modular inference of subprogram contracts for safety checking. J. Symb. Comput. 45(11), 1184–1211 (2010)
Article MathSciNet MATH Google Scholar
Mühlberg, J.T., White, D.H., Dodds, M., Lüttgen, G., Piessens, F.: Learning assertions to verify linked-list programs. In: Calinescu, R., Rumpe, B. (eds.) SEFM 2015. LNCS, vol. 9276, pp. 37–52. Springer, Cham (2015). doi:10.1007/978-3-319-22969-0_3
Chapter Google Scholar
O’Hearn, P., Reynolds, J., Yang, H.: Local reasoning about programs that alter data structures. In: Fribourg, L. (ed.) CSL 2001. LNCS, vol. 2142, pp. 1–19. Springer, Heidelberg (2001). doi:10.1007/3-540-44802-0_1
Chapter Google Scholar
Navarro Pérez, J.A., Rybalchenko, A.: Separation logic modulo theories. In: Shan, C. (ed.) APLAS 2013. LNCS, vol. 8301, pp. 90–106. Springer, Cham (2013). doi:10.1007/978-3-319-03542-0_7
Chapter Google Scholar
Piskac, R., Wies, T., Zufferey, D.: GRASShopper. In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 124–139. Springer, Heidelberg (2014). doi:10.1007/978-3-642-54862-8_9
Reynolds, J.C.: Separation logic: a logic for shared mutable data structures. In: LICS 2002, pp. 55–74 (2002)
Google Scholar
Sagiv, S., Reps, T.W., Wilhelm, R.: Parametric shape analysis via 3-valued logic. TOPLAS 24(3), 217–298 (2002)
Article Google Scholar
Sharma, R., Aiken, A.: From invariant checking to invariant inference using randomized search. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 88–105. Springer, Cham (2014). doi:10.1007/978-3-319-08867-9_6
Google Scholar
Sharma, R., Gupta, S., Hariharan, B., Aiken, A., Liang, P., Nori, A.V.: A data driven approach for algebraic loop invariants. In: Felleisen, M., Gardner, P. (eds.) ESOP 2013. LNCS, vol. 7792, pp. 574–592. Springer, Heidelberg (2013). doi:10.1007/978-3-642-37036-6_31
Chapter Google Scholar
Sharma, R., Gupta, S., Hariharan, B., Aiken, A., Nori, A.V.: Verification as learning geometric concepts. In: Logozzo, F., Fähndrich, M. (eds.) SAS 2013. LNCS, vol. 7935, pp. 388–411. Springer, Heidelberg (2013). doi:10.1007/978-3-642-38856-9_21
Chapter Google Scholar
Sharma, R., Nori, A.V., Aiken, A.: Interpolants as classifiers. In: Madhusudan, P., Seshia, S.A. (eds.) CAV 2012. LNCS, vol. 7358, pp. 71–87. Springer, Heidelberg (2012). doi:10.1007/978-3-642-31424-7_11
Chapter Google Scholar
Yorsh, G., Ball, T., Sagiv, M.: Testing, abstraction, theorem proving: better together! In: ISSTA 2006, pp. 145–156 (2006)
Google Scholar
Zhu, H., Petri, G., Jagannathan, S.: Automatically learning shape specifications. In: PLDI 2016 (2016)
Google Scholar

Download references

Acknowledgements

We thank Martin Hruška and Quang Loc Le for help with running their tools S2/HIP, Forester, and Predator for the experiments. We also thank Thomas Wies for valuable feedback on drafts of this paper.

Author information

Authors and Affiliations

Microsoft Research, Cambridge, UK
Marc Brockschmidt & Daniel Tarlow
ETH Zürich, Zürich, Switzerland
Yuxin Chen
New York University, New York, USA
Siddharth Krishna
Microsoft Research, Redmond, USA
Pushmeet Kohli

Authors

Marc Brockschmidt
View author publications
You can also search for this author in PubMed Google Scholar
Yuxin Chen
View author publications
You can also search for this author in PubMed Google Scholar
Pushmeet Kohli
View author publications
You can also search for this author in PubMed Google Scholar
Siddharth Krishna
View author publications
You can also search for this author in PubMed Google Scholar
Daniel Tarlow
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Marc Brockschmidt .

Editor information

Editors and Affiliations

Dipartimento di Matematica, University of Padova, Padua, Italy
Francesco Ranzato

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Brockschmidt, M., Chen, Y., Kohli, P., Krishna, S., Tarlow, D. (2017). Learning Shape Analysis. In: Ranzato, F. (eds) Static Analysis. SAS 2017. Lecture Notes in Computer Science(), vol 10422. Springer, Cham. https://doi.org/10.1007/978-3-319-66706-5_4

Download citation

DOI: https://doi.org/10.1007/978-3-319-66706-5_4
Published: 19 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66705-8
Online ISBN: 978-3-319-66706-5
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics