Inferring Dataflow Properties of User Defined Table Processors

  • Conference paper
Static Analysis (SAS 2009)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 5673)

Abstract

In SCOPE, a SQL-style cloud-level data-mining scripting language, table processing capabilities are often provided by user-defined .NET methods. The SCOPE compiler can optimize a query plan if it knows certain dataflow relations between the input and output tables, such as column independence, column equality, or that a column’s values are non-null. This paper presents an automated analysis for inferring such relations from implementations of SCOPE table processing methods. Since most table processing methods are written as .NET iterators, our analysis must accurately deal with the state machines that implement such iterators. Other complications addressed are naming and estimating column numbers, aliasing and escaping, and the inference of universally quantified loop invariants.

We prototyped the analysis as Scooby, a static analyzer for .NET iterators. Scooby is able to discover useful properties for typical SCOPE programs automatically and efficiently.
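To make the three relations named in the abstract concrete, the following is a minimal sketch in Python, not SCOPE or .NET code and not Scooby itself. The processor `process`, the helper `output_depends_on`, and the sample rows are all hypothetical names invented for illustration. A Python generator stands in for a .NET iterator, and the check is a dynamic spot-check on sample data rather than the static analysis the paper describes: it can show that an output column depends on an input column, but a negative answer is only evidence of independence, not a proof.

```python
from typing import Iterable, Iterator, List, Optional

Row = List[Optional[str]]


def process(rows: Iterable[Row]) -> Iterator[Row]:
    """Hypothetical table processor written as a generator.

    Output column 0 is a copy of input column 0 (column equality).
    Output column 1 is derived only from input column 2.
    Rows whose input column 2 is null are dropped, so output
    column 1 is never null (non-null property).
    Input column 1 is never read (column independence).
    """
    for row in rows:
        if row[2] is None:
            continue
        yield [row[0], row[2].upper()]


def output_depends_on(in_col: int, out_col: int, rows: List[Row]) -> bool:
    """Dynamic spot-check (an assumption of this sketch, not the paper's method).

    Overwrites input column `in_col` in every row and reports whether
    output column `out_col` changes as a result.
    """
    baseline = [r[out_col] for r in process(rows)]
    perturbed_rows = [r[:in_col] + ["<changed>"] + r[in_col + 1:] for r in rows]
    perturbed = [r[out_col] for r in process(perturbed_rows)]
    return baseline != perturbed


# Illustrative sample data only.
sample: List[Row] = [
    ["a", "x", "p"],
    ["b", "y", None],
    ["c", "z", "q"],
]

result = list(process(sample))
```

On this sample, overwriting input column 1 leaves both output columns unchanged, which is the kind of independence fact an optimizer could use to prune an unused column before the processor runs. A static analyzer would have to establish the same fact for all inputs, which is where the iterator state machine and quantified loop invariants mentioned above come in.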

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xia, S., Fähndrich, M., Logozzo, F. (2009). Inferring Dataflow Properties of User Defined Table Processors. In: Palsberg, J., Su, Z. (eds) Static Analysis. SAS 2009. Lecture Notes in Computer Science, vol 5673. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03237-0_4

  • DOI: https://doi.org/10.1007/978-3-642-03237-0_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03236-3

  • Online ISBN: 978-3-642-03237-0

  • eBook Packages: Computer Science (R0)
