CIL: Intermediate Language and Tools for Analysis and Transformation of C Programs
This paper describes the C Intermediate Language (CIL): a high-level representation along with a set of tools that permit easy analysis and source-to-source transformation of C programs.
Compared to C, CIL has fewer constructs. It breaks down certain complicated constructs of C into simpler ones, and thus it works at a lower level than abstract-syntax trees. But CIL is also more high-level than typical intermediate languages (e.g., three-address code) designed for compilation. The result is a representation that makes it easy to analyze and manipulate C programs and to emit them in a form that resembles the original source. Moreover, CIL comes with a front end that translates to CIL not only ANSI C programs but also programs that use Microsoft C or GNU C extensions.
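To make the idea of breaking a complicated construct into simpler ones concrete, here is a small hand-written sketch in C. It is not output produced by CIL; the function names `original`, `simplified`, and `check_equivalence` and the temporaries `tmp` and `idx` are illustrative assumptions. The first function packs a side effect, a conditional expression, and a store into one statement; the second spells out the same behavior as a sequence of simple statements and an explicit `if`.

```c
#include <assert.h>

/* Original form: one statement mixes a post-increment, a conditional
   expression, and a store into an array. */
int original(int *a, int i, int x)
{
    a[i++] = (x > 0) ? x * 2 : -x;
    return i;
}

/* Hand-simplified form (illustrative only, not actual CIL output):
   every side effect is its own statement and the conditional
   expression becomes an explicit if/else. */
int simplified(int *a, int i, int x)
{
    int tmp; /* hypothetical temporary holding the selected value */
    int idx; /* hypothetical temporary holding the old index */

    if (x > 0) {
        tmp = x * 2;
    } else {
        tmp = -x;
    }
    idx = i;
    i = i + 1;
    a[idx] = tmp;
    return i;
}

/* Checks that both forms leave the array and the index in the same state. */
void check_equivalence(int x)
{
    int a[2] = {0, 0};
    int b[2] = {0, 0};
    int ra = original(a, 0, x);
    int rb = simplified(b, 0, x);

    assert(ra == rb);
    assert(a[0] == b[0] && a[1] == b[1]);
}
```

Calling `check_equivalence` with a positive, a negative, and a zero argument exercises both branches. The simplified form is longer, but each statement has at most one effect, which is the property that tends to make an analysis easier to write.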
We describe the structure of CIL with a focus on how it disambiguates those features of C that we found to be most confusing for program analysis and transformation. We also describe a whole-program merger based on structural type equality, allowing a complete project to be viewed as a single compilation unit. As a representative application of CIL, we show a transformation aimed at making code immune to stack-smashing attacks. We are currently using CIL as part of a system that analyzes and instruments C programs with run-time checks to ensure type safety. CIL has served us very well in this project, and we believe it can usefully be applied in other situations as well.
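As a rough illustration of what structural type equality can mean, the following C sketch compares two type descriptions by shape rather than by name. It is a minimal sketch under stated assumptions, not the merging algorithm of the paper: the `struct type` representation, the `kind` enumeration, and the function `struct_equal` are all hypothetical, field names are left out of the comparison, and the types are assumed to be acyclic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal type representation for this sketch. */
enum kind { K_INT, K_CHAR, K_PTR, K_STRUCT };

struct type {
    enum kind kind;
    const char *name;             /* struct tag; ignored by the comparison */
    const struct type *target;    /* pointee type, used for K_PTR */
    const struct type *fields[4]; /* field types, used for K_STRUCT */
    size_t nfields;               /* number of entries used in fields */
};

/* Returns 1 when the two types have the same shape, 0 otherwise.
   Assumption: the types are acyclic. Recursive structs would need an
   extra set of pairs already assumed equal to guarantee termination. */
int struct_equal(const struct type *a, const struct type *b)
{
    size_t i;

    if (a->kind != b->kind)
        return 0;

    switch (a->kind) {
    case K_INT:
    case K_CHAR:
        return 1;
    case K_PTR:
        return struct_equal(a->target, b->target);
    case K_STRUCT:
        if (a->nfields != b->nfields)
            return 0;
        for (i = 0; i < a->nfields; i++) {
            if (!struct_equal(a->fields[i], b->fields[i]))
                return 0;
        }
        return 1;
    }
    return 0;
}
```

With such a comparison, two structs declared under different tags in different files but with identical field types would be treated as one type when the files are combined, while structs that share a tag but differ in a field type would stay distinct.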
Keywords: Abstract Syntax, Composite Type, Merging Algorithm, Abstract Syntax Tree, Intermediate Language