Syntax Analysis

  • Torben Ægidius Mogensen
Part of the Undergraduate Topics in Computer Science book series (UTICS)


Where lexical analysis splits the input into tokens, the purpose of syntax analysis (also known as parsing) is to recombine these tokens, not back into a list of characters, but into something that reflects the structure of the text. This “something” is typically a data structure called the syntax tree of the text. As the name indicates, this is a tree structure. The leaves of this tree are the tokens found by the lexical analysis, and if the leaves are read from left to right, the sequence is the same as in the input text. Hence, what is important in the syntax tree is how these leaves are combined to form the structure of the tree and how the interior nodes of the tree are labelled. In addition to finding the structure of the input text, syntax analysis must also reject invalid texts by reporting syntax errors. As syntax analysis is less local in nature than lexical analysis, more advanced methods are required. We use the same basic strategy, however: a notation suitable for human understanding is transformed into a machine-like, low-level notation suitable for efficient execution. This process is called parser generation.
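As a rough illustration of the point about leaves and interior nodes, the following minimal Python sketch builds two syntax trees over the same token sequence. The names `Leaf`, `Node`, `leaves` and the label `"Exp"` are assumptions made for this example and are not taken from the chapter.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    """A token found by lexical analysis."""
    token: str


@dataclass
class Node:
    """An interior node, labelled here with a hypothetical category name."""
    label: str
    children: List["Tree"]


Tree = Union[Leaf, Node]


def leaves(tree: Tree) -> List[str]:
    """Collect the tokens at the leaves, reading from left to right."""
    if isinstance(tree, Leaf):
        return [tree.token]
    tokens: List[str] = []
    for child in tree.children:
        tokens.extend(leaves(child))
    return tokens


# Two different trees over the same token sequence: 2 * 3 + 4.
times_first = Node("Exp", [
    Node("Exp", [Leaf("2"), Leaf("*"), Leaf("3")]),
    Leaf("+"),
    Leaf("4"),
])
plus_first = Node("Exp", [
    Leaf("2"),
    Leaf("*"),
    Node("Exp", [Leaf("3"), Leaf("+"), Leaf("4")]),
])
```

Both trees yield the same leaves when read from left to right, yet they group the tokens differently, which is why the shape of the tree and the labels of its interior nodes carry the structural information.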


Keywords: Regular Expression, Abstract Syntax, Syntactic Category, Input Symbol, Syntax Tree



Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  1. Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
