
Syntactic Pattern Analysis

Chapter in: Introduction to Artificial Intelligence

Abstract

In syntactic pattern analysis, also called syntactic pattern recognition [97, 104], reasoning is performed on the basis of structural representations which describe things and phenomena belonging to the world. A set of such structural representations, called (structural) patterns, constitutes the database of an AI system.


Notes

  1.

    Such structural patterns can take the form of strings or graphs. Therefore, two types of generative grammars are considered: string grammars and graph grammars. In syntactic pattern recognition, tree grammars, which generate tree structures, are also defined. Since a tree is a particular case of a graph, we do not introduce tree grammars in this monograph. The reader is referred, e.g., to [104].

  2.

    Let us remember that the notions of word and sentence are treated symbolically in formal language theory. For example, if we define a grammar which generates single words of the English language, then letters are the terminal symbols, and English words consisting of these letters are called words (sentences) of the formal language. If, however, we define a grammar which generates sentences of the English language, then English words can be treated as terminal symbols, and sentences of the English language are called words (sentences) of the formal language. As we shall see later on, words (sentences) of a formal language generated by a grammar can represent any string structures, e.g., stock charts, charts in medicine (ECG, EEG), etc. Since in this section we use examples from a natural language, strings of symbols are called sentences.

  3.

    We omit the terminal symbol of a full stop in all examples in order to simplify our considerations. Of course, if we use generative grammars for Natural Language Processing (NLP), we should use a full stop symbol.

  4.

    We call them productions, because they are used for generating—“producing”—sentences of a language.
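To illustrate how productions "produce" sentences, here is a minimal sketch in Python; the toy grammar S → aS | b and all names are hypothetical, not taken from the chapter:

```python
# A hypothetical toy grammar: S -> aS | b (capital letters are nonterminals).
productions = {"S": ["aS", "b"]}

def derive(choices, start="S"):
    """Generate a sentence by applying one production per step.
    `choices` lists which right-hand side to pick at each step."""
    sentential_form = start
    for choice in choices:
        # Rewrite the leftmost nonterminal (an uppercase symbol).
        for i, symbol in enumerate(sentential_form):
            if symbol.isupper():
                rhs = productions[symbol][choice]
                sentential_form = sentential_form[:i] + rhs + sentential_form[i + 1:]
                break
    return sentential_form

# Derivation: S => aS => aaS => aab
print(derive([0, 0, 1]))  # -> "aab"
```

Each step replaces a nonterminal by the right-hand side of a production, until only terminal symbols remain, i.e., a sentence of the language has been produced.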

  5.

    Nonterminal symbols are usually denoted by capital letters.

  6.

    In fact, such grammars are right regular grammars. In left regular grammars, a nonterminal symbol occurs (if it occurs) before a terminal symbol.

  7.

    There are also grammars which have a stronger generative power in the Chomsky hierarchy, namely context-sensitive grammars and unrestricted (type-0) grammars. Their definitions are contained in Appendix E.

  8.

    Claude Elwood Shannon—a professor at the Massachusetts Institute of Technology, a mathematician and electronic engineer, the “father” of information theory and computer science.

  9.

    The idea of a finite-state automaton is based on the Markov chain model, which is introduced in Appendix B for genetic algorithms.

  10.

    Michael Oser Rabin—a professor at Harvard University and the Hebrew University of Jerusalem, a Ph.D. student of Alonzo Church. His outstanding achievements concern automata theory, computational complexity theory, cryptography (the Miller-Rabin test), and pattern recognition (the Rabin-Karp algorithm). In 1976 he was awarded the Turing Award (together with Dana Scott).

  11.

    Dana Stewart Scott—a professor of computer science, philosophy, and mathematics at Carnegie Mellon University and Oxford University, a Ph.D. student of Alonzo Church. His excellent work concerns automata theory, semantics of programming languages, modal logic, and model theory (a proof of the independence of the continuum hypothesis). In 1976 he was awarded the Turing Award.

  12.

    The languages \(L_{1}, L_{2}\), and \(L_{3}\) introduced in the previous section are regular languages.

  13.

    The input of the automaton is where the expression to be analyzed is placed. If there is an expression at the input, the automaton reads it one element (terminal symbol) at a time and performs the proper transitions.

  14.

    The other transition means that the automaton has read an element different from those labelling the transitions leading out of the current state.

  15.

    The state-transition function is not necessarily a function in the mathematical sense of this notion; for a nondeterministic automaton it assigns a set of possible successor states, i.e., it is a relation.
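The behaviour described in the preceding notes (reading the input one terminal symbol at a time, performing the proper transition, and the "other" transition leading out of every state) can be sketched as follows; the toy automaton below, accepting one or more a's followed by a single b, is made up for illustration and is not one of the automata of the chapter:

```python
# A hypothetical finite-state automaton for the regular language a...ab
# (one or more a's followed by a single b). State names are made up.
# Missing entries play the role of the "other" transition: they lead
# to a rejecting trap state.
transitions = {
    ("start", "a"): "seen_a",
    ("seen_a", "a"): "seen_a",
    ("seen_a", "b"): "accept",
}

def run(expression):
    """Read the expression one terminal symbol at a time,
    performing the proper transition after each symbol."""
    state = "start"
    for symbol in expression:
        # The "other" transition: any unlisted symbol leads to the trap state.
        state = transitions.get((state, symbol), "trap")
    return state == "accept"

print(run("aaab"))  # True
print(run("ba"))    # False
```

Representing the transitions as a dictionary keyed by (state, symbol) pairs makes the "other" transition a simple dictionary miss.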

  16.

    In order to simplify our considerations we introduce here a specific case of a pushdown automaton, i.e. an LL(k) automaton, which analyzes languages generated by LL(k) context-free grammars. These grammars are defined formally in Appendix E.

  17.

    In computer science a stack is a specific data memory structure with certain operations, which works in the following way. Data elements can be added only to the top of the stack, and they can be taken off only from the top. A stack of books put one on top of another is a good example. If we want to add a new book, we have to put it on the top of the stack. If we want to get some book, we first have to take off all the books which are above it.
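The book-stack analogy can be sketched with a Python list, whose append and pop operations both act on the top:

```python
# A sketch of the stack ("books") behaviour described above,
# using a Python list whose last element is the top of the stack.
stack = []
stack.append("book A")   # put a book on the (empty) stack
stack.append("book B")   # put another book on top of it
top = stack.pop()        # take a book off: only the top is reachable
print(top)        # "book B" - the last book put on
print(stack[-1])  # "book A" is now on top
```

This last-in, first-out discipline is exactly why a book below the top can be reached only after all books above it have been taken off.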

  18.

    This sequence of symbols has a fixed length. The length of the sequence is a parameter of the automaton. In the case of LL(k) automata, k is the length of the sequence, which is analyzed in a single working step of the automaton.

  19.

    The automaton \(A_{4}\) is an LL(2) automaton.

  20.

    The right-hand side of the production, aSb, is put on the stack “from back to front”: symbol b is put first (at the bottom of the stack), then symbol S, and finally symbol a (at the top of the stack).
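This "back to front" pushing can be sketched as follows; a minimal sketch assuming the top of the stack is the end of a Python list (the production S → aSb is the one discussed above):

```python
# A sketch of the step described above: the nonterminal S on top of the
# stack is replaced by the right-hand side aSb of the production S -> aSb,
# pushed "from back to front" so that symbol a ends up on top.
stack = ["S"]            # the top of the stack is the end of the list
rhs = "aSb"              # right-hand side of the production S -> aSb

assert stack.pop() == "S"        # remove the nonterminal being expanded
for symbol in reversed(rhs):     # push b first, then S, then a
    stack.append(symbol)

print(stack)  # ['b', 'S', 'a'] - a is on top, b is at the bottom
```

Pushing the right-hand side in reverse ensures that its leftmost symbol is popped first, so the automaton analyzes the expression from left to right.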

  21.

    Philip M. Lewis—a professor of electronic engineering and computer science at the Massachusetts Institute of Technology and the State University of New York, a scientist at General Electric Research and Development Center. His work concerns automata theory, concurrency theory, distributed systems, and compiler design.

  22.

    Richard Edwin Stearns—a professor of mathematics and computer science at the State University of New York, a scientist at General Electric. He was awarded the Turing Award in 1993. He has contributed to the foundations of computational complexity theory (with Juris Hartmanis). His achievements concern the theory of algorithms, automata theory, and game theory.

  23.

    Donald Ervin Knuth—a professor of computer science at Stanford University. The “father” of the analysis of algorithms. He is known as the author of the best-seller “The Art of Computer Programming” and the designer of the TeX computer typesetting system. Professor D. Knuth is also known for his good sense of humor (e.g., his famous statement: “Beware of bugs in the above code; I have only proved it correct, not tried it.”). He was awarded the Turing Award in 1974.

  24.

    Robert W. Floyd—a computer scientist and physicist with a BA in liberal arts. He was 33 when he became a full professor at Stanford University (without a Ph.D. degree). His work concerns automata theory, semantics of programming languages, formal program verification, and graph theory (the Floyd-Warshall algorithm).

  25.

    Daniel J. Rosenkrantz—a professor at the State University of New York, a scientist at General Electric, and the Editor-in-Chief of the prestigious Journal of the ACM. His achievements concern compiler design and the theory of algorithms.

  26.

    Alfred Vaino Aho—a physicist, an electronic engineer, and an eminent computer scientist, a professor at Columbia University and a scientist at Bell Labs. His work concerns compiler design and the theory of algorithms. He is known as the author of excellent books (written with J.D. Ullman and J.E. Hopcroft), including Data Structures and Algorithms and The Theory of Parsing, Translation, and Compiling.

  27.

    Similarly to the logic-based methods discussed in Sect. 6.1.

  28.

    Therefore, transducers are also called automata with output.

  29.

    King-Sun Fu—a professor of electrical engineering and computer science at Purdue University, Stanford University, and the University of California, Berkeley. The “father” of syntactic pattern recognition, the first president of the International Association for Pattern Recognition (IAPR), and the author of excellent monographs, including Syntactic Pattern Recognition and Applications, Prentice-Hall 1982. After his untimely death in 1985, the IAPR established the biennial King-Sun Fu Prize for contributions to pattern recognition.

  30.

    Taylor L. Booth—a professor of mathematics and computer science at the University of Connecticut. His research concerns Markov chains, formal language theory, and undecidability. A founder and the first President of the Computing Sciences Accreditation Board (CSAB).

  31.

    Markov chains are defined formally in Appendix B.2.

  32.

    Vladimir Iosifovich Levenshtein—a professor of computer science and mathematics at the Keldysh Institute of Applied Mathematics in Moscow and the Steklov Mathematical Institute. In 2006 he was awarded the IEEE Richard W. Hamming Medal.

  33.

    The example is based on a model introduced in: Flasiński M.: Use of graph grammars for the description of mechanical parts. Computer-Aided Design 27 (1995), pp. 403–433, Elsevier.

  34.

    During a technological process this corresponds to milling a V-slot in the raw material.

  35.

    Grzegorz Rozenberg—a professor at Leiden University, the University of Colorado at Boulder, and the Polish Academy of Sciences in Warsaw, an eminent computer scientist and mathematician. His research concerns formal language theory, concurrent systems, and natural computing. Prof. G. Rozenberg was the president of the European Association for Theoretical Computer Science for 11 years.

  36.

    A similar semantic network has been introduced in Sect. 7.1.

  37.

    The old edge has pointed to an aunt: \(C(\mathsf{sister}, out)\).

  38.

    This was shown in the 1980s during research into the membership problem for graph languages, which was led (independently) by G. Turan and F.J. Brandenburg.

Author information

Correspondence to Mariusz Flasiński.


Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Flasiński, M. (2016). Syntactic Pattern Analysis. In: Introduction to Artificial Intelligence. Springer, Cham. https://doi.org/10.1007/978-3-319-40022-8_8

  • DOI: https://doi.org/10.1007/978-3-319-40022-8_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-40020-4

  • Online ISBN: 978-3-319-40022-8
