Abstract
XML is at once a document format and a semistructured data model, and has become a de facto standard for exchanging data on the Internet. XML documents can alternatively be viewed as labeled trees, and tree automata are natural mechanisms for a wide range of processing tasks on XML documents. In this talk, I survey applications of automata in XML processing, with an emphasis on those directions of work that have so far had the greatest practical impact. The talk will consist of three parts. In the first, I will discuss XML validation. The standard schema formalisms for XML, Document Type Definitions and XML Schema, are regular tree grammars at their core. These official standards of the World Wide Web Consortium are well-founded in automata theory and formal language theory, and are designed to incorporate special restrictions to facilitate the creation of automata for document validation. The second part will cover XML stream processing techniques and XML publish-subscribe systems, an area in which a number of exciting automata-based systems have been built. The third and final part covers XML query processing using automata, and applications in Web information extraction.
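To make the validation idea concrete, here is a minimal sketch (not taken from the talk) of checking a document against a DTD-like schema. Each element name maps to a regular expression over the sequence of its children's names, which is the sense in which such a schema acts as a regular tree grammar. The `SCHEMA` contents and the names `is_valid` and `validate` are hypothetical, and Python's backtracking `re` engine merely stands in for the finite automaton a real validator would construct.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical DTD-like schema: each element name maps to a regular
# expression over the sequence of its children's names. Every name is
# followed by a space so that adjacent names cannot run together.
SCHEMA = {
    "library": r"(book )*",
    "book": r"title (author )+(year )?",
    "title": r"",
    "author": r"",
    "year": r"",
}

def is_valid(node, schema):
    """Check one node's children against its content model, then recurse."""
    model = schema.get(node.tag)
    if model is None:
        return False  # element is not declared in the schema
    # The "word" read by the content model: child names in document order.
    child_word = "".join(child.tag + " " for child in node)
    if re.fullmatch(model, child_word) is None:
        return False
    return all(is_valid(child, schema) for child in node)

def validate(xml_text, root_name="library"):
    """Parse the text and validate the whole tree from the root down."""
    root = ET.fromstring(xml_text)
    return root.tag == root_name and is_valid(root, SCHEMA)
```

Calling `validate("<library><book><title/><author/></book></library>")` accepts the document, while swapping the order of `title` and `author` makes the `book` content model reject it. The sketch ignores attributes and text content and checks element structure only.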
© 2009 Springer-Verlag Berlin Heidelberg
Koch, C. (2009). Applications of Automata in XML Processing. In: Maneth, S. (eds) Implementation and Application of Automata. CIAA 2009. Lecture Notes in Computer Science, vol 5642. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02979-0_2
DOI: https://doi.org/10.1007/978-3-642-02979-0_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02978-3
Online ISBN: 978-3-642-02979-0
eBook Packages: Computer Science, Computer Science (R0)