Incremental Validation of XML Documents

  • Yannis Papakonstantinou
  • Victor Vianu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2572)


We investigate the incremental validation of XML documents with respect to DTDs and XML Schemas, under updates consisting of element tag renamings, insertions, and deletions. DTDs are modeled as extended context-free grammars, and XML Schemas are abstracted as “specialized DTDs”, which allow element types to be decoupled from element tags. For DTDs, we exhibit an O(m log n) incremental validation algorithm using an auxiliary structure of size O(n), where n is the size of the document and m is the number of updates. For specialized DTDs, we provide an O(m log² n) incremental algorithm, again using an auxiliary structure of size O(n). This is a significant improvement over brute-force re-validation from scratch.
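To make the logarithmic per-update cost concrete, the following is a minimal sketch of one standard technique for re-checking a single sequence of child tags against a content model after a tag renaming. It is an illustration under stated assumptions, not the construction from the paper: the content model `a b* c`, its hand-written automaton `DFA`, and the names `ChildStringValidator`, `rename`, and `is_valid` are all hypothetical. The sketch stores, at each node of a balanced binary tree over the child sequence, the state-to-state map induced by that node's substring, so a renaming only recomputes the maps on one leaf-to-root path. Insertions and deletions are not handled here; they would need a rebalancing tree instead of a fixed array.

```python
from typing import Dict, List, Tuple

# Hypothetical DFA for the content model (a b* c); state 2 is accepting.
DFA: Dict[Tuple[int, str], int] = {(0, "a"): 1, (1, "b"): 1, (1, "c"): 2}
STATES = [0, 1, 2]
START, ACCEPTING = 0, {2}
DEAD = -1  # sink used for undefined transitions

IDENTITY: Dict[int, int] = {q: q for q in STATES}


def leaf(label: str) -> Dict[int, int]:
    """State-to-state map induced by reading a single tag."""
    return {q: DFA.get((q, label), DEAD) for q in STATES}


def compose(f: Dict[int, int], g: Dict[int, int]) -> Dict[int, int]:
    """Map induced by reading f's substring, then g's substring."""
    return {q: (g[f[q]] if f[q] != DEAD else DEAD) for q in STATES}


class ChildStringValidator:
    """Balanced tree of transition maps over one sequence of child tags."""

    def __init__(self, labels: List[str]) -> None:
        self.size = 1
        while self.size < len(labels):
            self.size *= 2
        # Unused leaves hold the identity map, so padding has no effect.
        self.tree = [IDENTITY] * (2 * self.size)
        for i, label in enumerate(labels):
            self.tree[self.size + i] = leaf(label)
        for i in range(self.size - 1, 0, -1):
            self.tree[i] = compose(self.tree[2 * i], self.tree[2 * i + 1])

    def rename(self, pos: int, label: str) -> None:
        """Rename the tag at position pos; touches O(log n) tree nodes."""
        i = self.size + pos
        self.tree[i] = leaf(label)
        i //= 2
        while i >= 1:
            self.tree[i] = compose(self.tree[2 * i], self.tree[2 * i + 1])
            i //= 2

    def is_valid(self) -> bool:
        """The root map tells where the whole sequence takes the start state."""
        return self.tree[1][START] in ACCEPTING
```

For example, a validator built over the tags `a b b c` reports the sequence as valid, and renaming the second tag to `c` makes `is_valid()` return false without rescanning the whole sequence. Each update costs O(|Q| log n) for an automaton with |Q| states, compared with a full pass over all n tags for re-validation from scratch.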


Keywords (added by machine, not by the authors): Binary Tree, Regular Expression, Regular Language, Parse Tree, Incremental Validation





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Yannis Papakonstantinou (1)
  • Victor Vianu (1)
  1. Computer Science and Engineering, University of California at San Diego, USA
