XML Schema Containment Checking Based on Semi-implicit Techniques

  • Akihiko Tozawa
  • Masami Hagiya
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2759)


XML schemas are computer languages that define grammars for XML (Extensible Markup Language) documents. Containment checking for XML schemas has many applications and is therefore important. Since XML schemas are related to the class of regular tree languages, their containment checking reduces to the language containment problem for non-deterministic tree automata (NTAs). However, an NTA for a practical XML schema has 10²–10³ states, for which the textbook algorithm based on naive determinization is expensive. In this paper we therefore consider techniques based on BDDs (binary decision diagrams). We use a semi-implicit encoding, which encodes a set of subsets of states as a BDD rather than encoding a set of states as one. Experiments on several real-world XML schemas show that our containment checker can answer problems that cannot be solved by previously known algorithms.
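For orientation, the textbook baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of containment checking by explicit, on-the-fly determinization for bottom-up NTAs over binary trees; it is not the paper's BDD-based method. The dictionary layout (`leaf`, `delta`, `final`) and the function name `contained` are assumptions made for this example only.

```python
from itertools import product

def contained(a, b):
    """Return True if L(a) is a subset of L(b) (hypothetical helper).

    Each automaton is a dict (layout assumed for this sketch):
      "leaf":  {symbol: set of states}            states reachable at a leaf
      "delta": {(symbol, q1, q2): set of states}  binary-node transitions
      "final": set of accepting states
    """
    # A pair (p, S) means: some tree t exists such that `a` can reach
    # state p on t, and S is exactly the set of states `b` can reach on t.
    reached = set()
    for sym, ps in a["leaf"].items():
        S = frozenset(b["leaf"].get(sym, ()))
        for p in ps:
            reached.add((p, S))

    # Saturate: combine known pairs under every binary transition of `a`,
    # determinizing `b` explicitly (subset construction) as we go.
    changed = True
    while changed:
        changed = False
        for (p1, S1), (p2, S2) in product(list(reached), repeat=2):
            for (sym, x, y), ps in a["delta"].items():
                if (x, y) != (p1, p2):
                    continue
                S = frozenset(
                    q
                    for s1 in S1
                    for s2 in S2
                    for q in b["delta"].get((sym, s1, s2), ())
                )
                for p in ps:
                    if (p, S) not in reached:
                        reached.add((p, S))
                        changed = True

    # Containment fails iff `a` accepts some tree on which `b` reaches
    # no accepting state.
    return all(S & b["final"] for p, S in reached if p in a["final"])


# Example: `only_a` accepts trees whose leaves are all "a";
# `any_leaf` accepts every tree with leaves "a" or "b".
only_a = {"leaf": {"a": {"p"}},
          "delta": {("f", "p", "p"): {"p"}},
          "final": {"p"}}
any_leaf = {"leaf": {"a": {"q"}, "b": {"q"}},
            "delta": {("f", "q", "q"): {"q"}},
            "final": {"q"}}
```

The number of subsets `S` this procedure may enumerate is exponential in the number of states of `b`, which is why explicit determinization becomes expensive at the automaton sizes quoted above.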


Boolean Function, Binary Tree, Regular Expression, Binary Decision Diagram, Tree Automaton
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Akihiko Tozawa (1)
  • Masami Hagiya (2)
  1. IBM Research, Tokyo Research Laboratory, IBM Japan Ltd., Japan
  2. Graduate School of Information Science and Technology, University of Tokyo, Japan
