Theory of Computing Systems, Volume 62, Issue 4, pp 941–976

Reasoning about integrity constraints for tree-structured data

  • Wojciech Czerwiński
  • Claire David
  • Filip Murlak
  • Paweł Parys
Part of the following topical collections:
  1. Special Issue on Database Theory


We study a class of integrity constraints for tree-structured data modelled as data trees, whose nodes carry a label from a finite alphabet and store a data value from an infinite data domain. The constraints require each tuple of nodes selected by a conjunctive query (using navigational axes and labels) to satisfy a positive combination of equalities and a positive combination of inequalities over the stored data values. Such constraints are instances of the general framework of XML-to-relational constraints proposed recently by Niewerth and Schwentick. They cover some common classes of constraints, including W3C XML Schema key and unique constraints, as well as domain restrictions and denial constraints, but cannot express inclusion constraints, such as reference keys. Our main result is that consistency of such integrity constraints with respect to a given schema (modelled as a tree automaton) is decidable. An easy extension gives decidability for the entailment problem. Equivalently, we show that validity and containment of unions of conjunctive queries using navigational axes, labels, data equalities and inequalities are decidable, as long as none of the conjunctive queries uses both equalities and inequalities; without this restriction, both problems are known to be undecidable. In the context of XML data exchange, our result can be used to establish decidability for a consistency problem for XML schema mappings. All the decision procedures are doubly exponential, with matching lower bounds. The complexity drops to singly exponential when conjunctive queries are replaced by tree patterns and the number of data comparisons is bounded.
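To make the shape of these constraints concrete, the following is a minimal sketch of checking one unique-style constraint on one given data tree. It is an illustration only, not the decision procedure for consistency or entailment studied in the paper. The names `Node`, `select_pairs` and `satisfies_unique`, and the labels `dept` and `emp`, are hypothetical; `select_pairs` stands in for a single fixed conjunctive query that selects pairs of distinct children with one label below a common parent with another label.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """A data-tree node: a label from a finite alphabet plus one data value."""
    label: str
    value: object
    children: List["Node"] = field(default_factory=list)


def descendants_or_self(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants in document order."""
    yield node
    for child in node.children:
        yield from descendants_or_self(child)


def select_pairs(root: Node, scope_label: str,
                 target_label: str) -> Iterator[Tuple[Node, Node]]:
    """Hypothetical stand-in for a conjunctive query q(x, y):
    x and y are distinct target_label children of the same scope_label node."""
    for scope in descendants_or_self(root):
        if scope.label != scope_label:
            continue
        targets = [c for c in scope.children if c.label == target_label]
        yield from combinations(targets, 2)


def satisfies_unique(root: Node, scope_label: str, target_label: str) -> bool:
    """Every selected pair must satisfy the data inequality value(x) != value(y)."""
    return all(x.value != y.value
               for x, y in select_pairs(root, scope_label, target_label))


# Two departments may reuse a value, but a single department may not.
ok = Node("db", 0, [
    Node("dept", 1, [Node("emp", 10), Node("emp", 11)]),
    Node("dept", 2, [Node("emp", 10)]),
])
bad = Node("db", 0, [
    Node("dept", 1, [Node("emp", 10), Node("emp", 10)]),
])
print(satisfies_unique(ok, "dept", "emp"), satisfies_unique(bad, "dept", "emp"))
```

The check above evaluates a constraint on a single tree. The problems addressed in the abstract are harder: consistency asks whether some tree accepted by the schema satisfies all constraints, which cannot be settled by enumerating trees.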


Data trees; Integrity constraints; Unions of conjunctive queries; Schema mappings; Entailment; Containment; Consistency



We thank the anonymous referees of ICDT 2016 and TOCS for their insightful questions.


  1. Arenas, M., Barceló, P., Libkin, L., Murlak, F.: Foundations of data exchange. Cambridge University Press (2014)
  2. Arenas, M., Fan, W., Libkin, L.: On the complexity of verifying consistency of XML specifications. SIAM J. Comput. 38(3), 841–880 (2008)
  3. Arenas, M., Libkin, L.: A normal form for XML documents. ACM Trans. Database Syst. 29, 195–232 (2004)
  4. Arenas, M., Libkin, L.: XML data exchange: Consistency and query answering. J. ACM 55(2) (2008)
  5. Benedikt, M., Fan, W., Geerts, F.: XPath satisfiability in the presence of DTDs. J. ACM 55(2) (2008)
  6. Björklund, H., Martens, W., Schwentick, T.: Conjunctive query containment over trees using schema information. Acta Informatica, 1–40 (2016)
  7. Bojańczyk, M., Murlak, F., Witkowski, A.: Containment of monadic datalog programs via bounded clique-width. In: Proceedings of the ICALP 2015, pp. 427–439 (2015)
  8. Bojańczyk, M., Muscholl, A., Schwentick, T., Segoufin, L.: Two-variable logic on data trees and XML reasoning. J. ACM 56(3), 13:1–13:48 (2009)
  9. Chandra, A.K., Merlin, P.M.: Optimal implementation of conjunctive queries in relational data bases. In: Proceedings of the STOC 1977, pp. 77–90 (1977)
  10. Courcelle, B., Makowsky, J.A., Rotics, U.: Linear time solvable optimization problems on graphs of bounded clique-width. Theory Comput. Syst. 33(2), 125–150 (2000)
  11. Courcelle, B., Olariu, S.: Upper bounds to the clique width of graphs. Discret. Appl. Math. 101(1-3), 77–114 (2000)
  12. David, C., Gheerbrant, A., Libkin, L., Martens, W.: Containment of pattern-based queries over data trees. In: Proceedings of the ICDT 2013, pp. 201–212 (2013)
  13. David, C., Hofman, P., Murlak, F., Pilipczuk, M.: Synthesizing transformations from XML schema mappings. In: Proceedings of the ICDT 2014, pp. 61–71 (2014)
  14. Fagin, R., Kolaitis, P.G., Miller, R.J., Popa, L.: Data exchange: semantics and query answering. Theor. Comput. Sci. 336(1), 89–124 (2005)
  15. Fagin, R., Kolaitis, P.G., Popa, L., Tan, W.C.: Composing schema mappings: Second-order dependencies to the rescue. ACM Trans. Database Syst. 30(4), 994–1055 (2005)
  16. Fagin, R., Vardi, M.Y.: The theory of data dependencies - a survey. In: Mathematics of information processing, volume 34 of proceedings of symposia in applied mathematics, pp. 19–71. American Mathematical Society, Providence, Rhode Island (1986)
  17. Figueira, D.: Alternating register automata on finite words and trees. Logical Methods Comput. Sci. 8(1) (2012)
  18. Gao, S., Sperberg-McQueen, C.M., Thompson, H.S., Mendelsohn, N., Beech, D., Maloney, M.: W3C XML Schema Definition Language (XSD) 1.1, Part 1: Structures. Technical report, World Wide Web Consortium, April (2009)
  19. Gogacz, T., Marcinkowski, J.: All-instances termination of chase is undecidable. In: Proceedings of the ICALP 2014, pp. 293–304 (2014)
  20. Gogacz, T., Marcinkowski, J.: Red spider meets a rainworm: query finite determinacy is undecidable. In: Proceedings of the PODS 2016, pp. 121–134 (2016)
  21. Hartmann, S., Link, S.: More functional dependencies for XML. In: Proceedings of the ADBIS 2003, pp. 355–369 (2003)
  22. Hartmann, S., Link, S., Trinh, T.: Solving the implication problem for XML functional dependencies with properties. In: Proceedings of the WoLLIC 2010, pp. 161–175 (2010)
  23. Jurdzinski, M., Lazic, R.: Alternating automata on data trees and XPath satisfiability. ACM Trans. Comput. Logic 12(3), 19:1–19:21 (2011)
  24. Lenzerini, M.: Data integration: A theoretical perspective. In: Proceedings of the PODS 2002, pp. 233–246 (2002)
  25. Miklau, G., Suciu, D.: Containment and equivalence for a fragment of XPath. J. ACM 51(1), 2–45 (2004)
  26. Neven, F., Schwentick, T.: On the complexity of XPath containment in the presence of disjunction, DTDs, and variables. Log. Meth. Comput. Sci. 2(3) (2006)
  27. Niewerth, M., Schwentick, T.: Reasoning XML constraints based on XML-to-relational mappings. In: Proceedings of the ICDT 2014, pp. 72–83 (2014)
  28. Vardi, M.Y.: Fundamentals of dependency theory. In: Borger, E. (ed.) Trends in theoretical computer science, pp. 171–224. Computer Science Press (1987)

Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. University of Warsaw, Warsaw, Poland
  2. Université Paris-Est Marne-la-Vallée, Champs-sur-Marne, France
