# XML Navigation and Tarski’s Relation Algebras

## Abstract

Navigation is at the core of most XML processing tasks. The W3C-endorsed navigation language XPath is part of XPointer (for creating links between elements in (different) XML documents), XSLT (for transforming XML documents) and XQuery (for, indeed, querying XML documents). Navigation in an XML document tree is the task of moving from a given node to another node by following a path specified by a certain formula. Hence formulas in navigation languages denote paths or, stated otherwise, binary relations between nodes. Binary relations can be expressed in XPath or with first or second order formulas in two free variables. The problem with all of these formalisms is that none of them is compositional in the sense that each subexpression also specifies a binary relation. This complicates the mathematical study of these languages, because one has to deal with objects of different sorts. Fortunately, there exists an algebraic formalism that was created solely to study binary relations. This formalism goes back to pioneers of logic such as De Morgan, Peirce and Schröder and was formalized by Tarski as *relation algebras* [7]. (Cf. [5] for a monograph on this topic and [8] for a database oriented introduction.) A relation algebra is a Boolean algebra with three additional operations. In its natural representation, each element in the domain of the algebra denotes a binary relation. The three extra operations are a constant denoting the identity relation, a unary conversion operation, and a binary operation denoting the composition of two relations. The elements in the algebra denote *first order definable* relations. Later, Tarski and Ng added the Kleene star as an additional operator, denoting the reflexive transitive closure of a relation [6].
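As an informal illustration of the operations just listed, the following minimal sketch represents binary relations as Python sets of node pairs over a small, hypothetical four-node tree. The names `NODES`, `CHILD`, and all function names are assumptions made for this example only; they do not come from the cited literature.

```python
from itertools import product

# Hypothetical document tree: node 0 is the root, nodes 1 and 2 are
# its children, and node 3 is a child of node 1.
NODES = {0, 1, 2, 3}
CHILD = {(0, 1), (0, 2), (1, 3)}


def identity(nodes):
    """The constant denoting the identity relation."""
    return {(x, x) for x in nodes}


def converse(r):
    """Unary conversion: swap the two components of every pair."""
    return {(y, x) for (x, y) in r}


def compose(r, s):
    """Binary composition: x is related to z if x r y and y s z for some y."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def complement(r, nodes):
    """Boolean complement relative to all pairs of nodes."""
    return set(product(nodes, repeat=2)) - r


def star(r, nodes):
    """Kleene star: reflexive transitive closure, computed as a fixpoint."""
    result = identity(nodes)
    while True:
        bigger = result | compose(result, r)
        if bigger == result:
            return result
        result = bigger


# Every subexpression below is itself a binary relation between nodes.
PARENT = converse(CHILD)
DESCENDANT_OR_SELF = star(CHILD, NODES)
SIBLING = compose(PARENT, CHILD) & complement(identity(NODES), NODES)
```

In this sketch, union and intersection are simply the built-in set operators `|` and `&`, so the Boolean part of the algebra needs no extra code beyond `complement`.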

We will show that the formalism of relation algebras is very well suited for defining navigation paths in XML documents. One of its attractive features is that it does not contain variables, a feature shared by XPath 1.0 and the regular path expressions of [1]. The connection between relation algebras and XPath was first made in [4].

The aim of this talk is to show that relation algebras (possibly expanded with the Kleene star) can serve as a unifying framework in which many of the proposed navigation languages can be embedded. Examples of such embeddings are:

1. Every Core XPath definable path is definable using composition, union and the counterdomain operator $\sim$ with semantics $\mathord{\sim}R = \{(x, x) \mid \neg\exists y : xRy\}$.
2. Every first order definable path is definable by a relation algebraic expression.
3. Every first order definable path is definable by a positive relation algebraic expression which may use the Kleene star.
4. The paths definable by tree walk automata and certain tree walk automata with pebbles can be characterized by natural fragments of relation algebras with the Kleene star.
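The counterdomain operator in the first item can be sketched on a finite relation as follows. This is only an illustration under stated assumptions: the four-node tree, the names `NODES`, `CHILD`, `LEAF`, `HAS_CHILD`, and `CHILD_LEAF`, and the informal reading of the last expression as the XPath path `child::*[not(child::*)]` are choices made for this example, not definitions taken from the talk.

```python
# Hypothetical document tree: node 0 is the root, nodes 1 and 2 are
# its children, and node 3 is a child of node 1.
NODES = {0, 1, 2, 3}
CHILD = {(0, 1), (0, 2), (1, 3)}


def compose(r, s):
    """Composition: x is related to z if x r y and y s z for some y."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def counterdomain(r, nodes):
    """~R = {(x, x) | there is no y with x R y}."""
    domain = {x for (x, _) in r}
    return {(x, x) for x in nodes if x not in domain}


# Nodes without a child, as a subrelation of the identity.
LEAF = counterdomain(CHILD, NODES)

# Applying the operator twice yields a positive test: nodes with a child.
HAS_CHILD = counterdomain(LEAF, NODES)

# Composition with a test acts like an XPath filter; informally this
# corresponds to child::*[not(child::*)], i.e. children that are leaves.
CHILD_LEAF = compose(CHILD, LEAF)
```

Because a counterdomain is always a subset of the identity relation, composing a path with it restricts the path's end points without moving, which is how filter expressions fit into a variable-free setting.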

## Keywords

Binary Relation, Boolean Algebra, Transitive Closure, Order Logic, Relation Algebra

## References

1. Abiteboul, S., Buneman, P., Suciu, D.: Data on the web. Morgan Kaufmann, San Francisco (2000)
2. Engelfriet, J., Hoogeboom, H.: Tree-walking pebble automata. In: Jewels are Forever, Contributions on Theoretical Computer Science in Honor of Arto Salomaa, pp. 72–83. Springer, Heidelberg (1999)
3. Engelfriet, J., Hoogeboom, H.: Automata with nested pebbles capture first-order logic with transitive closure. Technical Report 05-02, LIACS (2005)
4. Hidders, J.: Satisfiability of XPath expressions. In: Lausen, G., Suciu, D. (eds.) DBPL 2003. LNCS, vol. 2921, pp. 21–36. Springer, Heidelberg (2004)
5. Hirsch, R., Hodkinson, I.: Relation algebras by games. Studies in Logic and the Foundations of Mathematics, vol. 147. North-Holland, Amsterdam (2002)
6. Ng, K.: Relation Algebras with Transitive Closure. PhD thesis, University of California, Berkeley (1984)
7. Tarski, A.: On the calculus of relations. Journal of Symbolic Logic 6, 73–89 (1941)
8. Van den Bussche, J.: Applications of Alfred Tarski’s ideas in database theory. In: Fribourg, L. (ed.) CSL 2001 and EACSL 2001. LNCS, vol. 2142, pp. 20–37. Springer, Heidelberg (2001)