Partition Based Path Join Algorithms for XML Data

  • Quanzhong Li
  • Bongki Moon
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2736)


Path expressions are an important component of queries over XML data. The extended preorder numbering scheme makes it possible to quickly determine the ancestor-descendant relationship between elements in the hierarchy of XML data. Using the numbering scheme, a path expression can be evaluated by join operations, avoiding the potentially high cost of tree traversals. In this paper, we first formulate XML path queries as range-point join queries. We then discuss partition based algorithms that use the range containment property to process range-point join queries efficiently. Under the partition based framework, we propose three algorithms, namely Descendant partition join, Segment-tree partition join, and Ancestor Link partition join, from which a query optimizer can choose according to the characteristics of the input data. The experimental results show that the partition based algorithms make better use of buffer memory than sort-merge algorithms, and that the proposed Ancestor Link join algorithm yields the best performance by using small in-memory data structures and by taking advantage of unevenly sized inputs.
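
As a rough illustration of the range-point formulation, the sketch below assumes that each element carries an (order, size) pair, in the style of extended preorder numbering, and that x is an ancestor of y when order(x) < order(y) <= order(x) + size(x). The abstract does not spell out this encoding, so the pair layout and the names `is_ancestor` and `range_point_join` are hypothetical. The join shown is a plain sorted lookup, not one of the three partition based algorithms proposed in the paper.

```python
from bisect import bisect_right
from typing import List, Tuple

# Assumed encoding: (order, size), where size bounds the order numbers
# available to the element's descendants.
Element = Tuple[int, int]


def is_ancestor(a: Element, d: Element) -> bool:
    """True if d's order falls in the half-open range (a.order, a.order + a.size]."""
    a_order, a_size = a
    d_order, _ = d
    return a_order < d_order <= a_order + a_size


def range_point_join(
    ancestors: List[Element], descendants: List[Element]
) -> List[Tuple[Element, Element]]:
    """Pair each ancestor (a range) with every descendant (a point) it contains."""
    points = sorted(descendants)          # sort descendant points by order
    orders = [p[0] for p in points]
    result = []
    for a in ancestors:
        lo = bisect_right(orders, a[0])           # first point with order > a.order
        hi = bisect_right(orders, a[0] + a[1])    # one past last point <= a.order + a.size
        for d in points[lo:hi]:
            result.append((a, d))
    return result


ancestors = [(1, 100), (10, 20)]
descendants = [(5, 0), (15, 3), (40, 0), (200, 0)]
pairs = range_point_join(ancestors, descendants)
```

Each ancestor contributes a range and each descendant a single point, so the path step reduces to finding the points that fall inside each range. Buffer management and skewed input sizes, which the paper's algorithms address, are ignored here.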


Numbering Scheme, Path Expression, Range Tree, Segment Tree, Tree Traversal
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Quanzhong Li (1)
  • Bongki Moon (1)
  1. Department of Computer Science, University of Arizona, Tucson, USA
