On the Optimality of Holistic Algorithms for Twig Queries

  • Byron Choi
  • Malika Mahoui
  • Derick Wood
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2736)


Streaming XML documents has many emerging applications. In this paper, however, we show that the constraints imposed by data streaming are too restrictive for processing twig queries, the core operation of XML query processing. The previously proposed TwigStack algorithm is optimal for processing twig queries with only descendant edges over streams of nodes. Where TwigStack is suboptimal, the cause is the structural recursion appearing in XML documents. We show that, without relaxing the data streaming model, it is not possible to develop an optimal holistic algorithm for twig queries, and that the computation of twig queries is not memory bounded. This motivates us to study two variations of the data streaming model: (1) offline sorting is allowed and the algorithm may select the correct nodes to be streamed, and (2) multiple scans of the data streams are allowed. We show lower bounds for both variations.


Keywords: Data Streaming · Multiple Scan · Node Test · Complete Binary Tree · Descendent Node
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Byron Choi (1)
  • Malika Mahoui (1)
  • Derick Wood (2)
  1. University of Pennsylvania
  2. HKUST
