Prefix Path Streaming: A New Clustering Method for Optimal Holistic XML Twig Pattern Matching

  • Conference paper
Database and Expert Systems Applications (DEXA 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3180)

Abstract

Searching for all occurrences of a twig pattern in an XML document is an important operation in XML query processing. Recently a class of holistic twig pattern matching algorithms has been proposed. Compared with prior approaches, the holistic method avoids generating large intermediate results that do not contribute to the final answer. The method is CPU and I/O optimal when twig patterns have only ancestor-descendant relationships. The holistic twig pattern matching method proposed earlier [1] operates on element streams that cluster all XML elements with the same tag name together. In this paper we introduce a clustering method called Prefix Path Streaming (PPS) and new holistic twig pattern matching algorithms based on PPS. PPS clusters the elements of an XML document according to the paths from the root to the elements. This clustering approach avoids unnecessary scanning of irrelevant portions of XML documents. More importantly, we develop optimal algorithms based on PPS that can process a large class of twig patterns consisting of both ancestor-descendant and parent-child relationships.
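
To make the contrast between the two clustering schemes concrete, the following is a minimal sketch, not the authors' implementation. It assumes elements are identified only by their document-order position (the paper's actual element encoding is not given in the abstract), and the names `build_streams`, `useful_streams`, and `DOC` are hypothetical. It builds one stream per tag name and one stream per root-to-element path, then shows how a path-based stream can be skipped when its prefix path cannot satisfy an ancestor-descendant condition.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_streams(xml_text):
    """Return (tag_streams, path_streams).

    Each stream is a list of element positions in document order.
    tag_streams groups elements by tag name; path_streams groups them
    by the tuple of tags on the path from the root to the element.
    """
    root = ET.fromstring(xml_text)
    tag_streams = defaultdict(list)
    path_streams = defaultdict(list)
    counter = 0  # document-order position (an assumed, simplified encoding)

    def visit(elem, prefix):
        nonlocal counter
        counter += 1
        path = prefix + (elem.tag,)
        tag_streams[elem.tag].append(counter)
        path_streams[path].append(counter)
        for child in elem:
            visit(child, path)

    visit(root, ())
    return dict(tag_streams), dict(path_streams)


def useful_streams(path_streams, ancestor, descendant):
    """Keep only streams whose last tag is `descendant` and whose
    prefix path contains `ancestor` somewhere above it."""
    return {
        path: stream
        for path, stream in path_streams.items()
        if path[-1] == descendant and ancestor in path[:-1]
    }


# Hypothetical sample document used only for illustration.
DOC = (
    "<bib>"
    "<book><title/><author><name/></author></book>"
    "<article><title/></article>"
    "<book><title/></book>"
    "</bib>"
)

tag_streams, path_streams = build_streams(DOC)
book_titles = useful_streams(path_streams, "book", "title")
```

In this sketch the tag stream for `title` holds all three `title` elements, while the path-based clustering separates the ones under `bib/book` from the one under `bib/article`. For a pattern that asks for `title` elements below `book`, only the former stream needs to be read, which is the kind of saving the abstract attributes to PPS.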

References

  1. Bruno, N., Srivastava, D., Koudas, N.: Holistic twig joins: Optimal xml pattern matching. In: ICDE Conference (2002)

  2. Chen, T., Ling, T.W., Chan, C.Y.: Prefix path streaming: A new clustering method for optimal holistic xml twig pattern matching. Technical report, National University of Singapore, http://www.comp.nus.edu.sg/~chent/dexapps.pdf

  3. Choi, B., Mahoui, M., Wood, D.: On the optimality of the holistic twig join algorithms. In: Mařík, V., Štěpánková, O., Retschitzegger, W. (eds.) DEXA 2003. LNCS, vol. 2736, pp. 28–37. Springer, Heidelberg (2003)

  4. Consens, M.P., Milo, T.: Optimizing queries on files. In: Proceedings of ACM SIGMOD (1994)

  5. Al-Khalifa, S., Jagadish, H.V., Koudas, N., Patel, J.M., Srivastava, D., Wu, Y.: Structural joins: A primitive for efficient xml query pattern matching. In: Proceedings of ICDE, pp. 141–152 (2002)

  6. XMARK. Xml-benchmark, http://monetdb.cwi.nl/xml

  7. XPath, http://www.w3.org/TR/xpath

  8. Zhang, C., Naughton, J.F., DeWitt, D.J., Luo, Q., Lohman, G.M.: On supporting containment queries in relational database management systems. In: SIGMOD Conference (2001)

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, T., Ling, T.W., Chan, CY. (2004). Prefix Path Streaming: A New Clustering Method for Optimal Holistic XML Twig Pattern Matching. In: Galindo, F., Takizawa, M., Traunmüller, R. (eds) Database and Expert Systems Applications. DEXA 2004. Lecture Notes in Computer Science, vol 3180. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30075-5_77

  • DOI: https://doi.org/10.1007/978-3-540-30075-5_77

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22936-0

  • Online ISBN: 978-3-540-30075-5

  • eBook Packages: Springer Book Archive
