Practical Indexing XML Document for Twig Query

Wang, Hongzhi; Wang, Wei; Li, Jianzhong; Lin, Xuemin; Wong, Reymond

doi:10.1007/11596370_19

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3818)

Included in the following conference series:

Annual Asian Computing Science Conference

Abstract

Answering structural queries over XML with an index is an important approach to efficient XML query processing. Among existing structural indexes for XML data, the F&B index is the smallest index that can answer all branching queries. However, an F&B index for less regular XML data often contains a large number of index nodes and hence requires a large amount of main memory. If the F&B index cannot be accommodated in the available memory, its performance degrades significantly. This issue has in practice limited wider application of the F&B index.

In this paper, we propose a disk organization method for the F&B index that shifts part of the leaf nodes of the index to disk and organizes them judiciously there. Our method is based on the observation that the majority of the nodes in an F&B index are often leaf nodes, yet their access frequencies are not high.

We select some leaves to output to disk. Supported by suitable storage structures in main memory and on disk, we design an efficient query processing method. We further optimize the design of the F&B index based on the query workload. Experimental results verify the effectiveness of the proposed approach.
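The idea of moving rarely accessed leaf nodes to disk can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the selection criterion or the storage layout, so the `IndexNode` class, the `access_count` statistic, the JSON spill file, and the function names below are all assumptions made for illustration.

```python
import json
import os
import tempfile
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class IndexNode:
    """Hypothetical index node: a label, child nodes, and an extent of element ids."""
    label: str
    children: List["IndexNode"] = field(default_factory=list)
    extent: Optional[List[int]] = None            # None once spilled to disk
    access_count: int = 0                         # assumed workload statistic
    disk_slot: Optional[Tuple[int, int]] = None   # (offset, length) in the spill file

    def is_leaf(self) -> bool:
        return not self.children


def collect_leaves(root: IndexNode) -> List[IndexNode]:
    """Return all leaf nodes below root, found with an explicit stack."""
    leaves, stack = [], [root]
    while stack:
        node = stack.pop()
        if node.is_leaf():
            leaves.append(node)
        else:
            stack.extend(node.children)
    return leaves


def spill_cold_leaves(root: IndexNode, keep_in_memory: int, path: str) -> int:
    """Write the least-accessed leaves to `path`; keep the hottest ones resident."""
    leaves = sorted(collect_leaves(root), key=lambda n: n.access_count, reverse=True)
    cold = leaves[keep_in_memory:]
    with open(path, "wb") as f:
        for node in cold:
            payload = json.dumps(node.extent).encode("utf-8")
            node.disk_slot = (f.tell(), len(payload))  # remember where the extent lives
            f.write(payload)
            node.extent = None                         # free the in-memory copy
    return len(cold)


def read_extent(node: IndexNode, path: str) -> List[int]:
    """Return a node's extent, fetching it from disk if it was spilled."""
    node.access_count += 1
    if node.extent is not None:
        return node.extent
    offset, length = node.disk_slot
    with open(path, "rb") as f:
        f.seek(offset)
        return json.loads(f.read(length).decode("utf-8"))


# Usage: a tiny made-up index with three leaves, keeping only the hottest in memory.
root = IndexNode("site", children=[
    IndexNode("people", children=[
        IndexNode("name", extent=[3, 9], access_count=50),
        IndexNode("email", extent=[4, 10], access_count=2),
    ]),
    IndexNode("note", extent=[12], access_count=1),
])
spill_path = os.path.join(tempfile.mkdtemp(), "leaves.bin")
spilled = spill_cold_leaves(root, keep_in_memory=1, path=spill_path)
email_extent = read_extent(root.children[0].children[1], spill_path)
```

In this sketch the internal nodes always stay in memory, so structural navigation never touches the disk; only fetching the extent of a spilled leaf costs one seek and one read.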

This work was partially supported by ARC Discovery Grant – DP0346004 and the Defence Pre-Research Project of the 'Tenth Five-Year-Plan' of China, no. 41315.2.3.

Author information

Authors and Affiliations

Harbin Institute of Technology, Harbin, China
Hongzhi Wang & Jianzhong Li
University of New South Wales, Australia
Wei Wang, Xuemin Lin & Reymond Wong
National ICT of Australia, Australia
Wei Wang & Xuemin Lin

Editor information

Editors and Affiliations

INRIA-LIAMA Institute of Automation, Beijing, China
Stéphane Grumbach
Computer Science and Engineering Department, University of California, San Diego, La Jolla, CA 92093-0404, USA
Liying Sui
UC San Diego, USA
Victor Vianu

About this paper

Cite this paper

Wang, H., Wang, W., Li, J., Lin, X., Wong, R. (2005). Practical Indexing XML Document for Twig Query. In: Grumbach, S., Sui, L., Vianu, V. (eds) Advances in Computer Science – ASIAN 2005. Data Management on the Web. ASIAN 2005. Lecture Notes in Computer Science, vol 3818. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11596370_19

DOI: https://doi.org/10.1007/11596370_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30767-9
Online ISBN: 978-3-540-32249-8
eBook Packages: Computer Science, Computer Science (R0)
