Abstract
A new method for indexing XML documents is proposed that supports twig queries and queries with wildcards. A single-pass index construction algorithm is also given. Under our Line Model, an XML document is treated as a line and each element of the document as a segment of that line, so querying an XML document amounts to identifying the corresponding segments. Using a range-based dynamic tree labeling scheme, each segment of the line is assigned a range. We put all the paths of the XML document into a trie and organize the range sets in B+-trees, grouped by the nodes of the trie. Three operations are defined that allow the range sets on the B+-trees corresponding to different trie nodes to be combined with each other. The worst-case time complexity of our algorithm for these operations is O(m+n). The final results of twig queries can be obtained directly through these operations at a speed similar to that of a simple path query. Through extensive experiments, we compare our method with other popular techniques. In particular, we show that the processing cost and disk I/O of our index method are linearly proportional to the complexity of the query and the size of the query results. Experimental results demonstrate the substantial performance benefits of our proposed techniques.
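The idea of treating elements as segments of a line can be illustrated with a minimal sketch. The abstract does not spell out the labeling scheme or the three operations, so everything below is an assumption made for illustration: a static depth-first counter stands in for the dynamic labeling scheme, a plain dictionary keyed by root-to-node path stands in for the trie and its B+-trees, and the single operation shown (a hypothetical `containing` function) is just one plausible containment test between two range sets. Because all elements sharing a path sit at the same depth, their ranges are disjoint and already sorted, which is what lets a two-pointer merge run in O(m+n) over lists of length m and n.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def label_ranges(xml_text):
    """Map each root-to-node tag path to a start-sorted list of (start, end) ranges.

    Assumption: a simple DFS counter, not the paper's dynamic labeling scheme.
    """
    index = defaultdict(list)
    counter = 0

    def visit(elem, path):
        nonlocal counter
        path = path + (elem.tag,)
        start = counter          # position where the segment opens
        counter += 1
        for child in elem:
            visit(child, path)
        end = counter            # position where the segment closes
        counter += 1
        index[path].append((start, end))

    visit(ET.fromstring(xml_text), ())
    return index


def containing(ancestors, descendants):
    """Return the ranges in `ancestors` that enclose at least one range in `descendants`.

    Both inputs must be sorted by start and internally disjoint.
    Runs in O(m + n) because each step advances one of the two pointers.
    """
    result = []
    i = j = 0
    while i < len(ancestors) and j < len(descendants):
        a_start, a_end = ancestors[i]
        d_start, d_end = descendants[j]
        if d_start < a_start:
            j += 1                        # descendant lies before this ancestor
        elif d_end < a_end:
            result.append(ancestors[i])   # descendant lies inside this ancestor
            i += 1
        else:
            i += 1                        # descendant lies after this ancestor
    return result


doc = ("<lib><book><title/><author/></book>"
       "<book><title/></book><book><author/></book></lib>")
index = label_ranges(doc)
books = index[("lib", "book")]
titles = index[("lib", "book", "title")]
authors = index[("lib", "book", "author")]

# Twig query book[title][author]: apply the containment test once per branch.
twig = containing(containing(books, titles), authors)
print(twig)  # [(1, 6)]
```

In this sketch a twig query is answered purely by combining range lists, without revisiting the document, which mirrors the claim that twig results can be obtained directly from operations on the range sets.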
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xuefeng, H., De, X. (2005). LMIX: A Dynamic XML Index Method Using Line Model. In: Zhang, Y., Tanaka, K., Yu, J.X., Wang, S., Li, M. (eds) Web Technologies Research and Development - APWeb 2005. APWeb 2005. Lecture Notes in Computer Science, vol 3399. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31849-1_26
DOI: https://doi.org/10.1007/978-3-540-31849-1_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25207-8
Online ISBN: 978-3-540-31849-1
eBook Packages: Computer Science (R0)