LMIX: A Dynamic XML Index Method Using Line Model

  • Conference paper
Web Technologies Research and Development - APWeb 2005 (APWeb 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3399)

Abstract

A new method of indexing XML documents is proposed that supports twig queries and queries with wildcards. A once-over index construction algorithm is also given. Under the Line Model we design, an XML document is treated as a line and each element of the document as a segment of that line. Querying an XML document then amounts to identifying the corresponding segments. Using a range-based dynamic tree labeling scheme, each segment of the line is assigned a range. We put all the paths of the XML document into a trie and organize the range sets in B+-trees, grouped by the nodes of the trie. Three operations are defined that allow the range sets on the B+-trees of different trie nodes to be combined with one another. The worst-case time complexity of the algorithm we designed for these operations is O(m+n). The final results of twig queries can be obtained directly through these operations, at a speed similar to that of a simple path query. Through extensive experiments, we compare our method with other popular techniques. In particular, we show that the processing cost and disk I/O of our index method are linearly proportional to the complexity of the query and the size of the query results. Experimental results demonstrate the substantial performance benefits of our proposed techniques.
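
The idea of range-labeled segments grouped by path can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several simplifications are assumptions made here: labels are static integers from one depth-first pass rather than a dynamic labeling scheme, a dictionary keyed by root-to-node path stands in for the trie, sorted lists stand in for the B+-trees, and the two merge functions `containing` and `contained_in` are hypothetical examples of linear-time range-set operations, not the three operations defined in the paper.

```python
import xml.etree.ElementTree as ET


def label_ranges(xml_text):
    """Give every element a (start, end) range in one depth-first pass.

    Returns a dict mapping each root-to-node path (a tuple of tags) to the
    list of ranges of the elements on that path. The dict is a stand-in
    for the trie; the lists are a stand-in for the B+-trees.
    """
    index = {}
    counter = 0

    def visit(elem, path):
        nonlocal counter
        path = path + (elem.tag,)
        start = counter
        counter += 1
        for child in elem:
            visit(child, path)
        end = counter
        counter += 1
        # Elements sharing a path have the same depth, so their ranges are
        # disjoint and appending on exit keeps each list in document order.
        index.setdefault(path, []).append((start, end))

    visit(ET.fromstring(xml_text), ())
    return index


def containing(outer, inner):
    """Ranges in `outer` that contain at least one range in `inner`.

    Assumes both lists are sorted and internally disjoint, and that the
    path of `inner` extends the path of `outer`, so overlapping ranges are
    always nested. Each step advances one pointer: O(m + n).
    """
    result, i, j = [], 0, 0
    while i < len(outer) and j < len(inner):
        o_start, o_end = outer[i]
        s, e = inner[j]
        if e < o_start:      # inner segment ends before the outer one begins
            j += 1
        elif s > o_end:      # inner segment begins after the outer one ends
            i += 1
        else:                # nested: outer[i] qualifies
            result.append(outer[i])
            i += 1
    return result


def contained_in(inner, outer):
    """Ranges in `inner` that lie inside some range in `outer` (same assumptions)."""
    result, i, j = [], 0, 0
    while i < len(outer) and j < len(inner):
        o_start, o_end = outer[i]
        s, e = inner[j]
        if e < o_start:
            j += 1
        elif s > o_end:
            i += 1
        else:                # nested: inner[j] qualifies
            result.append(inner[j])
            j += 1
    return result


# Twig query /lib/book[author]/title on a toy document.
DOC = "<lib><book><author/><title/></book><book><title/></book></lib>"
index = label_ranges(DOC)
books = containing(index[("lib", "book")], index[("lib", "book", "author")])
titles = contained_in(index[("lib", "book", "title")], books)
print(books, titles)  # [(1, 6)] [(4, 5)]
```

In this sketch the branching part of the twig (`book[author]`) is resolved by one merge and the result path (`title`) by a second, so the whole query costs a constant number of linear passes over the relevant range lists.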

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xuefeng, H., De, X. (2005). LMIX: A Dynamic XML Index Method Using Line Model. In: Zhang, Y., Tanaka, K., Yu, J.X., Wang, S., Li, M. (eds) Web Technologies Research and Development - APWeb 2005. APWeb 2005. Lecture Notes in Computer Science, vol 3399. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31849-1_26

  • DOI: https://doi.org/10.1007/978-3-540-31849-1_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25207-8

  • Online ISBN: 978-3-540-31849-1

  • eBook Packages: Computer Science, Computer Science (R0)
