Efficient Evaluation of XML Path Queries with Automata

  • Conference paper
Advances in Web-Age Information Management (WAIM 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2762)

Abstract

Path queries are among the most frequently used components of XML query languages. Most proposed methods evaluate path queries in instance space, that is, directly against the XML instances, for example by XML tree traversal or containment joins. Automata match (AM), a query method based on automata techniques, instead evaluates path expression queries in schema space, which allows efficient computation of complex queries over large amounts of data. This paper shows how to construct query automata that can compute all regular expression queries, including those with wildcards. Furthermore, a data structure named schema automata is proposed to evaluate containment queries, which are very difficult to handle from the conventional automata point of view. To improve the efficiency of schema automata, methods to reduce and persist them are proposed. Finally, a performance study of the proposed methods is given.
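To make the idea of evaluating a path query in schema space concrete, the following is a minimal sketch and not the construction used in the paper. It assumes the schema is available as a set of distinct root-to-node label paths, and that a query uses only child steps (`/`), the single-label wildcard `*`, and the descendant step `//`. The function names (`parse_query`, `matches`, `evaluate`) and the sample labels are hypothetical and chosen for illustration only. The query is turned into a small nondeterministic automaton whose states are positions in the step list, and that automaton is run over each schema path rather than over every node of the XML instance.

```python
def parse_query(query):
    """Split a query such as '/a//b/*' into steps; '//' is kept as its own token."""
    steps = []
    i = 0
    while i < len(query):
        if query.startswith("//", i):
            steps.append("//")          # descendant step: skips any number of labels
            i += 2
        elif query[i] == "/":
            i += 1                      # plain child separator
        else:
            j = query.find("/", i)
            j = len(query) if j == -1 else j
            steps.append(query[i:j])    # a label or the wildcard '*'
            i = j
    return steps


def _closure(states, steps):
    """Add states reachable without reading a label ('//' may match zero labels)."""
    seen = set()
    stack = list(states)
    while stack:
        s = stack.pop()
        if s in seen:
            continue
        seen.add(s)
        if s < len(steps) and steps[s] == "//":
            stack.append(s + 1)
    return seen


def matches(steps, path):
    """Simulate the query automaton on one schema path (a tuple of labels)."""
    current = _closure({0}, steps)
    for label in path:
        nxt = set()
        for s in current:
            if s == len(steps):
                continue                # accepting state has no outgoing moves
            if steps[s] == "//":
                nxt.add(s)              # '//' consumes the label and stays put
            elif steps[s] in ("*", label):
                nxt.add(s + 1)          # '*' or an equal label advances one step
        current = _closure(nxt, steps)
    return len(steps) in current


def evaluate(query, schema_paths):
    """Return the schema paths selected by the query, in input order."""
    steps = parse_query(query)
    return [p for p in schema_paths if matches(steps, p)]


# Hypothetical schema paths, for illustration only.
SCHEMA = [
    ("site",),
    ("site", "regions"),
    ("site", "regions", "item"),
    ("site", "regions", "item", "name"),
    ("site", "people", "person", "name"),
]

print(evaluate("/site//item/*", SCHEMA))
```

In this sketch the cost of matching depends on the number of distinct schema paths, not on the number of element instances, which is the intuition behind working in schema space. Mapping the selected schema paths back to instance nodes, and handling containment queries, are outside what this example covers.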

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sun, B., Lv, J., Wang, G., Yu, G., Zhou, B. (2003). Efficient Evaluation of XML Path Queries with Automata. In: Dong, G., Tang, C., Wang, W. (eds) Advances in Web-Age Information Management. WAIM 2003. Lecture Notes in Computer Science, vol 2762. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45160-0_12

  • DOI: https://doi.org/10.1007/978-3-540-45160-0_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40715-7

  • Online ISBN: 978-3-540-45160-0

  • eBook Packages: Springer Book Archive
