On the Efficient Processing Regular Path Expressions of an Enormous Volume of XML Data

  • Conference paper
Database and Expert Systems Applications (DEXA 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4653)

Abstract

XML (Extensible Markup Language) has recently been embraced as a new approach to data modeling. More and more information is formatted as semi-structured data, e.g. articles in a digital library or documents on the web. Implementing an efficient system for storing and querying XML documents requires the development of new techniques. An XML document is indexed in order to enable efficient evaluation of user queries. XML query languages such as XPath and XQuery use a form of path expressions for composing more general queries. With current approaches to indexing XML data, regular path expressions are not evaluated efficiently enough: most approaches index single elements, and a query is processed by joining the individual expressions. In this article we introduce an approach that makes it possible to efficiently process a query defined by regular path expressions. The approach indexes all root-to-leaf paths and stores them in multi-dimensional data structures, allowing the indexing and efficient querying of an enormous volume of XML data.
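
To make the idea of indexing root-to-leaf paths concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not specify how paths are encoded or which multi-dimensional structure stores them, so everything here is an assumption made for illustration: the function names (`root_to_leaf_paths`, `paths_to_points`, `compile_path_expr`, `match_paths`), the integer encoding of tag names, the zero padding, and the restriction of path expressions to `/`, `//` and `*`. A linear scan stands in for the lookup that a real multi-dimensional index would perform.

```python
import re
import xml.etree.ElementTree as ET


def root_to_leaf_paths(xml_text):
    """Return every root-to-leaf path as a tuple of element tag names."""
    paths = []

    def walk(node, prefix):
        path = prefix + (node.tag,)
        children = list(node)
        if not children:          # a leaf element closes one path
            paths.append(path)
        for child in children:
            walk(child, path)

    walk(ET.fromstring(xml_text), ())
    return paths


def paths_to_points(paths, pad=0):
    """Encode tags as integers and pad paths to one common length,
    so that each path becomes a point in a fixed-dimensional space.
    (Hypothetical encoding, chosen only for this sketch.)"""
    ids = {}
    for path in paths:
        for tag in path:
            ids.setdefault(tag, len(ids) + 1)
    dim = max(len(p) for p in paths)
    points = [tuple(ids[t] for t in p) + (pad,) * (dim - len(p)) for p in paths]
    return ids, points


def compile_path_expr(expr):
    """Translate a simple path expression ('/', '//', '*') into a regex."""
    regex = ""
    for token in re.findall(r"//|/|\*|[^/*]+", expr):
        if token == "//":
            regex += r"/(?:[^/]+/)*"   # zero or more intermediate steps
        elif token == "/":
            regex += "/"
        elif token == "*":
            regex += r"[^/]+"          # exactly one step with any tag
        else:
            regex += re.escape(token)
    return re.compile(regex)


def match_paths(paths, expr):
    """Linear scan standing in for a multi-dimensional index lookup."""
    pattern = compile_path_expr(expr)
    return [p for p in paths if pattern.fullmatch("/" + "/".join(p))]


DOC = """
<library>
  <book><title>A</title><author><name>B</name></author></book>
  <journal><article><title>C</title></article></journal>
</library>
"""

paths = root_to_leaf_paths(DOC)
ids, points = paths_to_points(paths)
hits = match_paths(paths, "/library//title")
```

In this sketch the whole query is answered against complete paths, with no join between per-element results, which is the contrast the abstract draws with approaches that index single elements.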

This work is partially supported by GACR grant No. 201/03/0912.

Editor information

Roland Wagner, Norman Revell, Günther Pernul

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Krátký, M., Bača, R., Snášel, V. (2007). On the Efficient Processing Regular Path Expressions of an Enormous Volume of XML Data. In: Wagner, R., Revell, N., Pernul, G. (eds) Database and Expert Systems Applications. DEXA 2007. Lecture Notes in Computer Science, vol 4653. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74469-6_1

  • DOI: https://doi.org/10.1007/978-3-540-74469-6_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74467-2

  • Online ISBN: 978-3-540-74469-6

  • eBook Packages: Computer Science, Computer Science (R0)
