Abstract
With the proliferation of XML data and applications on the Internet, efficient XML query processing techniques are in great demand. Many path queries on XML data have complex structure, with both structural and value constraints. Existing query processing techniques can process only parts of such queries efficiently. This paper presents a query optimization strategy for complex path queries. The strategy combines value indexes, structural indexes, and structural joins based on a labeling scheme to generate effective query plans for complex queries. Experimental results show that the optimization strategy is efficient and effective: the method is suitable for various path queries with value constraints, and it scales to large data sizes with good query performance.
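To make the combination of a value index and a label-based join concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a region labeling scheme in which each element carries a (start, end) interval, a hypothetical in-memory `value_index` keyed by (tag, value), and a hypothetical `tag_index` keyed by tag. The structural index is omitted for brevity. The plan first uses the value index to fetch the selective `price` elements, then runs a stack-based structural join against the `book` elements.

```python
from collections import namedtuple

# Region label (an assumption for this sketch): an element spans
# [start, end] in document order, so an ancestor's interval strictly
# contains every descendant's interval.
Label = namedtuple("Label", ["start", "end"])


def is_ancestor(a, d):
    """Return True if the region of `a` strictly contains the region of `d`."""
    return a.start < d.start and d.end < a.end


def structural_join(ancestors, descendants):
    """Return (ancestor, descendant) pairs.

    Both inputs must be sorted by `start`. The stack holds the chain of
    nested candidate ancestors that are still open at the current position.
    """
    result = []
    stack = []
    i = 0
    for d in descendants:
        # Push every candidate ancestor that starts before d.
        while i < len(ancestors) and ancestors[i].start < d.start:
            # Drop candidates that closed before the new one opens.
            while stack and stack[-1].end < ancestors[i].start:
                stack.pop()
            stack.append(ancestors[i])
            i += 1
        # Drop candidates that closed before d opens.
        while stack and stack[-1].end < d.start:
            stack.pop()
        for a in stack:
            if is_ancestor(a, d):
                result.append((a, d))
    return result


# Hypothetical labels for a toy document:
# <lib> <book><title/><price>10</price></book>
#       <book><title/><price>20</price></book> </lib>
tag_index = {"book": [Label(2, 7), Label(8, 13)]}
value_index = {
    ("price", "10"): [Label(5, 6)],
    ("price", "20"): [Label(11, 12)],
}


def books_with_price(value):
    """Evaluate //book[.//price = value]: value index first, then the join."""
    candidates = value_index.get(("price", value), [])
    pairs = structural_join(tag_index["book"], candidates)
    return sorted({a for a, _ in pairs})


print(books_with_price("10"))
```

Starting from the value index keeps the join input small when the value constraint is selective. An optimizer of the kind the abstract describes would presumably weigh this order against alternatives.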
Supported by the Key Program of the National Natural Science Foundation of China under Grant No. 60533110; the National Grand Fundamental Research 973 Program of China under Grant No. 2006CB303000; and the National Natural Science Foundation of China under Grants No. 60773068 and No. 60773063.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, H., Li, J., Liu, X., Luo, J. (2009). Query Optimization for Complex Path Queries on XML Data. In: Zhou, X., Yokota, H., Deng, K., Liu, Q. (eds) Database Systems for Advanced Applications. DASFAA 2009. Lecture Notes in Computer Science, vol 5463. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00887-0_35
DOI: https://doi.org/10.1007/978-3-642-00887-0_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00886-3
Online ISBN: 978-3-642-00887-0
eBook Packages: Computer Science, Computer Science (R0)