
ANDES: efficient evaluation of NOT-twig queries in relational databases

  • Regular Paper
The VLDB Journal

Abstract

Despite a large body of work on XPath query processing in relational environments, the systematic study of queries containing not-predicates has received little attention in the literature. In particular, the XML support in several industrial-strength commercial RDBMSs fails to evaluate such queries efficiently. In this paper, we present an efficient and novel strategy for evaluating NOT-twig queries in a tree-unaware relational environment. NOT-twig queries are XPath queries that use the ancestor–descendant and parent–child axes and contain one or more not-predicates. We propose a novel Dewey-based encoding scheme called ANDES (ANcestor Dewey-based Encoding Scheme), which enables us to efficiently filter out elements satisfying a not-predicate by comparing their ancestor group identifiers. In this approach, a set of elements under the same common ancestor at a specific level in the XML tree is assigned the same ancestor group identifier. Based on this scheme, we propose a novel SQL translation algorithm for NOT-twig query evaluation. Our experiments confirm that the proposed approach, built on top of an off-the-shelf commercial RDBMS, significantly outperforms state-of-the-art relational and native approaches. We also explore the query plans selected by a commercial relational optimizer to evaluate our translated queries under different input cardinalities. This exploration further validates the performance benefits of ANDES.
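The idea of filtering by ancestor group identifiers can be illustrated with a minimal Python sketch. It is not the encoding or the SQL translation from the paper, whose details the abstract does not give. It assumes plain Dewey labels (tuples of sibling positions from the root) and treats the prefix of a label at a given level as that element's ancestor group identifier. The names `ancestor_group_id`, `evaluate_not_descendant`, and `NODES`, and the sample tags, are hypothetical.

```python
from typing import List, Tuple

Dewey = Tuple[int, ...]  # e.g. (1, 3, 1) = first child of third child of the root


def ancestor_group_id(label: Dewey, level: int) -> Dewey:
    """Assumed group identifier: the Dewey prefix naming the ancestor at `level`
    (root = level 1). Elements under the same ancestor share this prefix."""
    return label[:level]


def evaluate_not_descendant(
    nodes: List[Tuple[str, Dewey]], outer: str, inner: str
) -> List[Dewey]:
    """Sketch of //outer[not(.//inner)]: keep each `outer` element whose own
    label is not the group identifier of any deeper `inner` element."""
    outer_labels = [label for tag, label in nodes if tag == outer]
    inner_labels = [label for tag, label in nodes if tag == inner]
    result = []
    for candidate in outer_labels:
        level = len(candidate)
        # Group identifiers of all `inner` elements, taken at the candidate's level.
        blocked = {
            ancestor_group_id(label, level)
            for label in inner_labels
            if len(label) > level
        }
        if candidate not in blocked:
            result.append(candidate)
    return result


# Hypothetical document: three items; the first and third contain a mailbox.
NODES = [
    ("site", (1,)),
    ("item", (1, 1)),
    ("mailbox", (1, 1, 1)),
    ("item", (1, 2)),
    ("name", (1, 2, 1)),
    ("item", (1, 3)),
    ("description", (1, 3, 1)),
    ("mailbox", (1, 3, 1, 1)),
]

print(evaluate_not_descendant(NODES, "item", "mailbox"))  # [(1, 2)]
```

In a relational setting, the same comparison would presumably be expressed as an anti-join on a stored group-identifier column rather than as a Python loop; that mapping is an assumption here, not something the abstract specifies.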

Author information

Corresponding author

Correspondence to Sourav S. Bhowmick.

About this article

Cite this article

Soh, K.H., Truong, B.Q. & Bhowmick, S.S. ANDES: efficient evaluation of NOT-twig queries in relational databases. The VLDB Journal 21, 889–914 (2012). https://doi.org/10.1007/s00778-012-0275-9

  • DOI: https://doi.org/10.1007/s00778-012-0275-9
