Skip to main content
Log in

Query translation from XPath to SQL in the presence of recursive DTDs

  • Regular Paper
  • Published:
The VLDB Journal Aims and scope Submit manuscript

Abstract

We study the problem of evaluating xpath queries over xml data that is stored in an rdbms via schema-based shredding. The interaction between recursion (descendants-axis) in xpath queries and recursion in dtds makes it challenging to answer xpath queries using rdbms. We present a new approach to translating xpath queries into sql queries based on a notion of extended XP ath expressions and a simple least fixpoint (lfp) operator. Extended xpath expressions are a mild extension of xpath, and the lfp operator takes a single input relation and is already supported by most commercial rdbms. We show that extended xpath expressions are capable of capturing both dtd recursion and xpath queries in a uniform framework. Furthermore, they can be translated into an equivalent sequence of sql queries with the lfp operator. We present algorithms for rewriting xpath queries over a (possibly recursive) dtd into extended xpath expressions and for translating extended xpath expressions to sql queries, as well as optimization techniques. The novelty of our approach consists in its capability to answer a large class of xpath queries by means of only low-end rdbms features already available in most rdbms, as well as its flexibility to accommodate existing relational query optimization techniques. In addition, these translation algorithms provide a solution to query answering for certain (possibly recursive) xml views of xml data. Our experimental results verify the effectiveness of our techniques.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Similar content being viewed by others

References

  1. Afanasiev, L., Grust, T., Marx, M., Rittinger, J., Teubner, J.: An inflationary fixed point in XQuery. In: Proc. of ICDE (2008)

  2. Agrawal, R., Devanbu, P.: Moving selections into linear least fixpoint queries. In: Proc. of ICDE (1988)

  3. Aho, A., Ullman, J.: Universality of data retrieval languages. In: Proc. of POPL (1979)

  4. Amer-Yahia, S., Cho, S., Lakshmanan, L., Srivistava, D.: Minimization of tree pattern queries. In: Proc. of SIGMOD (2001)

  5. Bancilhon, F., Maier, D., Sagiv, Y., Ullman, J.: Magic sets and other strange ways to implement logic programs. In: Proc. of PODS (1986)

  6. Bancilhon, F., Ramakrishnan, R.: An amateur’s introduction to recursive query processing strategies. In: Proc. of SIGMOD (1986)

  7. Barbosa, D., Freire, J., Mendelzon, A.: Designing information-preserving mapping schemes for XML. In: Prof. of VLDB (2005)

  8. Beeri, C., Ramakrishnan, R.: On the power of magic. J. Log. Program 10 (1991)

  9. Benedikt, M., Fan, W., Geerts, F.: XPath satisfiability in the presence of DTDs. J. ACM 55(2) (2008)

  10. BIOML. BIOpolymer Markup Language. http://xml.coverpages.org/BIOML-XML-DTD.txt

  11. Boncz, P.A., Grust, T., van Keulen, M., Manegold, S., Rittinger, J., Teubner, J.: MonetDB/XQuery: a fast XQuery processor powered by a relational engine. In: Proc. of SIGMOD (2004)

  12. Choi, B.: What are real DTDs like. In: Proc. of WebDB (2002)

  13. Choi, B., Cong, G., Fan, W., Viglas, S.: Updating recursive XML views of relations. In: Prof. of ICDE (2007)

  14. Christophides, V., Cluet, S., Moerkotte, G.: Evaluating queries with generalized path expressions. In: Proc. of SIGMOD (1996)

  15. Clark, J., DeRose, S.: XML path language (XPath). W3C Recommendation, Nov 1999

  16. DeHaan, D., Toman, D., Consens, M., Ozsu, T.: Comprehensive XQuery to SQL translation using dynamic interval encoding. In: Proc. of SIGMOD (2003)

  17. Deutsch, A., Tannen, V.: MARS: A system for publishing XML from mixed and redundant storage. In: Proc. of VLDB (2003)

  18. Ehrenfeucht, A., Zeiger, P.: Complexity measures for regular expressions. In: Proc. of STC’74 (1974)

  19. EXSLT.: http://www.exslt.org/dyn/functions/closure/index.html

  20. Fan, W., Bohannon, P.: Information preserving XML schema embedding. TODS 33(1) (2008)

  21. Fan, W., Chan, C.-Y., Garofalakis, M.: Secure XML querying with security views. In: Proc. of SIGMOD (2004)

  22. Fan, W., Geerts, F., Jia, X., Kementsietsidis, A.: Rewriting regular XPath queries on XML views. In: Proc. of ICDE (2007)

  23. Fan, W., Geerts, F., Neven, F.: Expressiveness and complexity of XML publishing transducers. TODS (2008)

  24. Fan, W., Yu, J.X., Lu, H., Lu, J., Rastogi, R.: Query translation from XPath to SQL in the presence of recursive DTDs. In: Proc. of VLDB (2005)

  25. Fernandez, M., Suciu, D.: Optimizing regular path expression using graph schemas. In: Proc. of ICDE (1998)

  26. Fernandez, M.F., Morishima, A., Suciu, D.: Efficient evaluation of XML middleware queries. In: Proc. of SIGMOD (2001)

  27. GedML.: Genealogy Markup Language. http://xml.coverpages.org/gedml-dtd9808.txt

  28. Georgiadis, H., Vassalos, V.: Improving the efficiency of XPath execution on relational systems. In: Proc. of EBDT (2006)

  29. Georgiadis, H., Vassalos, V.: XPath on steroids: exploiting relational engines for XPath performance. In: Proc. of SIGMOD (2007)

  30. Graefe, G.: Query evaluation techniques for large databases. ACM Comput. Surv. 25(2) (1993)

  31. Grust T., Keulen M., van Teubner J.: Accelerating XPath evaluation in any RDBMS. TODS 29, 91–131 (2004)

    Article  Google Scholar 

  32. Halevy, A.Y.: Theory of answering queries using views. SIGMOD Record 29(4) (2001)

  33. IBM.: DB2 XML Extender. http://www-3.ibm.com/software/data/db2/extended/xmlext/index.html

  34. Jain, S., Mahajan, R., Suciu, D.: Translating XSLT programs to efficient SQL querie. In: Proc. of WWW (2002)

  35. Kambayashi, Y.: Processing cyclic queries. In: Query processing in database systems, pp. 63–78. Springer, Heidelberg (1985)

  36. Kha, D.D., Yoshikawa, M., Uemura, S.: An XML indexing structure with relative region coordinate. In: Proc. of ICDE (2001)

  37. Kim, Y.-C., Kim, W., Dale, A.: Cyclic query processing in object-oriented databases. In: Proc. of ICDE (1989)

  38. Kolaitis, P.G.: Schema mappings, data exchange, and metadata management. In: Prof. of PODS (2006)

  39. Krishnamurthy, R., Chakaravarthy, V.T., Kaushik, R., Naughton, J.: Recursive XML schemas, recursive XML queries, and relational storage: XML-to-SQL query translation. In: Proc. of ICDE (2004)

  40. Krishnamurthy, R., Kaushik, R., Naughton, J.: XML-SQL query translation literature: The state of the art and open problems. In: Proc. of Xsym (2003)

  41. Krishnamurthy, R., Kaushik, R., Naughton, J.: Efficient XML-to-SQL query translation: Where to add the intelligence. In: Proc. of VLDB (2004)

  42. Krishnamurthy, R., Kaushik, R., Naughton, J.: XML views as integrity constraints and their use in query translation. In: Proc. of ICDE (2005)

  43. Kunen, I.K., Suciu, D.: A scalable algorithm for query minimization. Technical Report, University of Washington (2004)

  44. Lenzerini, M.: Data integration: A theoretical perspective. In: Proc. of PODS (2002)

  45. Li, Q., Moon, B.: Indexing and querying XML data for regular path expressions. In: Proc. of VLDB (2001)

  46. Likin, L.: Logics for unranked trees: An overview. Log. Meth. Comput. Sci. 2(3) (2006)

  47. Manolescu, I., Florescu, D., Kossmann, D.: Answering XML queries on heterogeneous data sources. In: Proc. of VLDB (2001)

  48. Marx, M.: XPath with conditional axis relations. In: Proc. of EDBT (2004)

  49. Microsoft.: SQLXML and XML Mapping Technologies. http://msdn.microsoft.com/sqlxml/default.asp

  50. Mishra, P., Eich, M.H.: Join processing in relational databases. ACM Comput. Surv. 24(1) (1992)

  51. Nunn, M.: An Overview of SQL Server 2005 for the Database Developer, (2004). http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/sql_ovyukondev.asp

  52. Oracle.: Oracle9i XML Database Developer’s Guide—Oracle XML DB Release 2. http://otn.oracle.com/tech/xmldb/content.html

  53. Papakonstantinou, Y., Vianu, V.: Type inference for views of semistructured data. In: Proc. of PODS (2000)

  54. Roy, P., Seshadri, S., Sudarshan, S., Bhobe, S.: Efficient algorithms for multi query optimization. In: Proc. of SIGMOD (2000)

  55. Saxon.: http://www.saxonica.com/

  56. Shan, M.-C., Neimat, M.-A.: Optimization of relational algebra expressions containing recursion operators. In: Proc. of ACM Annual Computer Science Conference (1999)

  57. Shanmugasundaram, J., Kiernan, J., Shekita, E.J., Fan, C., Funderburk, J.: Querying XML views of relational data. In: Proc. of VLDB (2001)

  58. Shanmugasundaram, J., Shekita, E., Barr, R., Carey, M., Lindsay, B., Pirahesh, H., Reinwald, B.: A general techniques for querying XML documents using a relational database system. SIGMOD Record 30(3) (2001)

  59. Shanmugasundaram, J., Tufte, K., He, G., Zhang, C., DeWitt, D., Naughton, J.: Relational databases for querying XML documents: Limitations and opportunities. In: Proc. of VLDB (1999)

  60. Silberstein, A., He, H., Yi, K., Yang, J.: BOXes: Efficient maintenance of order-based labeling for dynamic XML data. In: Proc. of ICDE (2005)

  61. Tarjan R.E.: Fast algorithms for solving path problems. J. ACM 28(3), 594–614 (1981)

    Article  MATH  MathSciNet  Google Scholar 

  62. Tatarinov, I., Viglas, S., Beyer, K.S., Shanmugasundaram, J., Shekita, E.J., Zhang, C.: Storing and querying ordered XML using a relational database system. In: Proc. of SIGMOD (2002)

  63. Thompson, H. et al.: XML Schema. W3C Working Draft, May 2001. http://www.w3.org/XML/Schema

  64. Zhang, C., Naughton, J., DeWitt, D.J., Luo, Q., Lohman, G.M.: On supporting containment queries in relational database management systems. In: Proc. of SIGMOD’01 (2001)

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Wenfei Fan.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Fan, W., Yu, J.X., Li, J. et al. Query translation from XPath to SQL in the presence of recursive DTDs. The VLDB Journal 18, 857–883 (2009). https://doi.org/10.1007/s00778-008-0131-0

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00778-008-0131-0

Keywords

Navigation