The VLDB Journal, Volume 18, Issue 4, pp 857–883

Query translation from XPath to SQL in the presence of recursive DTDs

  • Wenfei Fan
  • Jeffrey Xu Yu
  • Jianzhong Li
  • Bolin Ding
  • Lu Qin
Regular Paper

Abstract

We study the problem of evaluating XPath queries over XML data that is stored in an RDBMS via schema-based shredding. The interaction between recursion (the descendant axis) in XPath queries and recursion in DTDs makes it challenging to answer XPath queries using an RDBMS. We present a new approach to translating XPath queries into SQL queries based on a notion of extended XPath expressions and a simple least fixpoint (LFP) operator. Extended XPath expressions are a mild extension of XPath, and the LFP operator takes a single input relation and is already supported by most commercial RDBMS. We show that extended XPath expressions are capable of capturing both DTD recursion and XPath queries in a uniform framework. Furthermore, they can be translated into an equivalent sequence of SQL queries with the LFP operator. We present algorithms for rewriting XPath queries over a (possibly recursive) DTD into extended XPath expressions and for translating extended XPath expressions to SQL queries, as well as optimization techniques. The novelty of our approach lies in its capability to answer a large class of XPath queries using only low-end features already available in most RDBMS, as well as in its flexibility to accommodate existing relational query optimization techniques. In addition, these translation algorithms provide a solution to query answering for certain (possibly recursive) XML views of XML data. Our experimental results verify the effectiveness of our techniques.
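
To make the role of a single-input fixpoint operator concrete, here is a minimal sketch of how a descendant step over a recursively nested element type might be answered with a recursive SQL query. It is not the translation algorithm of the paper. The shredded relation `part(parent_id, id)`, the function name `descendants_of`, and the sample data are all hypothetical, and SQLite's `WITH RECURSIVE` stands in for the LFP operator as an assumption.

```python
import sqlite3


def descendants_of(edges, start_id):
    """Return ids reachable from start_id via one or more parent->child edges.

    `edges` is a list of (parent_id, id) pairs: a hypothetical shredding of
    an element type `part` whose DTD allows `part` to contain `part`.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE part (parent_id INTEGER, id INTEGER)")
    conn.executemany("INSERT INTO part VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE reach(id) AS (
            -- base case: direct children of the start node
            SELECT id FROM part WHERE parent_id = ?
            UNION
            -- recursive case: children of nodes already reached
            SELECT p.id FROM part AS p JOIN reach AS r ON p.parent_id = r.id
        )
        SELECT id FROM reach ORDER BY id
        """,
        (start_id,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


# Hypothetical tree: 0 -> {1, 5}, 1 -> {2, 3}, 3 -> {4}
EDGES = [(0, 1), (1, 2), (1, 3), (3, 4), (0, 5)]
```

Calling `descendants_of(EDGES, 1)` corresponds roughly to evaluating `//part` from the node with id 1. The recursive query iterates over one relation until no new rows appear, which is the kind of fixpoint computation the abstract refers to.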

Keywords

XML database, XPath, SQL, Recursive DTD, Query translation



Copyright information

© Springer-Verlag 2009

Authors and Affiliations

  • Wenfei Fan (1, 2)
  • Jeffrey Xu Yu (3)
  • Jianzhong Li (4)
  • Bolin Ding (3)
  • Lu Qin (3)

  1. University of Edinburgh, Edinburgh, UK
  2. Bell Laboratories, Madison, USA
  3. The Chinese University of Hong Kong, Hong Kong, China
  4. Harbin Institute of Technology, Harbin, China
