TwigTable: Using Semantics in XML Twig Pattern Query Processing

  • Huayu Wu
  • Tok Wang Ling
  • Bo Chen
  • Liang Xu
Chapter
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6720)

Abstract

In this paper, we demonstrate how semantic information in XML data, such as values, properties, object classes and relationships between object classes, impacts XML query processing. We show that ignoring semantics causes problems in value management and content search in existing approaches. Motivated by these problems, we propose a semantic approach for XML twig pattern query processing. In particular, we design the TwigTable algorithm to incorporate property and value information into query processing. This information can be correctly discovered in any XML data. In addition, we propose three object-based optimization techniques for TwigTable. If more semantics of object classes are known in an XML document, we can process queries more efficiently with these semantic optimizations. Finally, we show the benefits of our approach through a comprehensive experimental study.
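
For readers unfamiliar with the term, the following minimal sketch illustrates what a twig pattern query with a value predicate looks like. It is not the TwigTable algorithm: it is a naive tree traversal using Python's standard library, and the sample document, the query book[author='Lee'][price]/title, and the function name match_twig are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (not taken from the paper).
DOC = """
<bookstore>
  <book><title>XML Basics</title><author>Lee</author><price>30</price></book>
  <book><title>Query Processing</title><author>Chen</author><price>45</price></book>
  <book><title>Data Models</title><author>Lee</author></book>
</bookstore>
"""


def match_twig(root):
    """Naively match the twig pattern book[author='Lee'][price]/title.

    'book' is the branching query node, 'author' carries a value
    predicate, 'price' is a purely structural branch, and 'title'
    is the output node.
    """
    titles = []
    for book in root.iter("book"):
        # Value predicate: some author child has the text 'Lee'.
        has_author = any(a.text == "Lee" for a in book.findall("author"))
        # Structural predicate: a price child exists.
        has_price = book.find("price") is not None
        if has_author and has_price:
            titles.extend(t.text for t in book.findall("title"))
    return titles


root = ET.fromstring(DOC)
print(match_twig(root))
```

The query mixes a structural condition (a book must have a price child) with a content condition (the author value must equal 'Lee'); the abstract's point is that how such value conditions are handled affects query processing efficiency.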

Keywords

Query Processing; Relational Table; Structural Search; Query Node; Inverted List
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Huayu Wu (1)
  • Tok Wang Ling (1)
  • Bo Chen (1)
  • Liang Xu (1)
  1. School of Computing, National University of Singapore, Singapore
