XQuery Processing with Relevance Ranking

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3186)

Abstract

We present a coherent framework for XQuery processing that incorporates IR-style approximate matching and allows results to be ordered by their relevance score. Our relevance ranking algorithm is based on both stem matching and term proximity. Our XQuery processor is stream-based and consists of iterators connected into pipelines. In our framework, all values produced by XQuery expressions are assigned scores, and these scores propagate and are combined as they are piped through the iterators. The most important feature of our evaluation engine is its use of structural and content-based inverse indexes that deliver data in document order, which makes it possible to evaluate path expressions and search predicates with efficient merge joins. We present the rules for translating a large part of XQuery into iterator pipelines. Our modular approach to building pipelines scales to queries of any complexity, because pipes can be connected in the same way that complex queries are formed from simpler ones.
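To make the pipeline idea concrete, the following is a minimal sketch rather than the paper's implementation. It assumes each index delivers a stream of (position, score) pairs sorted by document-order position, with each position appearing at most once per stream, and it assumes that scores are combined by multiplication. The names Scored, merge_join, and rank_by_score are hypothetical and do not come from the paper.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical representation: (position in document order, relevance score).
Scored = Tuple[int, float]


def merge_join(left: Iterable[Scored], right: Iterable[Scored]) -> Iterator[Scored]:
    """Intersect two streams sorted by position in a single forward pass.

    Assumption: matching entries combine their scores by multiplication.
    The combination rule used in the paper may differ.
    """
    left_iter, right_iter = iter(left), iter(right)
    l, r = next(left_iter, None), next(right_iter, None)
    while l is not None and r is not None:
        if l[0] < r[0]:
            l = next(left_iter, None)   # left is behind, advance it
        elif l[0] > r[0]:
            r = next(right_iter, None)  # right is behind, advance it
        else:
            yield (l[0], l[1] * r[1])   # same node: emit the combined score
            l, r = next(left_iter, None), next(right_iter, None)


def rank_by_score(stream: Iterable[Scored]) -> List[Scored]:
    """Materialize a stream and order it by descending score."""
    return sorted(stream, key=lambda pair: pair[1], reverse=True)


# Two toy index streams, already in document order.
structural = [(1, 0.5), (3, 1.0), (7, 0.5)]
content = [(3, 0.5), (5, 0.75), (7, 0.25)]

# Pipes compose like the queries they evaluate: the output of one join
# could feed another join before the final ranking step.
results = rank_by_score(merge_join(structural, content))
print(results)
```

Because both inputs arrive in document order, the join never revisits an entry, which is the property that makes merge joins attractive here. The ranking step is the only place in this sketch where a stream is fully materialized.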

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fegaras, L. (2004). XQuery Processing with Relevance Ranking. In: Bellahsène, Z., Milo, T., Rys, M., Suciu, D., Unland, R. (eds) Database and XML Technologies. XSym 2004. Lecture Notes in Computer Science, vol 3186. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30081-6_5

  • DOI: https://doi.org/10.1007/978-3-540-30081-6_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22969-8

  • Online ISBN: 978-3-540-30081-6

  • eBook Packages: Springer Book Archive
