Relaxed Global Term Weights for XML Element Search

  • Conference paper
Comparative Evaluation of Focused Retrieval (INEX 2010)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6932))

Abstract

XML element search engines return XML elements, which are parts of XML documents, as search results. Existing studies of XML element search borrow information retrieval techniques developed for document search. The global weight of each term can be calculated from the statistics of XML elements that share 1) the same path expression or 2) the same tag. In the first approach, the more complex a path expression is, the fewer XML elements match it, so global term weights may be calculated from the statistics of only a few XML elements; such weights are hardly global. The second approach has the problem that it ignores the document structure of XML elements. To resolve these problems, we propose a method for calculating accurate global weights. In our method, we regard a path expression as an array of tags and relax the restrictions on the order and frequency of tags in a path expression, so that similar path expressions are gathered into the same class. In this way, we aim to reduce the number of classes that contain hardly any elements. Our experimental results on a certain test collection show that our method can integrate path expressions without decreasing search accuracy.
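As an illustration only, the grouping idea described in the abstract can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`path_to_tags`, `class_key`, `global_weights`), the `/`-separated path syntax, the toy element data, and the simple IDF-style weight `log(1 + N / ef)` are all hypothetical choices made for this example; the abstract does not specify the weighting formula.

```python
import math
from collections import defaultdict


def path_to_tags(path):
    """Split a path expression such as '/article/sec/sec/p' into a list of tags."""
    return [tag for tag in path.split("/") if tag]


def class_key(path, relax_order=True, relax_frequency=True):
    """Map a path expression to the key of the class it belongs to.

    relax_frequency: a tag repeated in the path counts only once.
    relax_order: the order of tags in the path is ignored.
    With both flags False, every distinct path is its own class.
    """
    tags = path_to_tags(path)
    if relax_frequency:
        # dict.fromkeys keeps the first occurrence of each tag, in order.
        tags = list(dict.fromkeys(tags))
    if relax_order:
        tags = sorted(tags)
    return tuple(tags)


def global_weights(elements, relax_order=True, relax_frequency=True):
    """Compute an IDF-style weight per (class, term).

    elements: iterable of (path_expression, set_of_terms) pairs.
    The formula log(1 + N / ef) is an assumption for illustration, where
    N is the number of elements in the class and ef is the number of
    those elements that contain the term.
    """
    n_elements = defaultdict(int)
    elem_freq = defaultdict(lambda: defaultdict(int))
    for path, terms in elements:
        key = class_key(path, relax_order, relax_frequency)
        n_elements[key] += 1
        for term in set(terms):
            elem_freq[key][term] += 1
    return {
        key: {term: math.log(1 + n / ef) for term, ef in elem_freq[key].items()}
        for key, n in n_elements.items()
    }


# Toy data (hypothetical): four elements with four distinct path expressions.
elements = [
    ("/article/sec/p", {"xml", "search"}),
    ("/article/sec/sec/p", {"xml"}),
    ("/article/p/sec", {"retrieval"}),
    ("/article/bdy/p", {"search"}),
]

strict = global_weights(elements, relax_order=False, relax_frequency=False)
relaxed = global_weights(elements)
print(len(strict), "classes without relaxation")
print(len(relaxed), "classes with relaxation")
```

With the strict key, each class in this toy data holds a single element, so its statistics are not meaningfully global. With both restrictions relaxed, the three paths built from `article`, `sec`, and `p` fall into one class, and term statistics are pooled over more elements.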

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Keyaki, A., Hatano, K., Miyazaki, J. (2011). Relaxed Global Term Weights for XML Element Search. In: Geva, S., Kamps, J., Schenkel, R., Trotman, A. (eds) Comparative Evaluation of Focused Retrieval. INEX 2010. Lecture Notes in Computer Science, vol 6932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23577-1_6

  • DOI: https://doi.org/10.1007/978-3-642-23577-1_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23576-4

  • Online ISBN: 978-3-642-23577-1

  • eBook Packages: Computer Science; Computer Science (R0)
