The Effect of Structured Queries and Selective Indexing on XML Retrieval

  • Conference paper
In: Advances in XML Information Retrieval and Evaluation (INEX 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3977)

Abstract

We describe the University of Amsterdam’s participation in the INEX 2005 ad hoc track, covering the Thorough, Focused, and FetchBrowse tasks and their structured (+S) counterparts. Our research questions for this round of INEX were threefold. Our first and main research question was to investigate the contribution of structural constraints to improved retrieval performance. Our main result is that the two types of structural constraints have different effects: constraining the target of result elements improves early precision, whereas constraining the context of result elements improves mean average precision. Our second research question was to experiment with selective indexing strategies that select elements by length, by the tag names of elements considered relevant in earlier INEX years, or simply by indexing all sections or articles. Our experiments show that disregarding 80–90% of the total number of elements does not decrease retrieval performance. Third, we considered the automatic creation of structured queries using blind feedback. Here, our results are inconclusive, mainly because few queries were used and a comparison with traditional blind feedback is lacking.
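To make the idea of selective indexing concrete, the following is a minimal sketch of how elements might be filtered by length or by tag name before indexing. It is not the authors' implementation: the function names, the word-count threshold, the tag set, and the sample document (with `article`, `sec`, `st`, and `p` tags) are all assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET


def element_text(elem):
    """Return the whitespace-normalized text of an element and its descendants."""
    return " ".join(" ".join(elem.itertext()).split())


def select_elements(root, min_words=0, allowed_tags=None):
    """Yield (tag, text) for elements that pass the length and tag-name filters.

    min_words and allowed_tags are hypothetical parameters for this sketch.
    """
    for elem in root.iter():  # document order, including the root itself
        text = element_text(elem)
        if len(text.split()) < min_words:
            continue  # length-based selection: skip very short elements
        if allowed_tags is not None and elem.tag not in allowed_tags:
            continue  # tag-based selection: keep only the listed tag names
        yield elem.tag, text


# Illustrative document; the tag names are assumptions, not taken from the paper.
SAMPLE = """<article>
<sec><st>Intro</st><p>XML retrieval returns elements instead of whole documents.</p></sec>
<sec><st>Method</st><p>Short.</p></sec>
</article>"""

root = ET.fromstring(SAMPLE)
all_elems = list(select_elements(root))                  # index every element
by_length = list(select_elements(root, min_words=5))     # length-based index
by_tag = list(select_elements(root, allowed_tags={"article", "sec"}))  # tag-based index

print(len(all_elems), len(by_length), len(by_tag))
```

In this toy document, both filters discard more than half of the elements while keeping the larger units, which is the kind of index reduction the abstract refers to. Whether retrieval performance is preserved is an empirical question that the sketch does not address.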


Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sigurbjörnsson, B., Kamps, J. (2006). The Effect of Structured Queries and Selective Indexing on XML Retrieval. In: Fuhr, N., Lalmas, M., Malik, S., Kazai, G. (eds) Advances in XML Information Retrieval and Evaluation. INEX 2005. Lecture Notes in Computer Science, vol 3977. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-34963-1_8

  • DOI: https://doi.org/10.1007/978-3-540-34963-1_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34962-4

  • Online ISBN: 978-3-540-34963-1

  • eBook Packages: Computer Science, Computer Science (R0)
