Abstract
This paper shows how the s-term ranking model [1] can be extended and combined with index structures and algorithms for structured document retrieval to improve both the effectiveness of the model and the efficiency of retrieval. We explain in detail how previous work on ranked and exact retrieval can be integrated and optimized, and which adaptations are necessary. Our approach was evaluated experimentally at the INEX 2004 workshop [2]. The results are encouraging and suggest a number of future enhancements.
References
[1] Schlieder, T., Meuss, H.: Querying and Ranking XML Documents. Journal of the American Society for Information Science and Technology 53 (2002)
[2] INEX: Initiative for the Evaluation of XML Retrieval (2004). Available at http://inex.is.informatik.uni-duisburg.de:2004
[3] Fuhr, N., Großjohann, K.: XIRQL: A Query Language for Information Retrieval in XML Documents. In: Research and Development in Information Retrieval (2001)
[4] Wolff, J.E., Flörke, H., Cremers, A.B.: Searching and Browsing Collections of Structural Information. In: Proc. IEEE Forum on Research and Technology Advances in Digital Libraries (2000)
[5] Schlieder, T.: Similarity Search in XML Data using Cost-Based Query Transformations. In: Proc. 4th Int. Workshop on the Web and Databases (2001)
[6] Theobald, A., Weikum, G.: The Index-Based XXL Search Engine for Querying XML Data with Relevance Ranking. In: Proc. 8th Int. Conf. on Extending Database Technology (2002)
[7] Shin, D., Jang, H., Jin, H.: BUS: An Effective Indexing and Retrieval Scheme in Structured Documents. In: Proc. 3rd ACM Int. Conf. on Digital Libraries (1998)
[8] Salton, G.: The SMART Retrieval System – Experiments in Automatic Document Processing. Prentice Hall Inc., Englewood Cliffs (1971)
[9] Weigel, F., Meuss, H., Schulz, K.U., Bry, F.: Content and Structure in Indexing and Ranking XML. In: Proc. 7th Int. Workshop on the Web and Databases (2004)
[10] Weigel, F., Meuss, H., Bry, F., Schulz, K.U.: Content-Aware DataGuides: Interleaving IR and DB Indexing Techniques for Efficient Retrieval of Textual XML Data. In: McDonald, S., Tait, J.I. (eds.) ECIR 2004. LNCS, vol. 2997, pp. 378–393. Springer, Heidelberg (2004)
[11] Sacks-Davis, R., Arnold-Moore, T., Zobel, J.: Database Systems for Structured Documents. In: Proc. Int. Symposium on Advanced Database Technologies and Their Integration (1994)
[12] Al-Khalifa, S., Jagadish, H.V., Koudas, N., Patel, J.M., Srivastava, D., Wu, Y.: Structural Joins: A Primitive for Efficient XML Query Pattern Matching. In: Proc. 18th IEEE Int. Conf. on Data Engineering (2002)
[13] Kilpeläinen, P.: Tree Matching Problems with Applications to Structured Text Databases. PhD thesis, University of Helsinki (1992)
[14] Meuss, H., Schulz, K.U., Weigel, F., Leonardi, S., Bry, F.: Visual Exploration and Retrieval of XML Document Collections with the Generic System X². Journal of Digital Libraries, Special Issue on Information Visualization Interfaces (2004)
[15] Meuss, H.: Logical Tree Matching with Complete Answer Aggregates for Retrieving Structured Documents. PhD thesis, University of Munich (2000)
[16] Meuss, H., Schulz, K.U.: Complete Answer Aggregates for Tree-like Databases: A Novel Approach to Combine Querying and Navigation. ACM Transactions on Information Systems 19 (2001)
[17] Meuss, H., Schulz, K., Bry, F.: Towards Aggregated Answers for Semistructured Data. In: Van den Bussche, J., Vianu, V. (eds.) ICDT 2001. LNCS, vol. 1973, p. 346. Springer, Heidelberg (2001)
[18] Trotman, A., Sigurbjörnsson, B.: Narrowed Extended XPath I (2004)
[19] Goldman, R., Widom, J.: DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases. In: Proc. 23rd Int. Conf. on Very Large Data Bases (1997)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Weigel, F., Schulz, K.U., Meuss, H. (2005). Ranked Retrieval of Structured Documents with the S-Term Vector Space Model. In: Fuhr, N., Lalmas, M., Malik, S., Szlávik, Z. (eds) Advances in XML Information Retrieval. INEX 2004. Lecture Notes in Computer Science, vol 3493. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11424550_19
DOI: https://doi.org/10.1007/11424550_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26166-7
Online ISBN: 978-3-540-32053-1
eBook Packages: Computer Science, Computer Science (R0)