Abstract
This paper proposes a novel XML information retrieval system. It supports our extended IR language, which integrates our notion of relevance scoring. We develop an inverted index structure that efficiently supports term and phrase search. A novel scoring method is also presented to filter out irrelevant nodes. Because the system supports two formats for returned fragments, we use a different method to produce each. Experiments show that the system handles XML IR style queries efficiently and effectively.
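As a rough illustration of how an inverted index can serve both term and phrase search over XML nodes, the following minimal sketch keeps word positions per element. It is not the index structure developed in the paper: the preorder node numbering, the function names (`build_index`, `term_search`, `phrase_search`), and the choice to index only each element's leading text are all assumptions made for this example.

```python
from collections import defaultdict
import re
import xml.etree.ElementTree as ET


def build_index(xml_text):
    """Build a positional inverted index: term -> node id -> word positions.

    Node ids are hypothetical preorder numbers assigned for this sketch.
    Only the text directly before an element's first child is indexed;
    tail text is ignored to keep the example short.
    """
    index = defaultdict(lambda: defaultdict(list))
    tags = {}
    root = ET.fromstring(xml_text)
    for node_id, elem in enumerate(root.iter()):
        tags[node_id] = elem.tag
        tokens = re.findall(r"\w+", (elem.text or "").lower())
        for pos, tok in enumerate(tokens):
            index[tok][node_id].append(pos)
    return index, tags


def term_search(index, term):
    """Return the ids of nodes whose indexed text contains the term."""
    return sorted(index.get(term.lower(), {}))


def phrase_search(index, phrase):
    """Return the ids of nodes where the words of the phrase occur consecutively."""
    words = re.findall(r"\w+", phrase.lower())
    if not words or any(w not in index for w in words):
        return []
    # Candidate nodes must contain every word of the phrase.
    candidates = set(index[words[0]])
    for w in words[1:]:
        candidates &= set(index[w])
    hits = []
    for node in sorted(candidates):
        starts = index[words[0]][node]
        # A hit needs word i of the phrase at position start + i.
        if any(all(s + i in index[w][node] for i, w in enumerate(words))
               for s in starts):
            hits.append(node)
    return hits
```

With a document such as `<article><title>XML information retrieval</title><sec><p>retrieval of XML information</p></sec></article>`, a term search for `xml` matches both the `title` and `p` nodes, while the phrase `information retrieval` matches only the `title` node because the positions must be adjacent.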
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Han, Z., Li, W., Mao, M. (2007). SSRS—An XML Information Retrieval System. In: Bhalla, S. (eds) Databases in Networked Information Systems. DNIS 2007. Lecture Notes in Computer Science, vol 4777. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75512-8_9
DOI: https://doi.org/10.1007/978-3-540-75512-8_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75511-1
Online ISBN: 978-3-540-75512-8
eBook Packages: Computer Science (R0)