Abstract
Information retrieval systems have been studied in computer science for decades. The traditional ad-hoc task is to find all documents relevant to a given ad-hoc query, but the accuracy of ad-hoc document retrieval systems has plateaued in recent years. At DFKI, we are working on so-called collaborative information retrieval (CIR) systems, which unobtrusively learn from their users' search processes. In this paper, we present a new approach called term-based concept learning (TCL), which learns conceptual description terms occurring in known queries. A new query is expanded term by term using the previously learned concepts. Experiments have shown that TCL, alone and in combination with pseudo relevance feedback, yields notable improvements in retrieval effectiveness, measured by recall/precision, over both the standard vector space model and pseudo relevance feedback alone. This approach can be used to improve document retrieval in digital libraries, document management systems, the WWW, etc.
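The abstract does not spell out how concepts are learned or applied, so the following is only a minimal sketch of term-by-term query expansion under stated assumptions. It assumes that each known query comes with documents already judged relevant to it, and that the "concept" of a query term is simply the bag of terms from those documents. The names `learn_concepts`, `expand_query`, and `top_k`, as well as the toy data, are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict


def learn_concepts(known_queries):
    """Build one term-count 'concept' per query term.

    known_queries: list of (query_terms, relevant_docs) pairs, where
    relevant_docs is a list of documents and each document is a list of terms.
    Assumption: a term's concept is the sum of term counts over all documents
    relevant to any known query that contains that term.
    """
    concepts = defaultdict(Counter)
    for query_terms, relevant_docs in known_queries:
        for term in set(query_terms):
            for doc in relevant_docs:
                concepts[term].update(doc)
    return concepts


def expand_query(query_terms, concepts, top_k=2):
    """Expand a new query term by term with up to top_k concept terms each."""
    expanded = list(query_terms)
    for term in query_terms:
        # Rank candidates by count (descending), then alphabetically so that
        # ties are broken deterministically.
        candidates = sorted(
            concepts.get(term, {}).items(), key=lambda kv: (-kv[1], kv[0])
        )
        added = 0
        for candidate, _count in candidates:
            if added == top_k:
                break
            if candidate not in expanded:
                expanded.append(candidate)
                added += 1
    return expanded


# Toy data (hypothetical): two earlier queries with their relevant documents.
known = [
    (["jaguar", "speed"], [["jaguar", "cat", "fast", "cat"], ["cat", "prey"]]),
    (["jaguar", "price"], [["jaguar", "car", "dealer"]]),
]
concepts = learn_concepts(known)
print(expand_query(["jaguar"], concepts))  # ['jaguar', 'cat', 'car']
```

A query term that never occurred in a known query has no learned concept and is left unexpanded. In a real system the expansion terms would presumably be weighted rather than appended as plain terms; this sketch leaves weighting out.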
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Klink, S., Hust, A., Junker, M., Dengel, A. (2002). Collaborative Learning of Term-Based Concepts for Automatic Query Expansion. In: Elomaa, T., Mannila, H., Toivonen, H. (eds) Machine Learning: ECML 2002. ECML 2002. Lecture Notes in Computer Science, vol 2430. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36755-1_17
DOI: https://doi.org/10.1007/3-540-36755-1_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44036-9
Online ISBN: 978-3-540-36755-0
eBook Packages: Springer Book Archive