Abstract
Query expansion methods have been studied for a long time, with debatable success in many instances. In this paper, a new approach is presented that is based on term concepts learned from other queries. Two important issues in query expansion are addressed: the selection and the weighting of additional search terms. In contrast to other methods, the query under consideration is expanded by adding those terms that are most similar to the concept of individual query terms, rather than terms that are similar to the complete query or directly similar to the query terms. Experiments have shown that this kind of query expansion yields notable improvements in retrieval effectiveness, measured by recall/precision, compared with the standard vector space model and with pseudo relevance feedback. This approach can be used to improve the retrieval of documents in digital libraries, in document management systems, on the WWW, etc.
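The abstract does not state how term concepts are learned or how expansion terms are weighted, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that the concept of a term is the normalized sum of term counts over documents judged relevant to earlier queries containing that term, and that each query term contributes its `k` highest-scoring concept terms, damped by a factor `alpha`. The function names, the data layout, `k`, and `alpha` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def learn_term_concepts(past_queries):
    """Build one concept vector per query term (assumed scheme, see lead-in).

    past_queries: list of (query_terms, relevant_docs) pairs, where each
    document is a list of terms. The concept of a term is the L2-normalized
    sum of term counts over the relevant documents of every past query
    that contained the term.
    """
    sums = defaultdict(Counter)
    for query_terms, relevant_docs in past_queries:
        centroid = Counter()
        for doc in relevant_docs:
            centroid.update(doc)
        for term in set(query_terms):
            sums[term].update(centroid)

    concepts = {}
    for term, vec in sums.items():
        norm = math.sqrt(sum(v * v for v in vec.values()))
        concepts[term] = {w: v / norm for w, v in vec.items()} if norm else {}
    return concepts


def expand_query(query_terms, concepts, k=2, alpha=0.5):
    """Expand a query term by term, using each term's learned concept.

    Original terms keep weight 1.0. For every query term, the k concept
    terms with the highest scores (ties broken alphabetically) are added
    with weight alpha * score. k and alpha are illustrative parameters.
    """
    original = set(query_terms)
    weights = {t: 1.0 for t in original}
    for term in query_terms:
        concept = concepts.get(term, {})
        candidates = sorted(
            ((w, s) for w, s in concept.items() if w not in original),
            key=lambda pair: (-pair[1], pair[0]),
        )[:k]
        for w, s in candidates:
            weights[w] = weights.get(w, 0.0) + alpha * s
    return weights


# Toy data: two earlier queries with their relevant documents.
past = [
    (["jaguar", "speed"], [["jaguar", "car", "engine"], ["car", "speed", "engine"]]),
    (["jaguar", "habitat"], [["jaguar", "cat", "jungle"]]),
]
concepts = learn_term_concepts(past)
expanded = expand_query(["jaguar"], concepts, k=2)
print(expanded)
```

The point of the sketch is the selection step: expansion terms are drawn from the concept of each individual query term, not from terms similar to the whole query or from direct term-to-term similarity.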
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Klink, S., Hust, A., Junker, M., Dengel, A. (2002). Improving Document Retrieval by Automatic Query Expansion Using Collaborative Learning of Term-Based Concepts. In: Lopresti, D., Hu, J., Kashi, R. (eds) Document Analysis Systems V. DAS 2002. Lecture Notes in Computer Science, vol 2423. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45869-7_42
DOI: https://doi.org/10.1007/3-540-45869-7_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44068-0
Online ISBN: 978-3-540-45869-2
eBook Packages: Springer Book Archive