Abstract
In this paper we explore the challenges of effectively using natural language processing (NLP) for information retrieval. First, we briefly cover current NLP uses and research areas at the intersection of the two fields, namely summarization, information extraction, and question answering. Second, we motivate other possible challenging uses of NLP for information retrieval, such as determining context, semantic search, and supporting the Semantic Web. We end with a particular use of NLP for a new problem, searching the future, which poses additional NLP challenges.
Partially funded by Fondecyt Grant 1020803 of CONICYT Chile. This paper was written while visiting the Dept. of Computer Science and Software Engineering of the University of Melbourne, Australia.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Baeza-Yates, R. (2004). Challenges in the Interaction of Information Retrieval and Natural Language Processing. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2004. Lecture Notes in Computer Science, vol 2945. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24630-5_55
DOI: https://doi.org/10.1007/978-3-540-24630-5_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21006-1
Online ISBN: 978-3-540-24630-5
eBook Packages: Springer Book Archive