In this paper we examine the interplay between the requirements of information seekers to access information in large digital text collections and the techniques developed by natural language processing researchers to support this access. In particular, we examine how language processing technologies such as question answering, single- and multi-document summarisation, and ontology-guided similar event searching can assist journalists in gathering information from news archives for the purpose of writing background to a breaking news event: the Cub Reporter scenario. Our thesis is that investigating real-world tasks with complex information access requirements, such as the Cub Reporter scenario, stimulates researchers to look beyond existing search engine solutions and drives the development and evaluation of novel language processing techniques. At the same time, novel developments in language processing capabilities allow both conceptual insights into how to characterise information seeking behaviour and empirical insights based on observation of information seeking behaviour using new technologies.
Copyright information
© 2007 Springer
About this chapter
Cite this chapter
Gaizauskas, R., Saggion, H., Barker, E. (2007). Information Access and Natural Language Processing: A Stimulating Dialogue. In: Ahmad, K., Brewster, C., Stevenson, M. (eds) Words and Intelligence II. Text, Speech and Language Technology, vol 36. Springer, Dordrecht. https://doi.org/10.1007/1-4020-5833-0_4
DOI: https://doi.org/10.1007/1-4020-5833-0_4
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-5832-5
Online ISBN: 978-1-4020-5833-2
eBook Packages: Humanities, Social Sciences and Law; Social Sciences (R0)