Information Access and Natural Language Processing: A Stimulating Dialogue

Gaizauskas, Robert; Saggion, Horacio; Barker, Emma

doi:10.1007/1-4020-5833-0_4

Part of the book series: Text, Speech and Language Technology (TLTB, volume 36)

In this paper we examine the interplay between the requirements of information seekers to access information in large digital text collections and the techniques developed by natural language processing researchers to support this access. In particular, we examine how language processing technologies such as question answering, single- and multi-document summarisation, and ontology-guided similar event searching can assist journalists in gathering information from news archives for the purpose of writing background to a breaking news event (the Cub Reporter scenario). Our thesis is twofold. First, investigating real-world tasks with complex information access requirements, such as the Cub Reporter scenario, stimulates researchers to look beyond existing search engine solutions and drives the development and evaluation of novel language processing techniques. Second, novel developments in language processing capabilities yield both conceptual insights into how to characterise information seeking behaviour and empirical insights based on observing that behaviour when new technologies are in use.

Author information

Authors and Affiliations

Department of Computer Science, University of Sheffield, Sheffield, UK
Robert Gaizauskas, Horacio Saggion & Emma Barker

Editor information

Editors and Affiliations

Trinity College, Dublin, Ireland
Khurshid Ahmad
University of Sheffield, Sheffield, UK
Christopher Brewster
University of Sheffield, Sheffield, UK
Mark Stevenson

About this chapter

Cite this chapter

Gaizauskas, R., Saggion, H., Barker, E. (2007). Information Access and Natural Language Processing: A Stimulating Dialogue. In: Ahmad, K., Brewster, C., Stevenson, M. (eds) Words and Intelligence II. Text, Speech and Language Technology, vol 36. Springer, Dordrecht. https://doi.org/10.1007/1-4020-5833-0_4

DOI: https://doi.org/10.1007/1-4020-5833-0_4
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-5832-5
Online ISBN: 978-1-4020-5833-2
eBook Packages: Humanities, Social Sciences and Law; Social Sciences (R0)
