This chapter discusses the importance of information extraction (IE) in question answering (QA) systems. Most QA systems focus on sentence-level answer generation and rely on information retrieval (IR) techniques such as passage retrieval, combined with shallow IE techniques such as named entity tagging. Sentence-level answers may suffice in many applications aimed at reducing information overload: instead of a list of URLs returned by a search engine, which the user must still peruse, sentences containing potential answers are extracted and presented to the user. However, if the goal is a precise answer consisting of a phrase, or even a single word or number, new QA techniques must be developed. Specifically, the answer generation process needs more advanced IE and natural language processing (NLP). This chapter presents a QA system that uses multiple levels of IE and attempts to generate the most precise answer possible, backing off to coarser levels where necessary. In particular, generic grammatical relationships are exploited when information about specific relationships between entities is unavailable. An IE engine, InfoXtract, is described first to illustrate the types of IE and NLP output that are needed. Results are presented for QA based on named entity tagging, relationships between entities, and semantic parsing, including more recent work that handles caseless input. This work has implications for broader-coverage QA systems, since domain-independent IE results can be exploited and noisy input can be “normalized” by restoration techniques.
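The back-off idea in the abstract can be illustrated with a minimal sketch. It is not the InfoXtract implementation: the function names, the toy extractors, and the ordering of levels (specific relationship, then generic grammatical relationship, then named entity, then the whole sentence) are assumptions made only for illustration.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical extractor signature: (question, sentence) -> answer or None.
Extractor = Callable[[str, str], Optional[str]]


def relation_level(question: str, sentence: str) -> Optional[str]:
    # Toy stand-in for a specific entity relationship; none is known here.
    return None


def grammar_level(question: str, sentence: str) -> Optional[str]:
    # Toy stand-in for a generic subject-verb link: for "Who <verb> ...?",
    # return the text that precedes the same verb in the sentence.
    m = re.match(r"Who (\w+)", question)
    if not m:
        return None
    before, sep, _ = sentence.partition(" " + m.group(1) + " ")
    return before if sep else None


def entity_level(question: str, sentence: str) -> Optional[str]:
    # Toy stand-in for named entity tagging: for "When" questions,
    # return the first four-digit year found in the sentence.
    if not question.startswith("When"):
        return None
    m = re.search(r"\b(1[0-9]{3}|20[0-9]{2})\b", sentence)
    return m.group(0) if m else None


# Assumed ordering, from most precise to coarsest.
LEVELS: List[Tuple[str, Extractor]] = [
    ("relationship", relation_level),
    ("grammatical", grammar_level),
    ("named_entity", entity_level),
]


def answer_with_backoff(question: str, sentence: str,
                        levels: List[Tuple[str, Extractor]] = LEVELS) -> Tuple[str, str]:
    """Return (level_name, answer) from the first level that yields an answer.

    If no level produces an answer, fall back to the whole sentence.
    """
    for name, extract in levels:
        answer = extract(question, sentence)
        if answer:
            return name, answer
    return "sentence", sentence


if __name__ == "__main__":
    s = "Bell invented the telephone in 1876."
    print(answer_with_backoff("Who invented the telephone?", s))
    print(answer_with_backoff("When was the telephone invented?", s))
    print(answer_with_backoff("Why was it invented?", s))
```

In this sketch each level either returns a short answer or declines, so the coarsest fallback (the sentence itself) is reached only when every finer-grained level has nothing to offer.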
© 2008 Springer
About this chapter
Cite this chapter
Srihari, R.K., Li, W., Li, X. (2008). Question Answering Supported By Multiple Levels Of Information Extraction. In: Strzalkowski, T., Harabagiu, S.M. (eds) Advances in Open Domain Question Answering. Text, Speech and Language Technology, vol 32. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-4746-6_11
DOI: https://doi.org/10.1007/978-1-4020-4746-6_11
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-4744-2
Online ISBN: 978-1-4020-4746-6