Question Answering Supported By Multiple Levels Of Information Extraction

Srihari, Rohini K.; Li, Wei; Li, Xiaoge

doi:10.1007/978-1-4020-4746-6_11

Part of the book series: Text, Speech and Language Technology (TLTB, volume 32)

This chapter discusses the importance of information extraction (IE) in question answering (QA) systems. Most QA systems focus on sentence-level answer generation. Such systems are based on information retrieval (IR) techniques, such as passage retrieval, combined with shallow IE techniques, such as named entity tagging. Sentence-level answers may be sufficient in many applications focused on reducing information overload: instead of a list of URLs returned by a search engine, which the user must peruse further, sentences containing potential answers are extracted and presented to the user. However, if the goal is a precise answer consisting of a phrase, or even a single word or number, new techniques for QA must be developed. Specifically, more advanced IE and natural language processing (NLP) are needed in the answer generation process. This chapter presents a QA system that uses multiple levels of IE to generate the most precise answer possible, backing off to coarser levels where necessary. In particular, generic grammatical relationships are exploited in the absence of information about specific relationships between entities. An IE engine, InfoXtract, is first described to illustrate the types of IE and NLP output that are needed. Results are presented for QA based on named entity tagging, relationships between entities, and semantic parsing, including more recent work that handles caseless input. This work has implications for broader-coverage QA systems, since domain-independent IE results can be exploited and noisy input can be “normalized” by restoration techniques.
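
The back-off idea in the abstract can be illustrated with a minimal Python sketch. It is not InfoXtract and does not reflect the chapter's implementation: the three extractors are hypothetical regex stand-ins, and the ordering (specific entity relationship, then generic grammatical relation, then named entity, then the whole sentence) is an assumption read off the abstract's description of backing off to coarser levels.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Answer:
    text: str
    level: str  # name of the extraction level that produced the answer


# Hypothetical extractor signature: (question, sentence) -> answer text or None.
Extractor = Callable[[str, str], Optional[str]]


def entity_relationship(question: str, sentence: str) -> Optional[str]:
    """Toy stand-in for a specific, predefined relationship (birth year only)."""
    if "born" not in question.lower():
        return None
    match = re.search(r"was born in (\d{4})", sentence)
    return match.group(1) if match else None


def grammatical_relation(question: str, sentence: str) -> Optional[str]:
    """Toy stand-in for a generic subject-verb link: 'Who <verb> ...?'."""
    q = re.match(r"Who (\w+) ", question)
    if not q:
        return None
    verb = q.group(1)
    s = re.match(rf"(.+?) {re.escape(verb)} ", sentence)
    return s.group(1) if s else None


def named_entity(question: str, sentence: str) -> Optional[str]:
    """Toy stand-in for named entity tagging: a year for 'When' questions."""
    if not question.lower().startswith("when"):
        return None
    match = re.search(r"\b\d{4}\b", sentence)
    return match.group(0) if match else None


# Assumed ordering: most precise level first, coarser levels after it.
LEVELS: List[Tuple[str, Extractor]] = [
    ("entity relationship", entity_relationship),
    ("grammatical relation", grammatical_relation),
    ("named entity", named_entity),
]


def answer_with_backoff(question: str, sentence: str) -> Answer:
    """Return the first answer any level yields, else the sentence itself."""
    for name, extract in LEVELS:
        result = extract(question, sentence)
        if result is not None:
            return Answer(result, name)
    # Coarsest fallback: a sentence-level answer.
    return Answer(sentence, "sentence")


sentence = "Alice Smith founded Acme Corp in 1999."
print(answer_with_backoff("Who founded Acme Corp?", sentence))
print(answer_with_backoff("When was Acme Corp started?", sentence))
print(answer_with_backoff("Why does Acme Corp exist?", sentence))
```

With the invented example sentence, the "Who" question is answered at the grammatical-relation level, the "When" question falls back to the named-entity level, and the "Why" question falls all the way back to the sentence itself.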

Author information

Authors and Affiliations

Janya Inc., 1408 Sweet Home Road, 14228, Amherst, NY, USA
Rohini K. Srihari, Wei Li & Xiaoge Li

Editor information

Editors and Affiliations

State University of New York at Albany, 1400 Washington Avenue, 12222, Albany, NY, USA
Tomek Strzalkowski
University of Texas at Dallas, 75083, Richardson, TX, USA
Sanda M. Harabagiu

About this chapter

Cite this chapter

Srihari, R.K., Li, W., Li, X. (2008). Question Answering Supported By Multiple Levels Of Information Extraction. In: Strzalkowski, T., Harabagiu, S.M. (eds) Advances in Open Domain Question Answering. Text, Speech and Language Technology, vol 32. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-4746-6_11

DOI: https://doi.org/10.1007/978-1-4020-4746-6_11
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-4744-2
Online ISBN: 978-1-4020-4746-6
eBook Packages: Humanities, Social Sciences and Law; Social Sciences (R0)
