Question Answering Supported By Multiple Levels Of Information Extraction

Advances in Open Domain Question Answering

Part of the book series: Text, Speech and Language Technology ((TLTB,volume 32))

This chapter discusses the importance of information extraction (IE) in question answering (QA) systems. Most QA systems focus on sentence-level answer generation. Such systems are based on information retrieval (IR) techniques such as passage retrieval, combined with shallow IE techniques such as named entity tagging. Sentence-level answers may be sufficient in many applications focused on reducing information overload: instead of a list of URLs from a search engine that the user must peruse further, sentences containing potential answers are extracted and presented to the user. However, if the goal is a precise answer consisting of a phrase, or even a single word or number, new techniques for QA must be developed. Specifically, more advanced IE and natural language processing (NLP) are needed in the answer generation process. This chapter presents a QA system that uses multiple levels of IE to generate the most precise answer possible, backing off to coarser levels where necessary. In particular, generic grammatical relationships are exploited in the absence of information about specific relationships between entities. An IE engine, InfoXtract, is first described to illustrate the types of IE and NLP output that are needed. Results are presented for QA based on named entity tagging, relationships between entities, and semantic parsing, including more recent work that handles caseless input. This work has implications for broader-coverage QA systems, since domain-independent IE results can be exploited and noisy input can be “normalized” by restoration techniques.
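The back-off idea in the abstract can be illustrated with a minimal Python sketch. It is not the InfoXtract implementation: the function names, the regular expressions, and the example sentences are all hypothetical stand-ins chosen for illustration, and the sketch is reduced to two toy IE levels plus a sentence-level fallback (the generic grammatical relationship level described in the chapter is omitted).

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical signature for one IE level: (question, sentence) -> answer or None.
Extractor = Callable[[str, str], Optional[str]]


def specific_relation(question: str, sentence: str) -> Optional[str]:
    """Toy stand-in for a predefined entity relationship (birth year only)."""
    if "born" in question.lower():
        match = re.search(r"\bborn in (\d{4})\b", sentence)
        return match.group(1) if match else None
    return None


def named_entity(question: str, sentence: str) -> Optional[str]:
    """Toy stand-in for named entity tagging: a 4-digit year for 'when' questions."""
    if question.lower().startswith("when"):
        match = re.search(r"\b\d{4}\b", sentence)
        return match.group(0) if match else None
    return None


# Ordered from most precise to coarsest; the level names are illustrative only.
LEVELS: List[Tuple[str, Extractor]] = [
    ("specific_relation", specific_relation),
    ("named_entity", named_entity),
]


def answer_with_backoff(question: str, sentence: str) -> Tuple[str, str]:
    """Return (answer, level), trying each level in order of precision."""
    for level_name, extract in LEVELS:
        answer = extract(question, sentence)
        if answer is not None:
            return answer, level_name
    # Coarsest level: the whole retrieved sentence serves as the answer.
    return sentence, "sentence"


if __name__ == "__main__":
    # Made-up sentences, used only to exercise each level.
    print(answer_with_backoff("When was the author born?",
                              "The author was born in 1904 in Vienna."))
    print(answer_with_backoff("When was the lighthouse built?",
                              "The lighthouse was built in 1887."))
    print(answer_with_backoff("Who built the lighthouse?",
                              "The lighthouse was built in 1887."))
```

Each level either commits to an answer or returns nothing, so the most precise level that succeeds determines the answer granularity, and a question no level can handle still receives the sentence-level answer.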




© 2008 Springer

About this chapter

Cite this chapter

Srihari, R.K., Li, W., Li, X. (2008). Question Answering Supported By Multiple Levels Of Information Extraction. In: Strzalkowski, T., Harabagiu, S.M. (eds) Advances in Open Domain Question Answering. Text, Speech and Language Technology, vol 32. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-4746-6_11
