Question Answering Supported By Multiple Levels Of Information Extraction

  • Rohini K. Srihari
  • Wei Li
  • Xiaoge Li
Part of the Text, Speech and Language Technology book series (TLTB, volume 32)

This chapter discusses the importance of information extraction (IE) in question answering (QA) systems. Most QA systems are focused on sentence-level answer generation. Such systems are based on information retrieval (IR) techniques such as passage retrieval in conjunction with shallow IE techniques such as named entity tagging. Sentence-level answers may be sufficient in many applications focused on reducing information overload: instead of a list of URLs provided by search engines which need further perusal by the user, sentences containing potential answers are extracted and presented to the user. However, if the goal is precise answers consisting of a phrase, or even a single word or number, new techniques for QA must be developed. Specifically, there is a need to use more advanced IE and natural language processing (NLP) in the answer generation process. This chapter presents a system whereby multiple levels of IE are utilized in a QA system that attempts to generate the most precise answer possible, backing off to coarser levels where necessary. In particular, generic grammatical relationships are exploited in the absence of information about specific relationships between entities. An IE engine, InfoXtract, is first described to illustrate the types of IE and NLP output that is needed. Results are presented for QA based on named entity tagging, relationships between entities, and semantic parsing, including more recent work that handles caseless input. This work has implications for broader coverage QA systems since domain-independent IE results can be exploited and noisy input can be “normalized” by restoration techniques.
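The back-off strategy described above can be sketched as a simple cascade: query each IE level in order of precision and return the first non-empty answer. This is an illustrative sketch only, not the authors' InfoXtract implementation; the extractor names and the stub answers are hypothetical.

```python
# Illustrative sketch of multi-level back-off QA: try the most precise
# IE level first (specific relationships between entities), falling back
# to generic grammatical relationships, then named entity tagging, and
# finally plain passage retrieval.

def answer(question, extractors):
    """Return (answer, level) from the first extractor that produces a
    result, scanning from most precise to coarsest; (None, 'no-answer')
    if every level comes up empty."""
    for level, extract in extractors:
        result = extract(question)
        if result is not None:
            return result, level
    return None, "no-answer"

# Hypothetical extractor stack: each entry maps a question to an answer
# string, or None when that IE level yields nothing for this question.
extractors = [
    ("relationship", lambda q: None),   # e.g. AFFILIATION(person, org)
    ("grammatical", lambda q: None),    # generic subject-verb-object links
    ("named-entity", lambda q: "Neil Armstrong"),  # NE matching question type
    ("passage", lambda q: "a retrieved sentence containing keywords"),
]

print(answer("Who was the first man on the moon?", extractors))
```

Here the two precise levels fail, so the cascade backs off to the named-entity level and returns a phrase-level answer rather than a whole sentence.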


Keywords: Noun Phrase · Information Extraction · Question Answering · Prepositional Phrase · Named Entity



10. References

  1. Abney, S., Collins, M. and Singhal, A. 2000. Answer Extraction. In Proceedings of ANLP-2000, pages 296-301, Seattle, Washington.
  2. Bikel, D.M., R. Schwartz, and R.M. Weischedel. 1999. An Algorithm that Learns What's in a Name. Machine Learning, Vol. 34, No. 1-3, pages 211-231.
  3. Chinchor, N. and Marsh, E. 1998a. MUC-7 Information Extraction Task Definition (version 5.1). In Proceedings of the Seventh Message Understanding Conference (MUC-7).
  4. Chinchor, N., P. Robinson and E. Brown. 1998b. HUB-4 Named Entity Task Definition Version 4.8.
  5. Chieu, H.L. and H.T. Ng. 2002. Teaching a Weaker Classifier: Named Entity Recognition on Upper Case Text. In Proceedings of ACL-2002, pages 481-488, Philadelphia, PA.
  6. Clarke, C.L.A., Cormack, G.V. and Lynam, T.R. 2001. Exploiting Redundancy in Question Answering. In Proceedings of SIGIR'01, pages 358-365, New Orleans, LA.
  7. Grunfeld, L. and K.L. Kwok. 2005. Sentence Ranking Using Keywords and Meta-keywords: Experience with TREC-9 and TREC-2001 Using the PIRCS System. In T. Strzalkowski and S. Harabagiu (eds.), Advances in Open-Domain Question Answering. Kluwer Academic Publishers (in this volume).
  8. Hobbs, J.R. 1993. FASTUS: A System for Extracting Information from Text. In Proceedings of the DARPA Workshop on Human Language Technology, pages 133-137, Princeton, NJ.
  9. Hovy, E.H., U. Hermjakob, and Chin-Yew Lin. 2001. The Use of External Knowledge in Factoid QA. In Proceedings of the 10th Text Retrieval Conference (TREC 2001), page 644, Gaithersburg, MD, November 13-16, 2001.
  10. Krupka, G.R. and Hausman, K. 1998. IsoQuest Inc.: Description of the NetOwl (TM) Extractor System as Used for MUC-7. In Proceedings of the Seventh Message Understanding Conference (MUC-7).
  11. Kubala, F., R. Schwartz, R. Stone and R. Weischedel. 1998. Named Entity Extraction from Speech. In Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop, Lansdowne, VA.
  12. Kupiec, J. 1993. MURAX: A Robust Linguistic Approach for Question Answering Using an On-Line Encyclopedia. In Proceedings of SIGIR-93, pages 181-190, Pittsburgh, PA.
  13. Kwok, K.L., Grunfeld, L., Dinstl, N. and Chan, M. 2001. TREC2001 Question-Answer, Web and Cross Language Experiments Using PIRCS. In Proceedings of the Tenth Text Retrieval Conference (TREC-2001), page 452, Gaithersburg, MD.
  14. Li, W., R. Srihari, C. Niu and X. Li. 2003a. Question Answering on a Case Insensitive Corpus. In Proceedings of Multilingual Summarization and Question Answering - Machine Learning and Beyond (ACL-2003 Workshop), Sapporo, Japan.
  15. Li, W., R. Srihari, C. Niu, and X. Li. 2003b. Entity Profile Extraction from Large Corpora. In Proceedings of the Pacific Association for Computational Linguistics 2003 (PACLING'03), Halifax, Nova Scotia, Canada.
  16. Li, W. and R. Srihari. 2003c. Flexible Information Extraction Learning Algorithm, Phase 2 Final Technical Report. Air Force Research Laboratory, Rome Research Site, NY.
  17. Li, W., R. Srihari, X. Li, M. Srikanth, X. Zhang and C. Niu. 2002. Extracting Exact Answers to Questions Based on Structural Links. In Proceedings of Multilingual Summarization and Question Answering (COLING-2002 Workshop), Taipei, Taiwan.
  18. Lita, L.V., A. Ittycheriah, S. Roukos and N. Kambhatla. 2003. tRuEcaSing. In Proceedings of ACL-2003, Sapporo, Japan.
  19. Litkowski, K.C. 1999. Question-Answering Using Semantic Relation Triples. In Proceedings of the Eighth Text Retrieval Conference (TREC-8), page 349, Gaithersburg, MD.
  20. Miller, S., M. Crystal, H. Fox, L. Ramshaw, R. Schwartz, R. Stone, R. Weischedel, and the Annotation Group (BBN Technologies). 1998. BBN: Description of the SIFT System as Used for MUC-7. In Proceedings of the Seventh Message Understanding Conference (MUC-7).
  21. Miller, D., S. Boisen, R. Schwartz, R. Stone, and R. Weischedel. 2000. Named Entity Extraction from Noisy Input: Speech and OCR. In Proceedings of ANLP 2000, Seattle, Washington.
  22. Niu, C., W. Li, J. Ding, and R. Srihari. 2004a. Orthographic Case Restoration Using Supervised Learning Without Manual Annotation. International Journal of Artificial Intelligence Tools, Vol. 13, No. 1.
  23. Niu, C., W. Li and R. Srihari. 2004b. A Bootstrapping Approach to Information Extraction Domain Porting. In AAAI-2004 Workshop on Adaptive Text Extraction and Mining (ATEM), California.
  24. Niu, C., W. Li, J. Ding, and R. Srihari. 2003a. A Bootstrapping Approach to Named Entity Classification Using Successive Learners. In Proceedings of the 41st Annual Meeting of the ACL, pages 335-342, Sapporo, Japan.
  25. Niu, C., W. Li, R. Srihari, and L. Crist. 2003b. Bootstrapping a Hidden Markov Model for Relationship Extraction Using Multi-level Contexts. In Proceedings of the Pacific Association for Computational Linguistics 2003 (PACLING'03), Halifax, Nova Scotia, Canada.
  26. Niu, C., W. Li, J. Ding and R. Srihari. 2003c. Orthographic Case Restoration Using Supervised Learning Without Manual Annotation. In Proceedings of the 16th International FLAIRS Conference, Florida.
  27. Palmer, D., M. Ostendorf and J.D. Burger. 2000. Robust Information Extraction from Automatically Generated Speech Transcriptions. Speech Communication, Vol. 32, pages 95-109.
  28. Pasca, M. and Harabagiu, S.M. 2000. High Performance Question/Answering. In Proceedings of SIGIR 2000, pages 366-374.
  29. Prager, J., Radev, D., Brown, E., Coden, A. and Samn, V. 1999. The Use of Predictive Annotation for Question Answering in TREC8. In Proceedings of the Eighth Text Retrieval Conference (TREC-8), Gaithersburg, MD.
  30. Ravichandran, D., and Hovy, E. 2002. Learning Surface Text Patterns for a Question Answering System. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pages 41-47, Philadelphia, PA.
  31. Robinson, P., E. Brown, J. Burger, N. Chinchor, A. Douthat, L. Ferro, and L. Hirschman. 1999. Overview: Information Extraction from Broadcast News. In Proceedings of the DARPA Broadcast News Workshop, pages 27-30, Herndon, Virginia.
  32. Srihari, R., W. Li, C. Niu and T. Cornell. 2005. InfoXtract: A Customizable Intermediate Level Information Extraction Engine. Journal of Natural Language Engineering (forthcoming).
  33. Srihari, R., C. Niu and W. Li. 2000a. A Hybrid Approach for Named Entity and Sub-Type Tagging. In Proceedings of ANLP 2000, pages 247-254, Seattle, Washington.
  34. Srihari, R. and W. Li. 2000b. A Question Answering System Supported by Information Extraction. In Proceedings of ANLP 2000, pages 166-172, Seattle, Washington.
  35. Srihari, R. and W. Li. 1999. Information Extraction Supported Question Answering. In Proceedings of the Eighth Text Retrieval Conference (TREC-8), pages 185-196, Gaithersburg, MD.
  36. Vicedo, J.L. and A. Ferrández. 2006. Co-reference in Q&A. In T. Strzalkowski and S. Harabagiu (eds.), Advances in Open-Domain Question Answering. Kluwer Academic Publishers (in this volume).
  37. Voorhees, E. 1999. The TREC-8 Question Answering Track Report. In Proceedings of the Eighth Text Retrieval Conference (TREC-8), page 77, Gaithersburg, MD.
  38. Voorhees, E. 2000. Overview of the TREC-9 Question Answering Track. In Proceedings of the Ninth Text Retrieval Conference (TREC-9), pages 77-82, Gaithersburg, MD.

Copyright information

© Springer 2008

Authors and Affiliations

  • Rohini K. Srihari (1)
  • Wei Li (1)
  • Xiaoge Li (1)
  1. Janya Inc., Amherst, USA
