Abstract
A crucial step in answering definition questions, such as “Who is Gordon Brown?”, is the ranking of answer candidates. In definition Question Answering (QA), sentences are normally treated as potential answers, and one of the most promising ranking strategies is based on Language Models (LMs).
However, LMs become less attractive when they suffer from data sparseness, which arises when the training material is insufficient or candidate sentences are too long. This paper analyses two methods, different in nature, for tackling data sparseness head-on: (1) combining LMs learnt from different, but overlapping, training corpora, and (2) selective substitutions based on part-of-speech (POS) tagging.
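As a rough illustration of the two ideas, the sketch below interpolates two LMs trained on different corpora and replaces selected tokens with their POS tags. It is not the method of the paper, whose models are built over lexicalised dependency trees: the unigram models, the add-one smoothing, the interpolation weight `lam`, the toy corpora, and the choice of which tags to replace are all assumptions made for the example.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Count token frequencies over a list of tokenised sentences."""
    counts = Counter(tok for sent in sentences for tok in sent)
    return counts, sum(counts.values())


def prob(model, token, vocab_size):
    """Add-one smoothed unigram probability (an assumed smoothing choice)."""
    counts, total = model
    return (counts[token] + 1) / (total + vocab_size)


def interpolated_log_score(sentence, model_a, model_b, vocab_size, lam=0.5):
    """Log-probability of a sentence under lam * P_a + (1 - lam) * P_b."""
    return sum(
        math.log(
            lam * prob(model_a, tok, vocab_size)
            + (1 - lam) * prob(model_b, tok, vocab_size)
        )
        for tok in sentence
    )


def substitute_by_pos(tagged_sentence, replace_tags):
    """Replace each token whose POS tag is in replace_tags by the tag itself."""
    return [tag if tag in replace_tags else tok for tok, tag in tagged_sentence]


# Hypothetical toy corpora; they overlap in part of their vocabulary.
corpus_a = [["gordon", "brown", "is", "a", "politician"]]
corpus_b = [
    ["brown", "is", "a", "british", "politician"],
    ["he", "was", "prime", "minister"],
]
model_a = train_unigram(corpus_a)
model_b = train_unigram(corpus_b)

# Reserve one extra vocabulary slot for unseen tokens.
vocab_size = len({tok for sent in corpus_a + corpus_b for tok in sent}) + 1

seen = ["brown", "is", "a", "politician"]
unseen = ["brown", "likes", "green", "tea"]
score_seen = interpolated_log_score(seen, model_a, model_b, vocab_size)
score_unseen = interpolated_log_score(unseen, model_a, model_b, vocab_size)

# Collapsing numbers into their tag (here the Penn Treebank tag CD) reduces
# the number of distinct tokens the model has to estimate.
generalised = substitute_by_pos(
    [("born", "VBN"), ("in", "IN"), ("1951", "CD")], {"CD"}
)
```

In this toy setting the candidate made of tokens seen in training scores higher than the one containing unseen tokens, and the substitution maps every number to the same symbol, so that rare tokens share statistics.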
Results show that the first method improves the Mean Average Precision (MAP) of the top-ranked answers while diminishing the average F-score of the final output. By contrast, the impact of the second approach depends on the test corpus.
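For readers unfamiliar with the two measures, a minimal sketch of their textbook definitions follows. It assumes binary relevance judgements per ranked answer and normalises average precision by the number of relevant answers retrieved. The exact evaluation settings used in the paper (cut-off rank, value of beta, how precision and recall are derived for definition answers) are not given in the abstract, so the values here are illustrative only.

```python
def average_precision(ranked_relevance):
    """Average of precision@k over the ranks k that hold a relevant answer."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of the average precision over all questions."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Two hypothetical questions; True marks a relevant answer at that rank.
rankings = [[True, False, True], [False, True]]
map_score = mean_average_precision(rankings)
```

Because MAP rewards relevant answers near the top of the ranking while the F-score is computed over the whole output, a method can raise one and lower the other, as reported above.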
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Figueroa, A., Atkinson, J. (2010). Answering Definition Questions: Dealing with Data Sparseness in Lexicalised Dependency Trees-Based Language Models. In: Cordeiro, J., Filipe, J. (eds) Web Information Systems and Technologies. WEBIST 2009. Lecture Notes in Business Information Processing, vol 45. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12436-5_22
DOI: https://doi.org/10.1007/978-3-642-12436-5_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12435-8
Online ISBN: 978-3-642-12436-5
eBook Packages: Computer Science, Computer Science (R0)