Abstract
This paper presents two novel clustering approaches and their application to open-domain question answering. First, we present the One-Sentence-Multi-Topic clustering approach, which clusters sentences to improve the language model used for sentence retrieval. Second, treating each cluster produced by One-Sentence-Multi-Topic clustering as a set of aligned sentences, we present a pattern-similarity-based clustering approach that automatically learns syntactic answer patterns for answer selection through vertical and horizontal clustering. Our experiments on Chinese question answering demonstrate that One-Sentence-Multi-Topic clustering is much better than K-Means and is comparable to PLSI when used for sentence clustering in question answering. The pattern-similarity-based clustering also proved effective in learning syntactic answer patterns: the absolute improvement of syntactic pattern-based answer extraction over retrieval-based answer extraction is about 9%.
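The abstract does not give the retrieval formula, so the following is only a minimal sketch of how sentence clusters might feed a language model for sentence retrieval. It assumes a generic query-likelihood model that interpolates sentence, cluster, and collection unigram estimates, and it lets one sentence belong to several clusters. The function names, the weights `lam` and `mu`, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def unigram(tokens):
    """Maximum-likelihood unigram distribution over a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def rank(query, sentences, memberships, lam=0.5, mu=0.3):
    """Rank sentence indices by log P(query | sentence), best first.

    memberships[i] lists the cluster ids of sentence i; a sentence may
    appear in more than one cluster (the multi-topic assumption).
    lam and mu are illustrative interpolation weights (assumed values).
    """
    # Collection model: every token of every sentence.
    collection = unigram([w for s in sentences for w in s])

    # Gather the tokens of each cluster from its member sentences.
    cluster_tokens = {}
    for i, cluster_ids in memberships.items():
        for c in cluster_ids:
            cluster_tokens.setdefault(c, []).extend(sentences[i])

    scores = []
    for i, sent in enumerate(sentences):
        p_sent = unigram(sent)
        # Pool the tokens of all clusters this sentence belongs to.
        pooled = [w for c in memberships.get(i, []) for w in cluster_tokens[c]]
        p_clu = unigram(pooled)
        logp = 0.0
        for w in query:
            p = (lam * p_sent.get(w, 0.0)
                 + mu * p_clu.get(w, 0.0)
                 + (1.0 - lam - mu) * collection.get(w, 0.0))
            logp += math.log(p) if p > 0 else float("-inf")
        scores.append(logp)
    return sorted(range(len(sentences)), key=lambda i: -scores[i])


# Toy data (hypothetical): sentences 0 and 1 share a cluster.
sentences = [
    ["edison", "invented", "the", "light", "bulb"],
    ["the", "light", "bulb", "was", "patented", "in", "1879"],
    ["paris", "is", "the", "capital", "of", "france"],
]
memberships = {0: [0], 1: [0], 2: [1]}
ranking = rank(["invented", "light", "bulb"], sentences, memberships)
```

In this toy run, sentence 1 never mentions "invented" but still receives probability mass for it through the cluster it shares with sentence 0, which is the kind of smoothing effect that sentence clustering is meant to provide.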
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wu, Y., Kashioka, H., Zhao, J. (2007). Using Clustering Approaches to Open-Domain Question Answering. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2007. Lecture Notes in Computer Science, vol 4394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70939-8_45
DOI: https://doi.org/10.1007/978-3-540-70939-8_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70938-1
Online ISBN: 978-3-540-70939-8
eBook Packages: Computer Science, Computer Science (R0)