Abstract
Retrieving topically relevant passages from a large document collection is central to many information retrieval tasks, particularly Question Answering (QA). Passage Retrieval (PR) is a longstanding problem in QA that has been widely studied over the last decades, yet it still requires further effort to give users a better chance of finding a relevant answer to their natural language questions. This paper describes a successful attempt to improve PR and ranking for open-domain QA by identifying the passage most relevant to a given question. The approach uses a support vector machine (SVM) model whose features are a set of text similarity measures. These include our newly proposed n-gram based metric, which relies on the dependency degree of the question's n-gram words in the passage, as well as other lexical and semantic features that have already proven successful in a recent Semantic Textual Similarity (STS) task. We implemented a system named PRSYS to validate our approach in different languages. Our experimental evaluations show performance comparable to that of other similar, strong-performing systems.
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Othman, N., Faiz, R. (2016). A Multi-lingual Approach to Improve Passage Retrieval for Automatic Question Answering. In: Métais, E., Meziane, F., Saraee, M., Sugumaran, V., Vadera, S. (eds) Natural Language Processing and Information Systems. NLDB 2016. Lecture Notes in Computer Science, vol 9612. Springer, Cham. https://doi.org/10.1007/978-3-319-41754-7_11
DOI: https://doi.org/10.1007/978-3-319-41754-7_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41753-0
Online ISBN: 978-3-319-41754-7
eBook Packages: Computer Science, Computer Science (R0)