A Multi-lingual Approach to Improve Passage Retrieval for Automatic Question Answering

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9612)

Abstract

Retrieving topically relevant passages from a huge document collection is central to many information retrieval tasks, particularly Question Answering (QA). Passage Retrieval (PR) is a longstanding problem in QA that has been widely studied over the last decades, yet it still requires further effort to give users a better chance of finding a relevant answer to their natural language questions. This paper describes a successful attempt to improve PR and ranking for open-domain QA by identifying the passage most relevant to a given question. It uses a support vector machine (SVM) model whose features are a set of text similarity measures. These include our newly proposed n-gram based metric, which relies on the dependency degree of the question's n-gram words in the passage, as well as other lexical and semantic features that have already proven successful in a recent Semantic Textual Similarity (STS) task. We implemented a system named PRSYS to validate our approach in different languages. Our experimental evaluations show performance comparable to that of other similar, strong-performing systems.
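
To make the idea of ranking passages by question-passage similarity features more concrete, the following is a minimal sketch under stated assumptions. It is not the PRSYS implementation and does not reproduce the paper's dependency-degree metric, which the abstract does not define. It computes plain n-gram overlap ratios as features and combines them with hand-set weights that stand in for a trained linear model such as an SVM. All function names and the weight values are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_overlap(question, passage, n):
    """Fraction of the question's n-grams that also occur in the passage.

    Assumption: whitespace tokenisation and lowercasing are enough for
    this illustration; a real system would use a proper tokenizer.
    """
    q_counts = Counter(ngrams(question.lower().split(), n))
    p_counts = Counter(ngrams(passage.lower().split(), n))
    total = sum(q_counts.values())
    if total == 0:
        return 0.0  # the question is shorter than n tokens
    matched = sum(min(count, p_counts[gram]) for gram, count in q_counts.items())
    return matched / total


def features(question, passage, max_n=3):
    """Feature vector: overlap ratios for n = 1 .. max_n."""
    return [ngram_overlap(question, passage, n) for n in range(1, max_n + 1)]


# Hypothetical weights standing in for a learned linear model;
# longer matching n-grams are rewarded more heavily.
WEIGHTS = [1.0, 2.0, 3.0]


def rank_passages(question, passages, weights=WEIGHTS):
    """Sort passages by the weighted sum of their features, best first."""
    def score(passage):
        feats = features(question, passage, len(weights))
        return sum(w * f for w, f in zip(weights, feats))
    return sorted(passages, key=score, reverse=True)
```

In a learned setting, the feature vectors would be fed to a trained ranker in place of the fixed weights; the feature extraction step would stay the same.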

Notes

  1. http://trec.nist.gov/.

  2. http://www.clef-initiative.eu/.

  3. http://sourceforge.net/projects/jirs/.

  4. http://svmlight.joachims.org/.

  5. https://ec.europa.eu/jrc/en/language-technologies/jrc-acquis.

  6. http://www.europarl.europa.eu/.

Author information

Corresponding author

Correspondence to Nouha Othman.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Othman, N., Faiz, R. (2016). A Multi-lingual Approach to Improve Passage Retrieval for Automatic Question Answering. In: Métais, E., Meziane, F., Saraee, M., Sugumaran, V., Vadera, S. (eds) Natural Language Processing and Information Systems. NLDB 2016. Lecture Notes in Computer Science, vol 9612. Springer, Cham. https://doi.org/10.1007/978-3-319-41754-7_11

  • DOI: https://doi.org/10.1007/978-3-319-41754-7_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41753-0

  • Online ISBN: 978-3-319-41754-7

  • eBook Packages: Computer Science, Computer Science (R0)
