Exploiting Multiple Translation Resources for English-Persian Cross Language Information Retrieval

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8138)

Abstract

A central issue in Cross Language Information Retrieval (CLIR), and one that directly affects system performance, is how to exploit the available translation resources. The issue is more challenging for a language that lacks adequate translation resources. The degree of ambiguity of the query words also affects the performance of a CLIR system. In this paper, we propose combining different translation resources for CLIR. We also propose two methods that exploit phrases in the query translation process to resolve the ambiguity of query words. Our evaluation results on English-Persian CLIR show that the phrase-based and combined-translation methods outperform the other CLIR methods evaluated.
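
The abstract does not say how the translation resources are combined. As a rough illustration only, the sketch below assumes one common choice: each resource supplies scored translation candidates for a query word, each resource's scores are normalised, and the results are linearly interpolated. The function name `combine_translations`, the resource names, the weights, and the placeholder target terms are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def combine_translations(resources, weights):
    """Linearly interpolate per-resource translation scores for one query word.

    resources: maps a resource name to {target_term: non-negative score}.
    weights:   maps a resource name to its interpolation weight.
    Resources with no scored candidates are ignored, and the weights of the
    remaining resources are renormalised so the result sums to 1.
    """
    # Keep only resources that actually propose a translation.
    active = {name: cands for name, cands in resources.items()
              if sum(cands.values()) > 0}
    total_weight = sum(weights[name] for name in active)
    if total_weight <= 0:
        raise ValueError("no usable resource with a positive weight")

    combined = defaultdict(float)
    for name, cands in active.items():
        share = weights[name] / total_weight   # this resource's share
        norm = sum(cands.values())             # normalise its scores
        for target, score in cands.items():
            combined[target] += share * score / norm
    return dict(combined)


# Toy, made-up scores for one ambiguous English query word.
# "t_finance" and "t_shore" stand in for two candidate Persian translations.
resources = {
    "dictionary": {"t_finance": 1.0, "t_shore": 1.0},        # unweighted entries
    "parallel_corpus": {"t_finance": 0.8, "t_shore": 0.2},   # alignment-style scores
}
weights = {"dictionary": 0.5, "parallel_corpus": 0.5}

combined = combine_translations(resources, weights)
for target, prob in sorted(combined.items(), key=lambda kv: -kv[1]):
    print(f"{target}: {prob:.2f}")
```

In this toy case the dictionary alone cannot separate the two candidates, while the corpus-derived scores shift the combined distribution toward one of them. The combined weights could then be used to weight translated query terms at retrieval time; whether the paper does this is not stated in the abstract.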

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Azarbonyad, H., Shakery, A., Faili, H. (2013). Exploiting Multiple Translation Resources for English-Persian Cross Language Information Retrieval. In: Forner, P., Müller, H., Paredes, R., Rosso, P., Stein, B. (eds) Information Access Evaluation. Multilinguality, Multimodality, and Visualization. CLEF 2013. Lecture Notes in Computer Science, vol 8138. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40802-1_11

  • DOI: https://doi.org/10.1007/978-3-642-40802-1_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40801-4

  • Online ISBN: 978-3-642-40802-1

  • eBook Packages: Computer Science, Computer Science (R0)
