Improving Moore’s Sentence Alignment Method Using Bilingual Word Clustering

  • Conference paper
Knowledge and Systems Engineering

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 244)

Abstract

Sentence alignment plays an extremely important role in machine translation. Most hybrid approaches suffer from either low recall or low precision. We address the disadvantages of several novel sentence alignment approaches that combine length-based and word-correspondence information. Our method applies word clustering to improve the quality of the sentence aligner, especially when dealing with the sparse data problem. Our approach overcomes the limits of previous hybrid methods and obtains both high recall and reasonable precision.
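As a rough illustration of how word clusters could ease the sparse data problem, the sketch below backs off to cluster neighbours when a source word is missing from a translation lexicon. This is not the authors' implementation: the abstract gives no algorithmic detail, so the function name `lookup_translation_prob`, the lexicon layout, the cluster map, and the averaging rule are all assumptions made for illustration.

```python
from typing import Dict, Optional


def lookup_translation_prob(
    src: str,
    tgt: str,
    lexicon: Dict[str, Dict[str, float]],
    src_cluster: Dict[str, int],
) -> Optional[float]:
    """Return a translation score for (src, tgt), or None if nothing is known.

    Hypothetical back-off rule: if `src` is not in the lexicon, average the
    scores of lexicon words that share its cluster.
    """
    # Direct hit: the source word itself has lexicon entries.
    if src in lexicon:
        return lexicon[src].get(tgt)

    # Back-off (assumption): use other words from the same cluster.
    cluster = src_cluster.get(src)
    if cluster is None:
        return None
    scores = [
        entries[tgt]
        for word, entries in lexicon.items()
        if src_cluster.get(word) == cluster and tgt in entries
    ]
    return sum(scores) / len(scores) if scores else None


# Toy data, invented for this example only.
lexicon = {"car": {"voiture": 0.8}, "auto": {"voiture": 0.6}}
src_cluster = {"car": 1, "auto": 1, "automobile": 1}

seen = lookup_translation_prob("car", "voiture", lexicon, src_cluster)
unseen = lookup_translation_prob("automobile", "voiture", lexicon, src_cluster)
unknown = lookup_translation_prob("bike", "voiture", lexicon, src_cluster)
```

Here "automobile" never appears in the lexicon, yet it still receives a score because it shares a cluster with "car" and "auto"; a word with no cluster at all, such as "bike", stays unknown. Averaging is only one possible choice, and a real aligner might weight cluster neighbours differently.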

Author information

Corresponding author

Correspondence to Hai-Long Trieu.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Trieu, HL., Nguyen, PT., Nguyen, KA. (2014). Improving Moore’s Sentence Alignment Method Using Bilingual Word Clustering. In: Huynh, V., Denoeux, T., Tran, D., Le, A., Pham, S. (eds) Knowledge and Systems Engineering. Advances in Intelligent Systems and Computing, vol 244. Springer, Cham. https://doi.org/10.1007/978-3-319-02741-8_14

  • DOI: https://doi.org/10.1007/978-3-319-02741-8_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-02740-1

  • Online ISBN: 978-3-319-02741-8

  • eBook Packages: Engineering, Engineering (R0)
