Case-Based Translation: First Steps from a Knowledge-Light Approach Based on Analogy to a Knowledge-Intensive One

  • Conference paper
Case-Based Reasoning Research and Development (ICCBR 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11156)

Abstract

This paper deals with case-based machine translation. It builds on previous work that uses proportional analogy on strings, i.e., a quaternary relation expressing that “String A is to string B as string C is to string D”. The first contribution of this paper is the rewording of this work in terms of case-based reasoning: a case is a problem-solution pair \((A, A')\) where A is a sentence in an origin language and \(A'\) is its translation in the destination language. First, three cases \((A, A')\), \((B, B')\), \((C, C')\) such that “A is to B as C is to the target problem D” are retrieved. Then, the analogical equation in the destination language “\(A'\) is to \(B'\) as \(C'\) is to x” is solved, and \(D'=x\) is a suggested translation of D. Although it does not involve any linguistic knowledge, this approach was effective and gave competitive results at the time it was proposed. The second contribution examines how this prior knowledge-light case-based machine translation approach could be improved by using additional knowledge (knowledge associated with cases, domain knowledge, retrieval knowledge, and adaptation knowledge) together with other principles or techniques from case-based reasoning and natural language processing.
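The retrieve-then-solve loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' solver: `solve_analogy` is a hypothetical, heavily restricted stand-in that only handles analogies where A and B differ by one contiguous run of words that occurs exactly once in C, and `translate` retrieves triples by brute force over the case base. The toy English-French cases in `CASES` are invented for illustration.

```python
from itertools import permutations


def solve_analogy(a, b, c):
    """Solve 'a : b :: c : x' on word sequences (restricted, illustrative).

    Assumption: a and b differ by a single contiguous substitution, and the
    replaced words occur exactly once in c. Returns None otherwise.
    """
    a, b, c = a.split(), b.split(), c.split()
    # Length of the longest common prefix of a and b.
    p = 0
    while p < min(len(a), len(b)) and a[p] == b[p]:
        p += 1
    # Length of the longest common suffix that does not overlap the prefix.
    s = 0
    while s < min(len(a), len(b)) - p and a[len(a) - 1 - s] == b[len(b) - 1 - s]:
        s += 1
    old, new = a[p:len(a) - s], b[p:len(b) - s]
    if not old:
        return None  # pure insertion: where to insert in c is ambiguous
    hits = [i for i in range(len(c) - len(old) + 1) if c[i:i + len(old)] == old]
    if len(hits) != 1:
        return None  # replaced words absent from c, or ambiguous
    i = hits[0]
    return " ".join(c[:i] + new + c[i + len(old):])


def translate(case_base, d):
    """Return a suggested translation of d, or None if no triple applies."""
    # Retrieval (brute force): find cases with A : B :: C : d in the origin language.
    for (a, a2), (b, b2), (c, c2) in permutations(case_base, 3):
        if solve_analogy(a, b, c) == d:
            # Adaptation: solve A' : B' :: C' : x in the destination language.
            x = solve_analogy(a2, b2, c2)
            if x is not None:
                return x
    return None


# Toy case base (invented for illustration): (origin sentence, translation).
CASES = [
    ("I like tea", "J'aime le thé"),
    ("I like coffee", "J'aime le café"),
    ("You like tea", "Tu aimes le thé"),
]

print(translate(CASES, "You like coffee"))  # -> Tu aimes le café
```

The brute-force retrieval enumerates every ordered triple of cases, so it is cubic in the size of the case base and is only meant to make the two steps (retrieval in the origin language, equation solving in the destination language) concrete.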

The first author is supported by a JSPS Grant-In-Aid 18K11447: “Self-explainable and fast-to-train example-based machine translation using neural networks.”

Notes

  1. In this section, the ideas are illustrated in propositional logic, but could easily be expressed in other formalisms.

  2. E.g., https://www.matecat.com/. The authors of this paper are currently working on a slightly different scenario and are collecting such correction cases for use in a case-based correction system.

  3. https://tatoeba.org/.

Author information

Correspondence to Yves Lepage or Jean Lieber.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Lepage, Y., Lieber, J. (2018). Case-Based Translation: First Steps from a Knowledge-Light Approach Based on Analogy to a Knowledge-Intensive One. In: Cox, M., Funk, P., Begum, S. (eds) Case-Based Reasoning Research and Development. ICCBR 2018. Lecture Notes in Computer Science, vol 11156. Springer, Cham. https://doi.org/10.1007/978-3-030-01081-2_37

  • DOI: https://doi.org/10.1007/978-3-030-01081-2_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01080-5

  • Online ISBN: 978-3-030-01081-2

  • eBook Packages: Computer Science, Computer Science (R0)
