Abstract
This paper deals with case-based machine translation. It builds on previous work that uses proportional analogy on strings, i.e., a quaternary relation expressing that “string \(A\) is to string \(B\) as string \(C\) is to string \(D\)”. The first contribution of this paper is a rewording of that work in terms of case-based reasoning: a case is a problem-solution pair \((A, A')\), where \(A\) is a sentence in an origin language and \(A'\) is its translation in the destination language. First, three cases \((A, A')\), \((B, B')\), \((C, C')\) such that “\(A\) is to \(B\) as \(C\) is to the target problem \(D\)” are retrieved. Then, the analogical equation in the destination language, “\(A'\) is to \(B'\) as \(C'\) is to \(x\)”, is solved, and \(D'=x\) is a suggested translation of \(D\). Although it does not involve any linguistic knowledge, this approach was effective and gave competitive results at the time it was proposed. The second contribution examines how this prior knowledge-light approach to case-based machine translation could be improved by using additional pieces of knowledge associated with cases, domain knowledge, retrieval knowledge, and adaptation knowledge, as well as other principles or techniques from case-based reasoning and natural language processing.
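The retrieve-then-solve loop described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the solver below is a deliberately restricted assumption that only handles analogies in which \(A\) and \(B\) differ by one contiguous run of words, and retrieval is a brute-force scan over all ordered triples of cases. The names `solve_analogy`, `translate`, and `CASES`, as well as the example sentences, are hypothetical and introduced only for illustration.

```python
from itertools import permutations


def common_affixes(a, b):
    """Strip the longest shared prefix and suffix; return the differing middles."""
    p = 0
    while p < min(len(a), len(b)) and a[p] == b[p]:
        p += 1
    s = 0
    while s < min(len(a), len(b)) - p and a[-1 - s] == b[-1 - s]:
        s += 1
    return a[p:len(a) - s], b[p:len(b) - s]


def solve_analogy(a, b, c):
    """Solve "a is to b as c is to x" on token lists.

    Restricted sketch: a and b must differ by one contiguous substitution,
    and the replaced segment must occur exactly once in c. Returns None
    when this simplified solver finds no unambiguous solution.
    """
    old, new = common_affixes(a, b)
    if not old:
        # a == b gives x = c; a pure insertion has no unambiguous position in c.
        return list(c) if not new else None
    hits = [i for i in range(len(c) - len(old) + 1) if c[i:i + len(old)] == old]
    if len(hits) != 1:
        return None
    i = hits[0]
    return c[:i] + new + c[i + len(old):]


def translate(sentence, cases):
    """Return the set of translations suggested for `sentence` by analogy."""
    d = sentence.split()
    suggestions = set()
    # Retrieval: every ordered triple of distinct cases with A : B :: C : D.
    for (a, a2), (b, b2), (c, c2) in permutations(cases, 3):
        if solve_analogy(a.split(), b.split(), c.split()) != d:
            continue
        # Adaptation: solve A' : B' :: C' : x in the destination language.
        x = solve_analogy(a2.split(), b2.split(), c2.split())
        if x is not None:
            suggestions.add(" ".join(x))
    return suggestions


# Hypothetical toy case base of (source sentence, translation) pairs.
CASES = [
    ("I like tea", "j' aime le thé"),
    ("I like coffee", "j' aime le café"),
    ("you drink tea", "tu bois le thé"),
]

if __name__ == "__main__":
    print(translate("you drink coffee", CASES))
```

With this toy case base, two triples satisfy the source-side analogy for "you drink coffee", and both lead to the same suggestion, "tu bois le café". The cubic scan over triples is only workable for a handful of cases; a realistic system would need a far more selective retrieval step and a general solver for analogical equations on strings.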
The first author is supported by a JSPS Grant-In-Aid 18K11447: “Self-explainable and fast-to-train example-based machine translation using neural networks.”
Notes
- 1. In this section, the ideas are illustrated in propositional logic, but could easily be expressed in other formalisms.
- 2. E.g., https://www.matecat.com/. The authors of this paper are currently working on a slightly different scenario and are collecting such correction cases for use in a case-based correction system.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Lepage, Y., Lieber, J. (2018). Case-Based Translation: First Steps from a Knowledge-Light Approach Based on Analogy to a Knowledge-Intensive One. In: Cox, M., Funk, P., Begum, S. (eds) Case-Based Reasoning Research and Development. ICCBR 2018. Lecture Notes in Computer Science, vol 11156. Springer, Cham. https://doi.org/10.1007/978-3-030-01081-2_37
DOI: https://doi.org/10.1007/978-3-030-01081-2_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01080-5
Online ISBN: 978-3-030-01081-2
eBook Packages: Computer Science, Computer Science (R0)