Deep Learning MT and Logos Model

  • Bernard Scott
Part of the Machine Translation: Technologies and Applications book series (MATRA, volume 2)


In this chapter we compare the 45-year-old Logos Model with AI’s deep learning technology and the neural machine translation (NMT) technology that deep learning has given rise to. At a strictly computational level, Logos Model bears no relationship to NMT, but we point out a number of ways in which Logos Model may nevertheless be seen to have anticipated NMT, specifically at the level of architecture and function. We take note of the fact that NMT has drifted away from interest in the biological verisimilitude of its models, and we note what experts say about the negative effect this has had on so-called continual machine learning (where new learning does not interfere with old learning, an obviously vital requirement in MT). We discuss the related need for generalizations in MT learning, generalizations that are semantic as well as syntactic, akin to the function exhibited by the brain in its continual learning and processing of language. Our discussion turns on a particular point that experts at Google DeepMind are making about continual learning, one they say AI has overlooked and that, from our perspective, bears critically on MT. It concerns the way the declarative, similarity-based operations of the hippocampus complement the more analytical, procedure-based operations of the neocortex to support continual learning. Most telling in this regard is their assertion that hippocampal learning is more than “item specific” and that, to the contrary, it exhibits distinct powers of semantic generalization. We note with satisfaction how this assertion about the complementary nature of hippocampal-neocortical learning comports with the analogical/analytical aspects of language processing in Logos Model. The very name and nature of SAL pattern-rules in Logos Model suggest this complementarity.
We contend that the views of these deep learning experts provide indirect neuroscientific support for an MT methodology that affords continual, complexity-free learning, one predicated upon hippocampal/neocortex-like generalizations (viz., semantico-syntactic patterns). The chapter concludes with a Logos Model exercise illustrating the effectiveness of these declarative, hippocampal-like processes for MT.


  1. Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: Oral presentation at the 3rd international conference on learning representations (ICLR 2015), San Diego. Accessed 26 Nov 2016
  2. Cho K (2015) Introduction to neural machine translation with GPUs (part 1). Accessed 24 June 2016
  3. Cho K, van Merriënboer B, Bahdanau D, Bengio Y (2014) On the properties of neural machine translation: encoder-decoder approaches. In: Proceedings of the eighth workshop on syntax, semantics and structure in statistical translation (SSST-8), Doha, pp 103–111
  4. Chomsky N (1990) On formalization and formal linguistics. Nat Lang Linguist Theory 8:143–147
  5. Deeplearning4j Development Team (2016) Introduction to deep neural networks. Accessed 14 Aug 2016
  6. Dettmers T (2015) Deep learning in a nutshell: core concepts. Internet blog. Accessed 8 June 2016
  7. Fillmore C (1968) The case for case. In: Bach E, Harms RT (eds) Universals in linguistic theory. Holt, Rinehart and Winston, New York/London, pp 1–88
  8. Fischer K, Ágel V (2010) Dependency grammar and valency theory. In: The Oxford handbook of linguistic analysis. Oxford University Press, Oxford, pp 223–255
  9. Goldberg AE (2009) The nature of generalization in language. Cogn Linguist 20(1):93–127
  10. Guise KG, Shapiro M (2017) Medial prefrontal cortex reduces memory interference by modifying hippocampal encoding. Neuron 94(1):183–192
  11. Hassabis D, Kumaran D, Summerfield C, Botvinick M (2017) Neuroscience-inspired artificial intelligence. Neuron 95(2):245–258
  12. Hayakawa SI, Hayakawa AR (1991) Language in thought and action, 5th edn. Houghton Mifflin Harcourt, New York
  13. Kalchbrenner N, Blunsom P (2013) Recurrent convolutional neural networks for discourse compositionality. In: Proceedings of the 2013 workshop on continuous vector space models and their compositionality, Sofia, pp 119–126
  14. Kalchbrenner N, Grefenstette E, Blunsom P (2014) A convolutional neural network for modelling sentences. In: Proceedings of the 52nd annual meeting of the association for computational linguistics, Baltimore, pp 655–665
  15. Knowlton BJ, Squire LR (1993) The learning of categories: parallel brain systems for item memory and category knowledge. Science 262(5140):1747–1749
  16. Koehn P (2011) Statistical machine translation. Cambridge University Press, Cambridge
  17. Koehn P, Knowles R (2017) Six challenges for neural machine translation. In: Proceedings of the first workshop on neural machine translation, Vancouver, pp 26–39. arXiv:1706.03872v1. Accessed 13 Dec 2017
  18. Kumaran D, McClelland JL (2012) Generalization through the recurrent interaction of episodic memories: a model of the hippocampal system. Psychol Rev 119(3):573–616
  19. Kumaran D, Hassabis D, McClelland JL (2016) What learning systems do intelligent agents need? Complementary learning systems theory updated. Trends Cogn Sci 20(7). Accessed 12 Jan 2017
  20. Kurzweil R (2013) How to create a mind: the secret of human thought revealed. Penguin Books, New York
  21. Liu S, Yang N, Li M, Zhou M (2014) A recursive recurrent neural network for statistical machine translation. In: Proceedings of the 52nd annual meeting of the association for computational linguistics, Baltimore, pp 1491–1500
  22. Marblestone AH, Wayne G, Kording KP (2016) Toward an integration of deep learning and neuroscience. Front Comput Neurosci 10(19)
  23. McClelland JL, McNaughton BL, O’Reilly RC (1995) Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol Rev 102(3):419–457
  24. Meng F, Lu Z, Wang M, Li H, Jiang W, Liu Q (2015) Encoding source language with convolutional neural network for machine translation. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing, vol 1, Long Papers, Beijing, pp 20–30
  25. Palmer DC (2006) On Chomsky’s appraisal of Skinner’s Verbal Behavior: a half-century of misunderstanding. Behav Anal 29(2):253–267
  26. Pothos EM (2007) Theories of artificial grammar learning. Psychol Bull 133:227–244
  27. Pulvermüller F (2013) How neurons make meaning: brain mechanisms for embodied and abstract-symbolic semantics. Trends Cogn Sci 17(9):458–470. Accessed 13 Dec 2015
  28. Sanborn AN, Chater N (2016) Bayesian brains without probabilities. Trends Cogn Sci 20(12):883–893. Accessed 6 Feb 2017
  29. Scott B (1989) The Logos system. In: Proceedings of MT summit II, Munich, pp 137–142
  30. Scott B (1990) Biological neural net for parsing long, complex sentences. Logos Corporation publication
  31. Scott B (2003) Logos model: an historical perspective. Mach Transl 18(1):1–72
  32. Sennrich R, Haddow B (2016) Linguistic input features improve neural machine translation. arXiv:1606.02892v2 [cs.CL]. Accessed 15 Aug 2017
  33. Toral A, Sánchez-Cartagena VM (2017) A multifaceted evaluation of neural versus phrase-based machine translation for 9 language directions. In: Proceedings of the 15th conference of the European chapter of the association for computational linguistics, vol 1, Long Papers, Valencia, pp 1063–1073. arXiv:1701.02901 [cs.CL]
  34. Zhang J, Ye L (2010) Series feature aggregation for content-based image retrieval. Comput Electr Eng 36(4):691–701

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Bernard Scott
  1. Tarpon Springs, USA