Abstract
Formal analogies, that is, proportional analogies involving relations at a formal level (e.g. cordially is to cordial as appreciatively is to appreciative), have a long history in linguistics. They can accommodate a wide variety of linguistic data without resorting to ad hoc representations and are inherently good at capturing long-range dependencies between data. Unfortunately, applying analogical learning on top of formal analogy to current Natural Language Processing (NLP) tasks, which often involve massive amounts of data, is quite challenging. In this chapter, we draw on our previous work and identify some issues that remain to be addressed for formal analogy to stand on its own in the landscape of NLP. As a case study, we monitor our current implementation of analogical learning on a task of transliterating English proper names into Chinese.
Notes
- 1. We also use \([x : y :: z : t]\) as a predicate.
- 2. A solver typically produces several analogical solutions, among which a few are valid.
- 3. Anagram forms do not have to be considered separately.
- 4. Possibly involving filtering.
- 5. Typical values are min = max = 3, \(\kappa = 20\,000\), and \(\eta = 5\,000\).
- 6. The time measurements reported in this study were made on an ordinary desktop computer and are provided for illustration purposes only.
- 7.
- 8. http://en.wikipedia.org/wiki/Template:Transcription_into_Chinese. We could not segment lessan with this table.
- 9. For the English-into-Chinese transliteration task we consider, there is only one reference for each test form.
- 10. In the samp-tc configuration, 20 test forms receive several solutions with the same maximum frequency, among them the sanctioned transliteration. We credit the system with a rank 1 solution in those cases, even if the correct transliteration is not listed first.
- 11. We downloaded the script news_evaluation.py at http://translit.i2r.a-star.edu.sg/news2009/evaluation/.
- 12. With the exception of the way we broke ties in the infrequent cases where several solutions are produced with the top frequency.
- 13. 17 features were considered, based on a greedy search over the feature space, minimizing the training error rate over 100 epochs.
- 14. We chose this example only because the number of analogies involved is small enough.
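To make the predicate notation of note 1 concrete, the following Python sketch tests whether four strings stand in a formal analogy. It assumes the shuffle-based definition commonly used in the literature on analogies over words (the analogy holds when some string is both an interleaving of x and t and an interleaving of y and z); this section does not restate the chapter's own definition, so the criterion and the function name `is_formal_analogy` are assumptions made for illustration only.

```python
from functools import lru_cache


def is_formal_analogy(x: str, y: str, z: str, t: str) -> bool:
    """Check [x : y :: z : t] under the (assumed) shuffle-based definition.

    The predicate holds if some string w is both an interleaving of x and t
    and an interleaving of y and z.
    """
    if len(x) + len(t) != len(y) + len(z):
        return False

    @lru_cache(maxsize=None)
    def reachable(i: int, j: int, k: int) -> bool:
        # i, j, k symbols of x, y, z have been consumed so far.
        # The number of symbols consumed from t is implied: i + l == j + k.
        l = j + k - i
        if i == len(x) and l == len(t):
            return True
        # The next symbol of w is taken from x or from t ...
        candidates = []
        if i < len(x):
            candidates.append((x[i], i + 1))
        if l < len(t):
            candidates.append((t[l], i))
        # ... and must also be the next unread symbol of y or of z.
        for symbol, next_i in candidates:
            if j < len(y) and y[j] == symbol and reachable(next_i, j + 1, k):
                return True
            if k < len(z) and z[k] == symbol and reachable(next_i, j, k + 1):
                return True
        return False

    return reachable(0, 0, 0)
```

With the example from the abstract, `is_formal_analogy("cordially", "cordial", "appreciatively", "appreciative")` holds, since "cordial" + "appreciative" + "ly" interleaves both pairs. The memoized search visits at most one state per triple (i, j, k), which is fine for single words but gives no indication of how the chapter's implementation scales to large datasets.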
Acknowledgments
This work has been partially funded by the Natural Sciences and Engineering Research Council of Canada (NSERC). We are grateful to the anonymous reviewers of the short paper submitted to the 2012 SAMAI workshop, as well as to those who reviewed this article. We found one review in particular especially inspiring.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Langlais, P., Yvon, F. (2014). Issues in Analogical Inference Over Sequences of Symbols: A Case Study on Proper Name Transliteration. In: Prade, H., Richard, G. (eds) Computational Approaches to Analogical Reasoning: Current Trends. Studies in Computational Intelligence, vol 548. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54516-0_3
DOI: https://doi.org/10.1007/978-3-642-54516-0_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54515-3
Online ISBN: 978-3-642-54516-0
eBook Packages: Engineering, Engineering (R0)