Abstract
The stochastic extension of formal translations constitutes a suitable framework for dealing with many problems in Syntactic Pattern Recognition. Several criteria have already been proposed and developed for estimating the parameters of Regular Syntax-Directed Translation Schemata. Here, a new criterion is proposed for dealing with situations in which training data is sparse. This criterion is based on entropy measurements, is loosely inspired by the Maximum Mutual Information criterion, and takes into account the possibility of ambiguity in translations (i.e., the translation model may yield different output strings for a single input string). The goal in the stochastic framework is to find the most probable translation of a given input string. Experiments were performed on a translation task which has a high degree of ambiguity.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Picó, D., Casacuberta, F. (2000). A Statistical-Estimation Method for Stochastic Finite-State Transducers Based on Entropy Measures. In: Ferri, F.J., Iñesta, J.M., Amin, A., Pudil, P. (eds) Advances in Pattern Recognition. SSPR/SPR 2000. Lecture Notes in Computer Science, vol 1876. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44522-6_43
DOI: https://doi.org/10.1007/3-540-44522-6_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67946-2
Online ISBN: 978-3-540-44522-7
eBook Packages: Springer Book Archive