Abstract
Weighted automata and transducers are widely used in modern applications in bioinformatics and in text, speech, and image processing. This chapter describes several fundamental weighted automata and shortest-distance algorithms, including composition, determinization, minimization, and synchronization, as well as single-source and all-pairs shortest-distance algorithms over general semirings. It presents the pseudocode of these algorithms, analyzes their running-time complexity, and illustrates their use in some simple cases. Many of the more complex weighted automata and transducer algorithms used in practice can be obtained by combining these core algorithms.
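To make "shortest distance over a general semiring" concrete, here is a minimal sketch of a single-source computation using queue-based relaxation. It is an illustration written for this summary, not the chapter's pseudocode. The names `Semiring`, `TROPICAL`, and `shortest_distance` are hypothetical, and the sketch assumes either an acyclic graph or a semiring such as the tropical semiring with non-negative weights, where the relaxation terminates.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    """Hypothetical container for the two operations and their identities."""
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]
    zero: Any
    one: Any


# Tropical semiring: (min, +) with identities +infinity and 0.
TROPICAL = Semiring(plus=min, times=lambda a, b: a + b,
                    zero=float("inf"), one=0.0)


def shortest_distance(num_states, arcs, source, sr):
    """Return d, where d[q] is the semiring sum of the weights of all paths
    from source to q. arcs maps a state to a list of (next_state, weight).
    Assumes termination (acyclic graph, or e.g. tropical weights >= 0)."""
    d = [sr.zero] * num_states   # current distance estimates
    r = [sr.zero] * num_states   # weight added to d[q] since q was last dequeued
    d[source] = r[source] = sr.one
    queue = deque([source])
    queued = {source}
    while queue:
        q = queue.popleft()
        queued.discard(q)
        pending, r[q] = r[q], sr.zero
        for nxt, w in arcs.get(q, []):
            contrib = sr.times(pending, w)
            updated = sr.plus(d[nxt], contrib)
            if updated != d[nxt]:
                d[nxt] = updated
                r[nxt] = sr.plus(r[nxt], contrib)
                if nxt not in queued:
                    queue.append(nxt)
                    queued.add(nxt)
    return d


# Usage: a small graph with four states and non-negative tropical weights.
example_arcs = {0: [(1, 1.0), (2, 4.0)], 1: [(2, 2.0)], 2: [(3, 1.0)]}
distances = shortest_distance(4, example_arcs, 0, TROPICAL)
```

Swapping in a different `Semiring` instance changes what "distance" means without touching the relaxation loop, which is the point of stating these algorithms over general semirings.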
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Mohri, M. (2009). Weighted Automata Algorithms. In: Droste, M., Kuich, W., Vogler, H. (eds) Handbook of Weighted Automata. Monographs in Theoretical Computer Science. An EATCS Series. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01492-5_6
DOI: https://doi.org/10.1007/978-3-642-01492-5_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01491-8
Online ISBN: 978-3-642-01492-5
eBook Packages: Computer Science, Computer Science (R0)