Weighted Automata Algorithms

Mohri, Mehryar

doi:10.1007/978-3-642-01492-5_6

Part of the book series: Monographs in Theoretical Computer Science. An EATCS Series (EATCS)

Abstract

Weighted automata and transducers are widely used in modern applications in bioinformatics and in text, speech, and image processing. This chapter describes several fundamental weighted automata and shortest-distance algorithms, including composition, determinization, minimization, and synchronization, as well as single-source and all-pairs shortest-distance algorithms over general semirings. It presents the pseudocode of these algorithms, analyzes their running-time complexity, and illustrates their use in some simple cases. Many other complex weighted automata and transducer algorithms used in practice can be obtained by combining these core algorithms.
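
To make "single-source shortest distance over a general semiring" concrete, the following is a minimal sketch of a queue-based relaxation scheme of the kind found in the semiring shortest-distance literature. It is not the chapter's pseudocode, which is not reproduced on this page. The names `Semiring`, `TROPICAL`, and `shortest_distance`, and the adjacency-list representation of arcs, are assumptions made for illustration. The sketch assumes a semiring in which the relaxation loop terminates, for example the tropical semiring (min, +) with non-negative weights.

```python
import math
from collections import deque
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List, Tuple


@dataclass(frozen=True)
class Semiring:
    """Hypothetical container for the semiring operations and identities."""
    plus: Callable[[float, float], float]   # the "sum" operation
    times: Callable[[float, float], float]  # the "product" operation
    zero: float                             # identity for plus
    one: float                              # identity for times


# Tropical semiring: (min, +, inf, 0), which yields classical shortest paths.
TROPICAL = Semiring(plus=min, times=lambda a, b: a + b, zero=math.inf, one=0.0)

# arcs[q] lists (destination, weight) pairs for the transitions leaving q.
Arcs = Dict[Hashable, List[Tuple[Hashable, float]]]


def shortest_distance(arcs: Arcs, source: Hashable, sr: Semiring) -> Dict[Hashable, float]:
    """Return d[q], the semiring sum of the weights of all paths source -> q."""
    states = set(arcs) | {dst for out in arcs.values() for dst, _ in out}
    states.add(source)
    d = {q: sr.zero for q in states}  # current distance estimates
    r = {q: sr.zero for q in states}  # weight added to d[q] since q was last dequeued
    d[source] = r[source] = sr.one
    queue = deque([source])
    queued = {source}
    while queue:
        q = queue.popleft()
        queued.discard(q)
        pending, r[q] = r[q], sr.zero
        for dst, w in arcs.get(q, []):
            contrib = sr.times(pending, w)
            updated = sr.plus(d[dst], contrib)
            if updated != d[dst]:  # relax only if the estimate changes
                d[dst] = updated
                r[dst] = sr.plus(r[dst], contrib)
                if dst not in queued:
                    queue.append(dst)
                    queued.add(dst)
    return d


example_arcs: Arcs = {0: [(1, 1.0), (2, 5.0)], 1: [(2, 2.0)], 2: []}
distances = shortest_distance(example_arcs, 0, TROPICAL)
```

In this example, state 2 is reached more cheaply through state 1 (1.0 + 2.0) than directly (5.0). Swapping in a different `Semiring` instance changes what "distance" means without touching the loop, which is the point of stating the problem over general semirings.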

Author information

Authors and Affiliations

Courant Institute of Mathematical Sciences, 251 Mercer Street, New York, NY, 10012, USA
Mehryar Mohri
Google Research, 76 Ninth Avenue, New York, NY, 10011, USA
Mehryar Mohri

Corresponding author

Correspondence to Mehryar Mohri.

Editor information

Editors and Affiliations

Inst. Informatik, Universität Leipzig, Augustusplatz 10-11, Leipzig, 04109, Germany
Manfred Droste
Institut für Diskrete, TU Wien, Wiedner Hauptstr. 8-10, Wien, 1040, Austria
Werner Kuich
Fak. Informatik, TU Dresden, Nöthnitzer Str. 46, Dresden, 01187, Germany
Heiko Vogler

About this chapter

Cite this chapter

Mohri, M. (2009). Weighted Automata Algorithms. In: Droste, M., Kuich, W., Vogler, H. (eds) Handbook of Weighted Automata. Monographs in Theoretical Computer Science. An EATCS Series. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01492-5_6

DOI: https://doi.org/10.1007/978-3-642-01492-5_6
Published: 16 September 2009
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01491-8
Online ISBN: 978-3-642-01492-5
eBook Packages: Computer Science, Computer Science (R0)
