Weighted Automata Algorithms

Handbook of Weighted Automata

Abstract

Weighted automata and transducers are widely used in modern applications in bioinformatics and in text, speech, and image processing. This chapter describes several fundamental weighted automata and shortest-distance algorithms, including composition, determinization, minimization, and synchronization, as well as single-source and all-pairs shortest-distance algorithms over general semirings. It presents the pseudocode of these algorithms, analyzes their running-time complexity, and illustrates their use in some simple cases. Many other complex weighted automata and transducer algorithms used in practice can be obtained by combining these core algorithms.


Author information

Corresponding author

Correspondence to Mehryar Mohri.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Mohri, M. (2009). Weighted Automata Algorithms. In: Droste, M., Kuich, W., Vogler, H. (eds) Handbook of Weighted Automata. Monographs in Theoretical Computer Science. An EATCS Series. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01492-5_6

  • DOI: https://doi.org/10.1007/978-3-642-01492-5_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01491-8

  • Online ISBN: 978-3-642-01492-5

  • eBook Packages: Computer Science, Computer Science (R0)
