Tiburon: A Weighted Tree Automata Toolkit

  • Conference paper
Implementation and Application of Automata (CIAA 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4094)

Abstract

The availability of weighted finite-state string automata toolkits has made possible great advances in natural language processing. However, recent syntax-based NLP models are poorly suited to these toolkits, which operate on strings rather than trees. To address this problem, we introduce a weighted finite-state tree automata toolkit that incorporates recent developments in weighted tree automata theory and is useful for natural language applications such as machine translation, sentence compression, and question answering.

The authors wish to thank Steve DeNeefe, Jonathan Graehl, Mark Hopkins, Liang Huang, Daniel Marcu, and Magnus Steinby for their advice and comments. This work was partially supported by NSF grant IIS-0428020 and by GALE-DARPA Contract HR0011-06-C-0022.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

May, J., Knight, K. (2006). Tiburon: A Weighted Tree Automata Toolkit. In: Ibarra, O.H., Yen, HC. (eds) Implementation and Application of Automata. CIAA 2006. Lecture Notes in Computer Science, vol 4094. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11812128_11

  • DOI: https://doi.org/10.1007/11812128_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-37213-4

  • Online ISBN: 978-3-540-37214-1

  • eBook Packages: Computer Science (R0)
