
Automatic Construction of FSA Language Model for Speech Recognition by FSA DP-Matching

  • Chapter
Trends in Intelligent Systems and Computer Engineering

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 6))

For accurate speech recognition, a well-defined language model is necessary. When a large amount of learning text (a corpus) is available, a statistical language model such as a bigram or trigram model estimated from the corpus is quite powerful; for example, most current dictation systems employ bigram or trigram language models. However, if the corpus is not sufficiently large, the reliability of the statistics estimated from it decreases, and so does the effectiveness of the resulting statistical language model (the sparseness problem). Furthermore, preparing a sufficient amount of text for spoken language is generally very expensive. Therefore, a finite state automaton (FSA) language model is commonly used for small- or medium-vocabulary (around 1000 words) speech recognition. However, defining an FSA model by hand requires considerable human effort. In some systems, an FSA model is generated automatically from regular-grammar rules (e.g., see HParse [1]), but preparing a grammar with sufficient coverage and consistency remains a very time- and effort-consuming task.
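The sparseness problem described above can be seen in miniature: with a small corpus, most word pairs never occur, so their maximum-likelihood bigram probabilities collapse to zero. The following sketch is purely illustrative (the toy corpus and the `bigram_prob` helper are hypothetical, not from the chapter):

```python
from collections import Counter

# A toy two-sentence corpus: with so little data, almost every possible
# word pair is unobserved, so its ML bigram probability is zero.
corpus = [
    "i want a ticket to kyoto".split(),
    "i want a room for tonight".split(),
]

unigrams = Counter(w for s in corpus for w in s)
bigrams = Counter(p for s in corpus for p in zip(s, s[1:]))

def bigram_prob(prev, word):
    """Maximum-likelihood estimate of P(word | prev); zero for unseen pairs."""
    return bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0

print(bigram_prob("want", "a"))      # observed in both sentences
print(bigram_prob("ticket", "for"))  # never observed -> zero probability
```

In practice such zeros are patched with smoothing, but with a corpus of only around a thousand sentences the estimates remain unreliable, which is what motivates the FSA approach taken in this chapter.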

In this chapter, we propose a new method for constructing an FSA language model using an FSA DP (dynamic programming) matching method. We also report experimental results: the method was applied to a travel conversation corpus of about a thousand sentences to generate an FSA model, and speech recognition experiments using that language model were then conducted. The results show that the recognition correct rate is high enough for closed data but less satisfactory for open data. To cope with this problem, we also propose an additional mechanism that decides whether a recognized result should be accepted or rejected by evaluating the distance of the result from the learning corpus. With this mechanism applied, the recognition correct rate for accepted results improves considerably.
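The abstract does not give the chapter's exact formulation, but its two building blocks — DP matching between word sequences and a distance-based accept/reject decision — can be sketched as follows. The `threshold` value and the minimum-distance-to-corpus criterion are assumptions for illustration, not the chapter's actual mechanism:

```python
def dp_distance(hyp, ref):
    """Word-level edit distance between two word sequences, computed by DP."""
    n, m = len(hyp), len(ref)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i  # delete all of hyp[:i]
    for j in range(m + 1):
        d[0][j] = j  # insert all of ref[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # match / substitution
    return d[n][m]

def accept(result, corpus, threshold=2):
    """Accept a recognized result only if some corpus sentence is close enough.

    A hypothetical rejection rule: compare the result against every learning
    sentence and accept when the minimum DP distance is below a threshold.
    """
    return min(dp_distance(result, s) for s in corpus) <= threshold
```

For example, with a corpus containing "i want a ticket", the hypothesis "i want a room" (one substitution away) would be accepted, while an unrelated sentence would be rejected; a real system would likely normalize the distance by sentence length before thresholding.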


References

  1. S. Young et al.: The HTK Book (for Ver. 3.0), 1999 (http://htk.eng.cam.ac.uk/)

  2. K. J. Lang, B. A. Pearlmutter, and R. Price: Results of the Abbadingo one DFA learning competition and a new evidence-driven state merging algorithm, Proceedings of the International Colloquium on Grammatical Inference, pp. 1–12, 1998

  3. S. M. Lucas and T. J. Reynolds: Learning deterministic finite automata with a smart state labeling evolutionary algorithm, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, No. 7, 2005

  4. C. Kermorvant, C. de la Higuera, and P. Dupont: Learning typed automata from automatically labeled data, Journal Électronique d’Intelligence Artificielle, Vol. 6, No. 45, 2004

  5. J. Hu, W. Turin, and M. K. Brown: Language modeling with stochastic automata, Proceedings of the International Conference on Spoken Language Processing (ICSLP), 1996

  6. G. Riccardi, R. Pieraccini, and E. Bocchieri: Stochastic automata for language modeling, Computer Speech and Language, Vol. 10, No. 4, pp. 265–293, 1996

  7. A. V. Aho, J. E. Hopcroft, and J. D. Ullman: Data Structures and Algorithms, Addison-Wesley, 1983

  8. T. Kawahara, A. Lee, K. Takeda, K. Itou, and K. Shikano: Recent progress of open-source LVCSR engine Julius and Japanese model repository: software of the Continuous Speech Recognition Consortium, Proceedings of the International Conference on Spoken Language Processing (ICSLP), 2004 (http://julius.sourceforge.jp/en/julius.html)


Copyright information

© 2008 Springer Science+Business Media, LLC


Cite this chapter

Morimoto, T., Takahashi, Sy. (2008). Automatic Construction of FSA Language Model for Speech Recognition by FSA DP-Matching. In: Castillo, O., Xu, L., Ao, SI. (eds) Trends in Intelligent Systems and Computer Engineering. Lecture Notes in Electrical Engineering, vol 6. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-74935-8_36


  • DOI: https://doi.org/10.1007/978-0-387-74935-8_36

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-74934-1

  • Online ISBN: 978-0-387-74935-8

