
Automatic Construction of FSA Language Model for Speech Recognition by FSA DP-Matching

  • Chapter
Trends in Intelligent Systems and Computer Engineering

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 6))

For accurate speech recognition, a well-defined language model is necessary. When a large amount of learning text (a corpus) is available, a statistical language model such as a bigram or trigram model estimated from the corpus is quite powerful; for example, most current dictation systems employ bigram or trigram language models. However, if the corpus is not sufficiently large, the reliability of the statistics estimated from it decreases, and so does the effectiveness of the resulting statistical language model (the sparseness problem). Furthermore, preparing a sufficient amount of text for spoken language is generally very expensive. Therefore, a finite state automaton (FSA) language model is commonly used for small- or medium-vocabulary (around 1000 words) speech recognition. However, defining an FSA model by hand requires considerable human effort. In some systems, an FSA model is generated automatically from regular-grammar rules (e.g., see HParse [1]), but preparing a grammar with sufficient coverage and consistency remains a very time- and effort-consuming task.
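The sparseness problem described above can be seen in miniature: with a small corpus, most word pairs never occur, so their maximum-likelihood bigram probabilities collapse to zero. The following sketch is purely illustrative (the toy corpus and the `bigram_prob` helper are hypothetical, not from the chapter):

```python
from collections import Counter

# A toy two-sentence corpus: with so little data, almost every possible
# word pair is unobserved, so its ML bigram probability is zero.
corpus = [
    "i want a ticket to kyoto".split(),
    "i want a room for tonight".split(),
]

unigrams = Counter(w for s in corpus for w in s)
bigrams = Counter(p for s in corpus for p in zip(s, s[1:]))

def bigram_prob(prev, word):
    """Maximum-likelihood estimate of P(word | prev); zero for unseen pairs."""
    return bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0

print(bigram_prob("want", "a"))      # observed in both sentences
print(bigram_prob("ticket", "for"))  # never observed -> zero probability
```

In practice such zeros are patched with smoothing, but with a corpus of only around a thousand sentences the estimates remain unreliable, which is what motivates the FSA approach taken in this chapter.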

In this chapter, we propose a new method for constructing an FSA language model using an FSA DP (dynamic programming) matching method. We also report experimental results: the method was applied to a travel conversation corpus of about a thousand sentences to generate an FSA model, and speech recognition experiments using that language model were then conducted. The results show that the recognition correct rate is high enough for closed data but less satisfactory for open data. To cope with this problem, we also propose an additional mechanism that decides whether a recognized result should be accepted or rejected by evaluating the distance of the result from the learning corpus. With this mechanism applied, the recognition correct rate for accepted results improves considerably.
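The abstract does not give the chapter's exact formulation, but its two building blocks — DP matching between word sequences and a distance-based accept/reject decision — can be sketched as follows. The `threshold` value and the minimum-distance-to-corpus criterion are assumptions for illustration, not the chapter's actual mechanism:

```python
def dp_distance(hyp, ref):
    """Word-level edit distance between two word sequences, computed by DP."""
    n, m = len(hyp), len(ref)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i  # delete all of hyp[:i]
    for j in range(m + 1):
        d[0][j] = j  # insert all of ref[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # match / substitution
    return d[n][m]

def accept(result, corpus, threshold=2):
    """Accept a recognized result only if some corpus sentence is close enough.

    A hypothetical rejection rule: compare the result against every learning
    sentence and accept when the minimum DP distance is below a threshold.
    """
    return min(dp_distance(result, s) for s in corpus) <= threshold
```

For example, with a corpus containing "i want a ticket", the hypothesis "i want a room" (one substitution away) would be accepted, while an unrelated sentence would be rejected; a real system would likely normalize the distance by sentence length before thresholding.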


References

  1. S. Young et al.: The HTK Book (for Ver. 3.0), 1999 (http://htk.eng.cam.ac.uk/)

  2. K. J. Lang, B. A. Pearlmutter, and R. Price: Results of the Abbadingo one DFA learning competition and a new evidence-driven state merging algorithm, Proceedings of the International Colloquium on Grammatical Inference, pp. 1–12, 1998

  3. S. M. Lucas and T. J. Reynolds: Learning deterministic finite automata with a smart state labeling evolutionary algorithm, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, No. 7, 2005

  4. C. Kermorvant, C. de la Higuera, and P. Dupont: Learning typed automata from automatically labeled data, Journal Électronique d’Intelligence Artificielle, Vol. 6, No. 45, 2004

  5. J. Hu, W. Turin, and M. K. Brown: Language modeling with stochastic automata, Proceedings of the International Conference on Spoken Language Processing (ICSLP), 1996

  6. G. Riccardi, R. Pieraccini, and E. Bocchieri: Stochastic automata for language modeling, Computer Speech and Language, Vol. 10, No. 4, pp. 265–293, 1996

  7. A. V. Aho, J. E. Hopcroft, and J. D. Ullman: Data Structures and Algorithms, Addison-Wesley, 1983

  8. T. Kawahara, A. Lee, K. Takeda, K. Itou, and K. Shikano: Recent progress of open-source LVCSR engine Julius and Japanese model repository: software of the Continuous Speech Recognition Consortium, Proceedings of the International Conference on Spoken Language Processing (ICSLP), 2004 (http://julius.sourceforge.jp/en/julius.html)


Copyright information

© 2008 Springer Science+Business Media, LLC


Cite this chapter

Morimoto, T., Takahashi, Sy. (2008). Automatic Construction of FSA Language Model for Speech Recognition by FSA DP-Matching. In: Castillo, O., Xu, L., Ao, SI. (eds) Trends in Intelligent Systems and Computer Engineering. Lecture Notes in Electrical Engineering, vol 6. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-74935-8_36


  • DOI: https://doi.org/10.1007/978-0-387-74935-8_36

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-74934-1

  • Online ISBN: 978-0-387-74935-8

