Abstract
Speech recognition has drawn extensive attention for identity verification, human-machine interactive control, security appliances, and similar applications, but the recognition rate for spoken text depends strongly on the pronunciation patterns of each language. To recognize Chinese text, which is syllabic in nature and suffers from a severe word-boundary uncertainty problem, a combined algorithm based on the Linde–Buzo–Gray (LBG) method has been established to perform vector quantization (VQ) on Chinese speech text processed into Mel-frequency cepstral coefficients (MFCC). Simulation indicates that the proposed algorithm effectively constructs a codebook for each solicited Chinese text word under the minimum average distortion criterion, and that speech recognition is achieved by comparing the unknown speech signal against the codebooks. The results also show that the codebook template dimension plays an important role in the recognition rate, with 32 dimensions as the optimal size, giving a higher recognition rate while occupying few computational resources. The proposed algorithm can be integrated into Chinese speech analysis software to achieve voice signal recognition.
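The pipeline summarized in the abstract (train one LBG codebook per word, then pick the word whose codebook gives the lowest average distortion) can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's implementation: the squared Euclidean distortion, the additive splitting perturbation `eps`, the stopping tolerance `tol`, and the helper `synthetic_frames` (which stands in for real MFCC frames with random 2-D points) are all hypothetical choices made for the example. The toy codebook size of 4 is for demonstration only and is unrelated to the optimum of 32 reported above.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, codebook):
    """Return (index, squared distance) of the codeword closest to vec."""
    best = min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))
    return best, sq_dist(vec, codebook[best])

def avg_distortion(vectors, codebook):
    """Average distortion of the vectors quantized with the codebook."""
    return sum(nearest(v, codebook)[1] for v in vectors) / len(vectors)

def centroid(cluster):
    n = len(cluster)
    return [sum(col) / n for col in zip(*cluster)]

def lbg(vectors, size, eps=0.01, tol=1e-4, max_iter=50):
    """Train a codebook by LBG binary splitting; size should be a power of two."""
    codebook = [centroid(vectors)]
    while len(codebook) < size:
        # Split every codeword into a slightly perturbed pair.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        prev = float("inf")
        for _ in range(max_iter):
            # Assign each vector to its nearest codeword.
            clusters = [[] for _ in codebook]
            for v in vectors:
                clusters[nearest(v, codebook)[0]].append(v)
            # Move each codeword to its cluster centroid (keep it if the cluster is empty).
            codebook = [centroid(cl) if cl else cw
                        for cl, cw in zip(clusters, codebook)]
            dist = avg_distortion(vectors, codebook)
            if prev - dist <= tol * max(dist, 1e-12):
                break
            prev = dist
    return codebook

def recognize(vectors, codebooks):
    """Return the label whose codebook gives the minimum average distortion."""
    return min(codebooks, key=lambda w: avg_distortion(vectors, codebooks[w]))

def synthetic_frames(centers, n_per_center, rng, spread=0.5):
    """Hypothetical stand-in for MFCC frames: Gaussian points around each center."""
    return [[rng.gauss(m, spread) for m in c]
            for c in centers for _ in range(n_per_center)]

if __name__ == "__main__":
    rng = random.Random(0)
    train = {
        "word_a": synthetic_frames([(0, 0), (5, 5)], 40, rng),
        "word_b": synthetic_frames([(10, -10), (-10, 10)], 40, rng),
    }
    codebooks = {word: lbg(frames, 4) for word, frames in train.items()}
    unknown = synthetic_frames([(0, 0), (5, 5)], 10, rng)
    print(recognize(unknown, codebooks))
```

In a real system the frames would come from an MFCC front end applied to recorded speech, and the codebook size would be tuned against recognition rate and computational cost, which is the trade-off the abstract reports.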
Copyright information
© 2012 Springer-Verlag GmbH Berlin Heidelberg
About this chapter
Cite this chapter
Wang, X., Lai, W. (2012). Chinese Text Speech Recognition Derived from VQ-LBG Algorithm. In: Luo, J. (eds) Affective Computing and Intelligent Interaction. Advances in Intelligent and Soft Computing, vol 137. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27866-2_37
DOI: https://doi.org/10.1007/978-3-642-27866-2_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27865-5
Online ISBN: 978-3-642-27866-2
eBook Packages: Engineering, Engineering (R0)