Chinese Text Speech Recognition Derived from VQ-LBG Algorithm

Part of the book series: Advances in Intelligent and Soft Computing (AINSC, volume 137)

Abstract

Speech recognition has drawn extensive attention for identity verification, man-machine interactive control, security appliances, and similar applications, but the recognition rate for spoken text is strongly affected by the pronunciation patterns of different languages. To recognize Chinese text, which is syllabic in nature and suffers from severe word-boundary uncertainty, a combined algorithm based on the Linde–Buzo–Gray (LBG) method has been established to perform vector quantization (VQ) on MFCC-processed Chinese speech. Simulation indicates that the proposed algorithm effectively constructs a codebook for each selected Chinese text word under the minimum average distortion criterion, and that speech recognition is achieved by comparing the unknown speech signal against these codebooks. Results also show that the codebook template dimension plays an important role in the recognition rate, with 32 dimensions as the optimal size, giving a higher recognition rate while occupying few computational resources. The proposed algorithm can be integrated into Chinese speech analysis software to achieve voice signal recognition.
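The pipeline described in the abstract (one LBG-trained codebook per word, then recognition by minimum average distortion) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, MFCC extraction is assumed to have been done elsewhere so that each utterance arrives as an array of feature vectors, and the squared Euclidean distance and the additive splitting perturbation are assumptions chosen for illustration.

```python
import numpy as np

def nearest_codeword(vectors, codebook):
    """Return (index of nearest codeword, squared distance) for each vector."""
    # Pairwise squared Euclidean distances, shape (n_vectors, n_codewords).
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    idx = d.argmin(axis=1)
    return idx, d[np.arange(len(vectors)), idx]

def lbg_codebook(vectors, size=32, eps=0.01, tol=1e-4, max_iter=100):
    """Train a codebook by LBG binary splitting (size should be a power of two)."""
    codebook = vectors.mean(axis=0, keepdims=True)  # start from the global centroid
    delta = eps * vectors.std(axis=0)               # small per-dimension perturbation
    while len(codebook) < size:
        # Split every codeword into a perturbed pair, doubling the codebook.
        codebook = np.vstack([codebook + delta, codebook - delta])
        prev = np.inf
        for _ in range(max_iter):
            idx, dist = nearest_codeword(vectors, codebook)
            # Move each codeword to the centroid of its cell; empty cells stay put.
            for k in range(len(codebook)):
                members = vectors[idx == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
            avg = dist.mean()  # average distortion before this update
            if prev - avg <= tol * max(avg, 1e-12):
                break
            prev = avg
    return codebook

def average_distortion(vectors, codebook):
    """Mean squared distance from each vector to its nearest codeword."""
    return nearest_codeword(vectors, codebook)[1].mean()

def recognize(vectors, codebooks):
    """Return the label of the codebook with minimum average distortion."""
    return min(codebooks, key=lambda w: average_distortion(vectors, codebooks[w]))
```

Here `codebooks` is assumed to be a dictionary mapping each word label to the array returned by `lbg_codebook` for that word's training features; `recognize` then scores an unknown utterance against every entry and returns the best-matching label.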

Copyright information

© 2012 Springer-Verlag GmbH Berlin Heidelberg

About this chapter

Cite this chapter

Wang, X., Lai, W. (2012). Chinese Text Speech Recognition Derived from VQ-LBG Algorithm. In: Luo, J. (eds) Affective Computing and Intelligent Interaction. Advances in Intelligent and Soft Computing, vol 137. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27866-2_37

  • DOI: https://doi.org/10.1007/978-3-642-27866-2_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-27865-5

  • Online ISBN: 978-3-642-27866-2

  • eBook Packages: Engineering, Engineering (R0)
