Vowel recognition for speaker independent Chinese speech recognition

  • Learning and Machine Vision
  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1342)

Abstract

Vowel recognition is essential in Chinese speech recognition, especially in speaker-independent tasks. In this paper, the authors argue that fixed-length frame segmentation of the speech signal causes the feature extraction process to lose essential features and to introduce irrelevant information, so the extracted features may be less expressive and less consistent. Using pitch-based, dynamically adaptive frames improves feature extraction, so that the features are more expressive of the phonemes to be recognized and more consistent across speakers. The algorithm for dynamically segmenting the speech signal is discussed, and a variety of features are tested with the pitch-based adaptive frames. A new type of feature, the FFT magnitude pattern, proves very expressive and consistent and may help to simplify the recognition models. With FFT magnitude patterns, definite algorithms can be adopted at the recognition stage, which simplifies the calculation and speeds up the process. The experiment uses a finite-state machine model. The results show that pitch-based FFT magnitude patterns are more expressive and consistent than the other features and are suitable for speaker-independent Chinese speech recognition tasks.
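
The abstract does not give the segmentation algorithm or the exact definition of the FFT magnitude pattern, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simple autocorrelation pitch estimate, frames that span a whole number of pitch periods, and a direct DFT (used here instead of an FFT because pitch-based frame lengths are rarely powers of two). All function names and parameters (`estimate_pitch_period`, `pitch_synchronous_frames`, `dft_magnitudes`, `periods_per_frame`) are hypothetical.

```python
import cmath
import math


def estimate_pitch_period(signal, min_lag, max_lag):
    """Return the lag (in samples) with the highest mean autocorrelation.

    Assumption: a plain autocorrelation search stands in for whatever
    pitch detector the paper actually uses.
    """
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        terms = len(signal) - lag
        # Normalize by the number of terms so short lags are not favored.
        score = sum(signal[i] * signal[i + lag] for i in range(terms)) / terms
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def pitch_synchronous_frames(signal, period, periods_per_frame=2):
    """Cut the signal into frames that each span whole pitch periods."""
    frame_len = period * periods_per_frame
    last_start = len(signal) - frame_len
    return [signal[s:s + frame_len] for s in range(0, last_start + 1, frame_len)]


def dft_magnitudes(frame):
    """Return normalized DFT magnitudes for bins 0..n/2 of one frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame))) / n
        for k in range(n // 2 + 1)
    ]


# Usage on a synthetic "vowel": a 100 Hz fundamental plus a weaker
# second harmonic, sampled at 8 kHz (illustrative values only).
sample_rate = 8000
signal = [
    math.sin(2 * math.pi * 100 * i / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 200 * i / sample_rate)
    for i in range(400)
]
period = estimate_pitch_period(signal, min_lag=40, max_lag=120)
frames = pitch_synchronous_frames(signal, period)
pattern = dft_magnitudes(frames[0])
```

Because each frame covers exactly `periods_per_frame` pitch periods, the h-th harmonic falls on bin `h * periods_per_frame` regardless of the speaker's pitch. In this synthetic case the energy sits in bins 2 and 4 and the other bins are essentially zero. This is one plausible reading of why pitch-based frames could yield more consistent magnitude patterns than fixed-length frames, where harmonics leak across neighboring bins.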

This work was supported by the Trans-Century Training Programme Foundation for the Talents by the State Education Commission, China.

Editor information

Abdul Sattar

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Baolin, Y., Tao, Y. (1997). Vowel recognition for speaker independent Chinese speech recognition. In: Sattar, A. (eds) Advanced Topics in Artificial Intelligence. AI 1997. Lecture Notes in Computer Science, vol 1342. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63797-4_83

  • DOI: https://doi.org/10.1007/3-540-63797-4_83

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63797-4

  • Online ISBN: 978-3-540-69649-0

  • eBook Packages: Springer Book Archive
