Improving the syllable-synchronous network search algorithm for word decoding in continuous Chinese speech recognition

  • Zheng Fang
  • Wu Jian 
  • Song Zhanjiang 

Abstract

The previously proposed syllable-synchronous network search (SSNS) algorithm plays a very important role in word decoding for continuous Chinese speech recognition and achieves satisfactory performance. This paper carefully studies several key factors that may affect overall word decoding performance: refinement of the vocabulary, big-discount Turing re-estimation of the N-gram probabilities, and management of the search path buffers. Based on this analysis, corresponding approaches to improving the SSNS algorithm are proposed. Compared with the previous version of the SSNS algorithm, the new version reduces the Chinese character error rate (CCER) in word decoding by 42.1% on a database consisting of a large number of test sentences (syllable strings).

Keywords

large-vocabulary continuous Chinese speech recognition; word decoding; syllable-synchronous network search; word segmentation

Copyright information

© Science Press, Beijing China and Allerton Press Inc. 2000

Authors and Affiliations

  1. Center of Speech Technology, State Key Lab of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing, P.R. China
