A Study of Knowledge-Based Features for Obstruent Detection and Classification in Continuous Mandarin Speech

Sung, Kuang-Ting; Wang, Hsiao-Chuan

doi:10.1007/11939993_14

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4274)

Included in the following conference series:

International Symposium on Chinese Spoken Language Processing

Abstract

This paper studies acoustic-phonetic features for obstruent detection and classification, drawing on knowledge of Mandarin speech. The Seneff auditory model serves as the front-end processor for extracting the acoustic-phonetic features. These features are information-rich and are used in a hierarchical decision process to detect and classify Mandarin obstruents. Preliminary experiments show an obstruent detection accuracy of about 84%. An algorithm based on feature distribution information then classifies the detected obstruents into stops, fricatives, and affricates, with an average classification accuracy of about 80%. The proposed feature-distribution approach is simple and effective, and it is a promising method for improving phone detection in continuous speech recognition.

References

Lee, C.-H.: From Knowledge-Ignorant to Knowledge-Rich Modeling: A New Speech Research Paradigm for Next Generation Automatic Speech Recognition. In: International Conference on Spoken Language Processing, ICSLP 2004, Plenary Session, Jeju, Korea (2004)
Stevens, K.N.: Toward a model for lexical access based on acoustic landmarks and distinctive features. J. Acoust. Soc. Am. 111(4), 1872–1891 (2002)
Seneff, S.: A Joint Synchrony/Mean Rate Model of Auditory Speech Processing. J. Phonetics 16, 55–76 (1988)
Seneff, S.: A Computational Model for the Peripheral Auditory System: Application to Speech Recognition Research. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 1983–1986 (1986)
Abdelatty Ali, A.M.: Auditory-Based Speech Processing Based on the Average Localized Synchrony Detection. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (2000)
Abdelatty Ali, A.M.: Segmentation and Categorization of Phonemes in Continuous Speech. Technical Report, TRCST25JUL98, Center for Sensor Technologies, University of Pennsylvania (1998)
Aversano, G.: A New Text-Independent Method for Phoneme Segmentation. In: IEEE International Conference on Circuit and System (2001)
Hongtao, H.: Temporal pre-classification for Chinese voiceless consonant speech. In: IEEE International Conference on Signal Processing (1996)
Abdelatty Ali, A.M.: An Acoustic-Phonetic Feature-Based System for the Automatic Recognition of Fricative Consonants. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (1988)

Author information

Authors and Affiliations

Department of Electrical Engineering, National Tsing Hua University, Hsinchu
Kuang-Ting Sung & Hsiao-Chuan Wang

Editor information

Editors and Affiliations

Department of Computer Science, The University of Hong Kong, Hong Kong
Qiang Huo
Human Language Technology Department, Institute for Infocomm Research (I2R), 119613, Singapore
Bin Ma
School of Computer Engineering, Nanyang Technological University (NTU), 639798, Singapore
Eng-Siong Chng
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore
Haizhou Li

About this paper

Cite this paper

Sung, K.-T., Wang, H.-C. (2006). A Study of Knowledge-Based Features for Obstruent Detection and Classification in Continuous Mandarin Speech. In: Huo, Q., Ma, B., Chng, E.-S., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_14

DOI: https://doi.org/10.1007/11939993_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3
eBook Packages: Computer Science, Computer Science (R0)
