Characterization of Consonant Sounds Using Features Related to Place of Articulation
Abstract
Speech sounds are classified into five classes based on place and manner of articulation: velar, palatal, retroflex, dental and labial. This paper explores the role of place of articulation and vocal tract length in characterizing these classes of speech sounds. Formants and the vocal tract length available for the production of each class of sound are extracted from the transition region between the consonant burst and the rising profile of the immediately following vowel. These features, along with their statistical variations, are considered for the analysis. Because the features are non-linear in nature, a Random Forest (RF) classifier is used. The results show that the proposed features discriminate effectively between the following pairs of consonant classes: velar and palatal, palatal and retroflex, and palatal and labial, with accuracies of 92.9%, 93.83% and 94.07%, respectively.
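To make the link between formants and vocal tract length concrete, the sketch below estimates an effective tract length from measured formant frequencies. It assumes the textbook uniform-tube (quarter-wavelength) model, in which the n-th resonance of a tube closed at one end is F_n = (2n - 1)c / (4L). This is not necessarily the procedure used in the paper. The function name `estimate_vtl_cm`, the speed-of-sound constant and the example formant values are all illustrative assumptions.

```python
from statistics import mean

# Assumed speed of sound in warm, moist air, in cm/s (illustrative value).
SPEED_OF_SOUND_CM_S = 35000.0


def estimate_vtl_cm(formants_hz, c=SPEED_OF_SOUND_CM_S):
    """Estimate an effective vocal tract length (cm) from formants F1, F2, ...

    Uniform-tube assumption: F_n = (2n - 1) * c / (4 * L), so each formant
    gives one length estimate L_n = (2n - 1) * c / (4 * F_n). The per-formant
    estimates are averaged.
    """
    if not formants_hz:
        raise ValueError("at least one formant frequency is required")
    if any(f <= 0 for f in formants_hz):
        raise ValueError("formant frequencies must be positive")
    estimates = [
        (2 * n - 1) * c / (4.0 * f)
        for n, f in enumerate(formants_hz, start=1)
    ]
    return mean(estimates)


if __name__ == "__main__":
    # Hypothetical formants of a neutral, schwa-like vocal tract shape.
    print(estimate_vtl_cm([500.0, 1500.0, 2500.0]))
```

Under this model, higher formants at the consonant-to-vowel transition imply a shorter effective tract, which is one plausible reason a length estimate could help separate places of articulation.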
Keywords
Formants, Manner of articulation, Place of articulation, Random forest, Vocal tract length