Abstract
A framework for semi-automated detection of selected types of sigmatism is presented in this paper. A database of speech recordings was collected containing the sibilant /s/ surrounded by vowels in different articulation phases. Recordings of three pronunciation modes were included in the database: normal, simulated lateral sigmatism, and simulated interdental sigmatism. The data was collected under the supervision of a speech therapy expert, who also labelled and annotated each database entry. Twenty-eight features of four types were extracted from each time frame within the sibilant: mel-frequency cepstral coefficients, filter bank energies, spectral brightness, and zero-crossing rate. A feature aggregation procedure that weights the influence of the time frame location was proposed to describe each phoneme by a single feature vector. At the three-class classification stage, two tools were employed and compared: the random forest and the support vector machine. The latter provides more accurate and repeatable classification results in each articulation phase, with median sensitivity, specificity, and accuracy exceeding 0.71, 0.85, and 0.80, respectively. The results also show that the assessment is generally more efficient when the phoneme is located at the beginning or end of the word than in the middle.
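The aggregation and evaluation steps mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting scheme or the exact metric definitions, so everything here is an assumption: the Hann-shaped weights that favour central frames, the one-vs-rest reading of sensitivity, specificity, and accuracy, and the function names `aggregate_frames` and `one_vs_rest_metrics` are all hypothetical and should not be read as the authors' method.

```python
import numpy as np


def aggregate_frames(frames: np.ndarray) -> np.ndarray:
    """Collapse a (n_frames, n_features) matrix into one feature vector.

    Assumption: frames near the middle of the sibilant receive larger
    weights (Hann-shaped). The paper's actual weighting is not stated
    in the abstract.
    """
    n_frames = frames.shape[0]
    # Dropping the two zero endpoints keeps every frame's weight positive.
    weights = np.hanning(n_frames + 2)[1:-1]
    weights = weights / weights.sum()
    return weights @ frames


def one_vs_rest_metrics(y_true, y_pred, n_classes=3):
    """Per-class sensitivity, specificity, and accuracy (one-vs-rest)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1  # rows: true class, columns: predicted class
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp
    fp = cm.sum(axis=0) - tp
    tn = cm.sum() - tp - fn - fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / cm.sum()
    return sensitivity, specificity, accuracy


# Hypothetical data: 10 frames of 28 features for a single /s/ realisation.
rng = np.random.default_rng(0)
phoneme_vector = aggregate_frames(rng.normal(size=(10, 28)))
print(phoneme_vector.shape)  # one 28-element vector per phoneme
```

The resulting per-phoneme vectors could then be passed to any three-class classifier, such as a random forest or a support vector machine, and the predictions scored with `one_vs_rest_metrics`.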
Acknowledgements
This research was supported by the Polish Ministry of Science and Silesian University of Technology statutory financial support No. BK-209/RIB1/2018.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Kręcichwost, M., Rasztabiga, P., Woloshuk, A., Badura, P., Miodońska, Z. (2019). Approach for Spectral Analysis in Detection of Selected Pronunciation Pathologies. In: Tkacz, E., Gzik, M., Paszenda, Z., Piętka, E. (eds) Innovations in Biomedical Engineering. IBE 2018. Advances in Intelligent Systems and Computing, vol 925. Springer, Cham. https://doi.org/10.1007/978-3-030-15472-1_13
DOI: https://doi.org/10.1007/978-3-030-15472-1_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-15471-4
Online ISBN: 978-3-030-15472-1
eBook Packages: Intelligent Technologies and Robotics (R0)