Convolutional Neural Networks for Computer Aided Diagnosis of Interdental and Rustling Sigmatism

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1011)

Abstract

Sigmatism (lisping) is the misarticulation of sibilant sounds. Multiple classes of sigmatism exist, and the treatment for each type differs. An automatic classifier, used under the supervision of a speech therapist, may improve therapeutic options. A database containing 1188 multichannel recordings of children diagnosed as having normative pronunciation, interdental sigmatism, or rustling sigmatism was used to create visual representations of the spectrum, mel-filter bank energies (FBE), and mel-frequency cepstral coefficients (MFCC). These images were used to train a convolutional neural network. Using FBE images, the network achieved a binary accuracy of 97.67% in distinguishing normative pronunciation from either of the analyzed types of sigmatism.
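
The feature images named in the abstract can be illustrated with a minimal NumPy sketch. It computes log mel-filter bank energies from a single-channel signal, derives MFCCs from them with a DCT-II, and scales a feature matrix to a grayscale image. The abstract does not state the paper's frame length, hop size, number of filters, or image format, so every parameter value and function name below is an assumption chosen for illustration, not the authors' configuration.

```python
import numpy as np


def hz_to_mel(f):
    """Convert frequency in Hz to the mel scale (common 2595*log10 variant)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced in mel, shape (n_filters, n_fft // 2 + 1)."""
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        # A triangle is the lower of the two slopes, floored at zero.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def log_fbe(signal, sample_rate, n_fft=512, hop=128, n_filters=26):
    """Log mel-filter bank energies, shape (n_frames, n_filters).

    n_fft, hop, and n_filters are illustrative defaults (assumptions).
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return np.log(energies + 1e-10)  # small offset avoids log(0)


def mfcc(fbe, n_coeffs=13):
    """MFCCs as an unnormalized DCT-II of the log FBE along the filter axis."""
    n_filters = fbe.shape[1]
    k = np.arange(n_coeffs)[:, None]
    n = np.arange(n_filters)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return fbe @ basis.T


def to_image(features):
    """Scale a (frames, bands) matrix to uint8 grayscale with bands on the vertical axis."""
    lo, hi = features.min(), features.max()
    scaled = (features - lo) / (hi - lo + 1e-12)
    return (255 * scaled).astype(np.uint8).T


# Usage with a synthetic 1 s tone standing in for a recording (hypothetical data).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 5000.0 * t)
fbe = log_fbe(tone, sample_rate)
fbe_image = to_image(fbe)
mfcc_image = to_image(mfcc(fbe))
```

An image produced this way has one row per mel band (or cepstral coefficient) and one column per frame, which is the kind of two-dimensional input a convolutional network can consume. For multichannel recordings, one plausible approach is to compute such a matrix per channel; how the paper combines channels is not stated in the abstract.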

Author information

Corresponding author

Correspondence to Michal Krecichwost.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Woloshuk, A., Krecichwost, M., Miodonska, Z., Korona, D., Badura, P. (2019). Convolutional Neural Networks for Computer Aided Diagnosis of Interdental and Rustling Sigmatism. In: Pietka, E., Badura, P., Kawa, J., Wieclawek, W. (eds) Information Technology in Biomedicine. ITIB 2019. Advances in Intelligent Systems and Computing, vol 1011. Springer, Cham. https://doi.org/10.1007/978-3-030-23762-2_16
