Reconstructed Phase Space and Convolutional Neural Networks for Classifying Voice Pathologies

  • João Vilian de Moraes Lima Marinus
  • Joseana Macedo Fechine Regis de Araújo
  • Herman Martins Gomes
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11401)

Abstract

In this paper, we present a new method for classifying voice pathologies. Reconstructed Phase Space (RPS) images are employed to represent the nonlinear dynamics of the signals, and a Convolutional Neural Network (CNN) is designed to automatically learn spatial features and a classification decision from the RPS images. Due to the large parameter space of the CNN, we augmented the Massachusetts Eye and Ear Infirmary (MEEI) database with synthetic training data obtained by slowing down or speeding up the audio signal. The proposed method was evaluated in the pairwise classification of 5 voice pathologies: paralysis, edema, nodule, polyp and keratosis. Experiments were also carried out on a broader pathology class, called benign lesion, consisting of nodule, polyp and cyst signals. Accuracies similar to state-of-the-art approaches support the relevance of the method. Best accuracy was achieved in the polyp vs. nodule classification. Data augmentation was beneficial to most of the classification experiments.
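The two signal-level steps named in the abstract, building an RPS image and augmenting by changing playback speed, can be illustrated with a minimal sketch. This is not the authors' implementation. The embedding delay `tau`, the 64×64 image size, the histogram-based rasterization, the use of linear interpolation for the speed change, and the function names `rps_image` and `change_speed` are all assumptions made for illustration. The CNN itself is omitted.

```python
import numpy as np


def rps_image(signal, tau=8, size=64):
    """Rasterize the 2-D time-delay embedding (x[n], x[n + tau]) into an image.

    tau and size are illustrative values, not taken from the paper.
    """
    x = np.asarray(signal, dtype=float)
    if tau < 1 or x.size <= tau:
        raise ValueError("tau must be >= 1 and shorter than the signal")
    # Normalize amplitude to [-1, 1] so the trajectory fills the image.
    peak = np.max(np.abs(x))
    if peak > 0:
        x = x / peak
    current, delayed = x[:-tau], x[tau:]
    # Count how often the trajectory visits each pixel of the phase plane.
    img, _, _ = np.histogram2d(
        current, delayed, bins=size, range=[[-1, 1], [-1, 1]]
    )
    # Scale counts to [0, 1] so images from different recordings are comparable.
    if img.max() > 0:
        img = img / img.max()
    return img


def change_speed(signal, rate):
    """Resample by linear interpolation (an assumed, simple speed change).

    rate > 1 speeds the signal up (fewer samples); rate < 1 slows it down.
    """
    x = np.asarray(signal, dtype=float)
    n_out = max(2, int(round(x.size / rate)))
    new_positions = np.linspace(0, x.size - 1, n_out)
    return np.interp(new_positions, np.arange(x.size), x)


# Usage with a synthetic two-harmonic tone standing in for a sustained vowel.
fs = 16000  # hypothetical sampling rate
t = np.arange(fs) / fs
voice = np.sin(2 * np.pi * 150 * t) + 0.3 * np.sin(2 * np.pi * 300 * t)
images = [rps_image(change_speed(voice, r)) for r in (0.9, 1.0, 1.1)]
```

Each entry of `images` is a 64×64 array that could be fed to an image classifier. Note that resampling in this way changes pitch along with duration, so the RPS trajectory of each augmented copy differs from that of the original recording.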

Keywords

Reconstruction Phase Space; Convolutional Neural Network; Voice pathology

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • João Vilian de Moraes Lima Marinus (1), email author
  • Joseana Macedo Fechine Regis de Araújo (2)
  • Herman Martins Gomes (2)
  1. Federal Institute of Alagoas, Batalha, Brazil
  2. Federal University of Campina Grande, Campina Grande, Brazil
