Phonological Posteriors and GRU Recurrent Units to Assess Speech Impairments of Patients with Parkinson’s Disease

  • Juan Camilo Vásquez-Correa
  • Nicanor Garcia-Ospina
  • Juan Rafael Orozco-Arroyave
  • Milos Cernak
  • Elmar Nöth
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11107)


Parkinson’s disease is a neurodegenerative disorder characterized by a variety of motor symptoms, including several impairments in the speech production process. Recent studies show that deep learning models are highly accurate in assessing the speech deficits of patients; however, most architectures consider static features computed from a complete utterance. Such an approach is not suitable for modeling the dynamics of the speech signal as patients pronounce different sounds. Phonological features can be used to characterize the voice quality of speech, which is highly impaired in patients suffering from Parkinson’s disease. This study proposes a deep architecture based on recurrent neural networks with gated recurrent units, combined with phonological posteriors, to assess the speech deficits of Parkinson’s patients. The aim is to model the time dependence of consecutive phonological posteriors, which follow the sound patterns of the English phonological model. The results show that the proposed approach assesses the speech deficits of the patients more accurately than a baseline based on standard acoustic features.
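
To make the idea of modeling consecutive phonological posteriors with gated recurrent units more concrete, the following is a minimal sketch of a single GRU cell run over a sequence of posterior vectors. It is not the authors' implementation: the abstract does not specify the number of phonological classes, the number of layers or units, or the training procedure, so `NUM_CLASSES`, `NUM_FRAMES`, `HIDDEN_DIM`, the random weights, and the random "posteriors" are all hypothetical placeholders. The update rule uses the convention h_t = (1 - z) * h_{t-1} + z * h_candidate; some references swap the roles of z and 1 - z.

```python
import math
import random

# Hypothetical sizes, chosen only for illustration (not taken from the paper).
NUM_CLASSES = 4   # posterior values per frame
NUM_FRAMES = 6    # frames in the toy utterance
HIDDEN_DIM = 3    # size of the GRU hidden state


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Forward pass of one gated recurrent unit layer (no training)."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One (W, U, b) triple per gate: update z, reset r, candidate h.
        self.params = {
            name: (init_matrix(hidden_dim, input_dim, rng),
                   init_matrix(hidden_dim, hidden_dim, rng),
                   [0.0] * hidden_dim)
            for name in ("z", "r", "h")
        }

    def _affine(self, name, x, h):
        W, U, b = self.params[name]
        return [a + c + d for a, c, d in zip(matvec(W, x), matvec(U, h), b)]

    def step(self, x, h_prev):
        z = [sigmoid(a) for a in self._affine("z", x, h_prev)]   # update gate
        r = [sigmoid(a) for a in self._affine("r", x, h_prev)]   # reset gate
        gated = [ri * hi for ri, hi in zip(r, h_prev)]
        h_cand = [math.tanh(a) for a in self._affine("h", x, gated)]
        # Blend the previous state with the candidate state.
        return [(1 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]

    def run(self, sequence):
        """Consume the frames in order and return the final hidden state."""
        h = [0.0] * self.hidden_dim
        for x in sequence:
            h = self.step(x, h)
        return h


# Toy "posteriors": one vector of values in [0, 1) per frame.
rng = random.Random(1)
posteriors = [[rng.random() for _ in range(NUM_CLASSES)] for _ in range(NUM_FRAMES)]

cell = GRUCell(NUM_CLASSES, HIDDEN_DIM)
summary = cell.run(posteriors)
print(summary)  # fixed-length summary of the whole frame sequence
```

The final hidden state has a fixed length regardless of how many frames the utterance contains, which is what would allow a downstream classifier or regressor to operate on utterances of different durations. In practice one would use a trained layer from a deep learning library rather than this hand-written forward pass.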


Keywords: Parkinson’s disease · Dysarthria assessment · Phonological posteriors · Gated recurrent units · Recurrent neural network



The work reported here was financed by CODI at the University of Antioquia under grant number 2015–7683. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No. 766287.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Juan Camilo Vásquez-Correa (1, 2), corresponding author
  • Nicanor Garcia-Ospina (1)
  • Juan Rafael Orozco-Arroyave (1, 2)
  • Milos Cernak (3)
  • Elmar Nöth (2)

  1. Faculty of Engineering, University of Antioquia UdeA, Medellín, Colombia
  2. University of Erlangen-Nuremberg, Erlangen, Germany
  3. Logitech, Lausanne, Switzerland
