Parkinson’s Disease Detection from Speech Using Convolutional Neural Networks

Vaiciukynas, Evaldas; Gelzinis, Adas; Verikas, Antanas; Bacauskiene, Marija

doi:10.1007/978-3-319-76111-4_21

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 233)

Included in the following conference series:

International Conference on Smart Objects and Technologies for Social Good

Abstract

Deep learning tends to outperform hand-crafted features in many domains. This study uses convolutional neural networks to explore how effective various segments of a speech signal (a text-dependent pronunciation of a short sentence) are in the Parkinson’s disease detection task. Besides the common Mel-frequency spectrogram and its first and second derivatives, the inclusion of various other input feature maps is also considered. Image interpolation is investigated as a way to obtain a spectrogram of fixed length. The equal error rate (EER) for individual sentence segments varied from 20.3% to 29.5%. Fusing decisions from the sentence segments achieved an EER of 14.1%, whereas the best result using the full sentence was an EER of 16.8%. Therefore, splitting speech into segments can be recommended for Parkinson’s disease detection.
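
The abstract names three operations without detail: resizing a spectrogram to a fixed length by interpolation, fusing per-segment decisions, and scoring with the equal error rate. The following Python sketch illustrates one plausible reading of each and is not the authors' implementation. All function names are hypothetical. Linear interpolation along the time axis, score averaging as the fusion rule, and a simple threshold sweep for the EER are assumptions chosen for brevity.

```python
def resize_time_axis(spectrogram, target_frames):
    """Linearly interpolate every frequency row to target_frames columns.

    spectrogram is a list of non-empty rows (one per frequency band), each a
    list of per-frame values. Assumption: a 1-D linear stand-in for the image
    interpolation mentioned in the abstract.
    """
    if target_frames < 1:
        raise ValueError("target_frames must be positive")
    resized = []
    for row in spectrogram:
        n = len(row)
        if n == 1 or target_frames == 1:
            # Nothing to interpolate between: repeat the first value.
            resized.append([row[0]] * target_frames)
            continue
        new_row = []
        for j in range(target_frames):
            # Map the output frame index onto the input time axis.
            pos = j * (n - 1) / (target_frames - 1)
            lo = int(pos)
            hi = min(lo + 1, n - 1)
            frac = pos - lo
            new_row.append(row[lo] * (1 - frac) + row[hi] * frac)
        resized.append(new_row)
    return resized


def fuse_scores(segment_scores):
    """Fuse per-segment detector scores by averaging (assumed fusion rule)."""
    if not segment_scores:
        raise ValueError("need at least one segment score")
    return sum(segment_scores) / len(segment_scores)


def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping thresholds over the observed scores.

    labels: 1 = patient, 0 = healthy control. A recording is flagged as
    positive when its score is >= the threshold. Returns the mean of the
    false acceptance and false rejection rates at the threshold where the
    two rates are closest.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")
    best_gap, best_eer = None, None
    for t in sorted(set(scores)):
        frr = sum(s < t for s in positives) / len(positives)   # missed patients
        far = sum(s >= t for s in negatives) / len(negatives)  # false alarms
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer
```

In this sketch, each recording's segment scores would be passed through fuse_scores first, and equal_error_rate would then be computed over the fused scores of all recordings. With few recordings the threshold sweep gives only a coarse estimate, since the two error rates may never be exactly equal.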

Acknowledgements

Funding for this work was provided by a grant (No. MIP-075/2015) from the Research Council of Lithuania. The dataset was collected by the Department of Otorhinolaryngology at the Lithuanian University of Health Sciences.

Author information

Authors and Affiliations

Department of Electrical Power Systems, Kaunas University of Technology, Studentu 50, 51368, Kaunas, Lithuania
Evaldas Vaiciukynas, Adas Gelzinis, Antanas Verikas & Marija Bacauskiene
Department of Information Systems, Kaunas University of Technology, Studentu 50, 51368, Kaunas, Lithuania
Evaldas Vaiciukynas
Centre for Applied Intelligent Systems Research, Halmstad University, Kristian IV:s väg 3, PO Box 823, 30118, Halmstad, Sweden
Antanas Verikas

Authors

Evaldas Vaiciukynas
Adas Gelzinis
Antanas Verikas
Marija Bacauskiene

Corresponding author

Correspondence to Evaldas Vaiciukynas.

Editor information

Editors and Affiliations

University of Pisa, Pisa, Italy
Barbara Guidi
University of Pisa, Pisa, Italy
Laura Ricci
Polytechnic University of Valencia, Valencia, Spain
Carlos Calafate
University of Padua, Padua, Italy
Ombretta Gaggi
University of Antwerp, Antwerp, Belgium
Johann Marquez-Barja

About this paper

Cite this paper

Vaiciukynas, E., Gelzinis, A., Verikas, A., Bacauskiene, M. (2018). Parkinson’s Disease Detection from Speech Using Convolutional Neural Networks. In: Guidi, B., Ricci, L., Calafate, C., Gaggi, O., Marquez-Barja, J. (eds) Smart Objects and Technologies for Social Good. GOODTECHS 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 233. Springer, Cham. https://doi.org/10.1007/978-3-319-76111-4_21

DOI: https://doi.org/10.1007/978-3-319-76111-4_21
Published: 17 February 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76110-7
Online ISBN: 978-3-319-76111-4
eBook Packages: Computer Science, Computer Science (R0)
