Abstract
Deep learning tends to outperform hand-crafted features in many domains. This study uses convolutional neural networks to explore the effectiveness of various segments of a speech signal (a text-dependent pronunciation of a short sentence) in the Parkinson’s disease detection task. Besides the common Mel-frequency spectrogram and its first and second derivatives, the inclusion of various other input feature maps is also considered. Image interpolation is investigated as a way to obtain a spectrogram of fixed length. The equal error rate (EER) for individual sentence segments varied from 20.3% to 29.5%. Fusion of decisions from sentence segments achieved an EER of 14.1%, whereas the best result using the full sentence had an EER of 16.8%. Therefore, splitting speech into segments could be recommended for Parkinson’s disease detection.
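Two ideas in the abstract, resizing a spectrogram to a fixed length and scoring a detector by its EER, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `resize_time_axis` and `equal_error_rate` are hypothetical, linear interpolation along the time axis stands in for whichever image interpolation the study used, and the scores are assumed to be higher for Parkinson's disease (label 1) than for controls (label 0).

```python
def resize_time_axis(spec, target_len):
    """Linearly interpolate each frequency row of `spec` to `target_len` frames.

    `spec` is a list of rows (one per frequency band), each a list of frames.
    Assumption: linear interpolation along time only, as a simple stand-in
    for the image interpolation mentioned in the abstract.
    """
    resized = []
    for row in spec:
        n = len(row)
        if n == 1 or target_len == 1:
            # Nothing to interpolate between: repeat the first frame.
            resized.append([row[0]] * target_len)
            continue
        new_row = []
        for j in range(target_len):
            # Position of output frame j on the original time axis.
            pos = j * (n - 1) / (target_len - 1)
            lo = int(pos)
            hi = min(lo + 1, n - 1)
            frac = pos - lo
            new_row.append(row[lo] * (1 - frac) + row[hi] * frac)
        resized.append(new_row)
    return resized


def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping a decision threshold over the scores.

    A recording is flagged as Parkinson's disease when score >= threshold.
    The EER is taken at the threshold where the miss rate and the false
    alarm rate are closest, as the mean of the two rates.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best_gap, eer = float("inf"), None
    # Candidate thresholds: every observed score, plus one above the maximum.
    for t in sorted(set(scores)) + [max(scores) + 1.0]:
        fnr = sum(s < t for s in pos) / len(pos)   # patients missed
        fpr = sum(s >= t for s in neg) / len(neg)  # controls flagged
        gap = abs(fnr - fpr)
        if gap < best_gap:
            best_gap, eer = gap, (fnr + fpr) / 2
    return eer


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    print(resize_time_axis([[0.0, 2.0, 4.0]], 5))
    print(equal_error_rate([0.9, 0.8, 0.3, 0.6, 0.2, 0.1], [1, 1, 1, 0, 0, 0]))
```

On a finite test set the two error rates rarely cross exactly, which is why the sketch reports the mean of the two rates at the closest threshold rather than a single crossing point.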
Acknowledgements
Funding for this work was provided by a grant (No. MIP-075/2015) from the Research Council of Lithuania. The dataset was collected by the Department of Otorhinolaryngology at the Lithuanian University of Health Sciences.
Copyright information
© 2018 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Vaiciukynas, E., Gelzinis, A., Verikas, A., Bacauskiene, M. (2018). Parkinson’s Disease Detection from Speech Using Convolutional Neural Networks. In: Guidi, B., Ricci, L., Calafate, C., Gaggi, O., Marquez-Barja, J. (eds) Smart Objects and Technologies for Social Good. GOODTECHS 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 233. Springer, Cham. https://doi.org/10.1007/978-3-319-76111-4_21
DOI: https://doi.org/10.1007/978-3-319-76111-4_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76110-7
Online ISBN: 978-3-319-76111-4
eBook Packages: Computer Science (R0)