Abstract
Humans use many muscles to produce intelligible speech, including those of the lips, face, and throat. Dysarthria is a speech disorder that arises when these muscles are weak because of brain damage. Its primary characteristics are slurred and slow speech, which can be difficult to understand depending on the severity of the condition. This paper proposes an approach to improving the intelligibility of dysarthric speech using a simple yet effective speech-transformation technique that warps the frequencies of the linear predictive coding (LPC) poles and maps the LPC coefficients. The technique was applied to dysarthric audio from the UA-speech database. Both objective and subjective measures were used to evaluate the transformed speech. The results point towards a significant improvement in the intelligibility of the dysarthric speech. The method could be used to develop voice-enabled search platforms for dysarthric patients and to support their rehabilitation through speech therapy based on auditory feedback.
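As a rough illustration of what warping the frequency of LPC poles can look like, the sketch below estimates LPC coefficients with the autocorrelation method, moves each complex pole to a scaled angle while keeping its radius, and rebuilds the prediction polynomial. The abstract does not give the actual warping function or the coefficient mapping, so the uniform scaling factor `alpha` and the helper names `lpc` and `warp_pole_frequencies` are assumptions made for this example only, not the authors' method.

```python
import numpy as np

def lpc(frame, order):
    """Autocorrelation-method LPC. Returns [1, a1, ..., ap] for A(z)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: R a = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))

def warp_pole_frequencies(a, alpha):
    """Scale the angle of every complex pole of 1/A(z) by alpha (hypothetical
    warping rule), keeping each pole's radius and leaving real poles alone."""
    warped = []
    for p in np.roots(a):
        if abs(p.imag) < 1e-9:
            warped.append(p)  # real pole: no resonance frequency to move
        else:
            # Clip so the warped angle stays within [-pi, pi]
            theta = np.clip(alpha * np.angle(p), -np.pi, np.pi)
            warped.append(abs(p) * np.exp(1j * theta))
    # Conjugate pairs stay conjugate, so the imaginary part is only round-off
    return np.real(np.poly(warped))

# Toy all-pole "vocal tract": one resonance at radius 0.9, angle 0.3*pi
r0, theta0 = 0.9, 0.3 * np.pi
a_true = np.array([1.0, -2 * r0 * np.cos(theta0), r0 ** 2])

# Impulse response of 1/A(z), used here in place of a real speech frame
h = np.zeros(400)
for i in range(len(h)):
    x = 1.0 if i == 0 else 0.0
    h1 = h[i - 1] if i >= 1 else 0.0
    h2 = h[i - 2] if i >= 2 else 0.0
    h[i] = x - a_true[1] * h1 - a_true[2] * h2

a_est = lpc(h, order=2)
a_warped = warp_pole_frequencies(a_est, alpha=1.2)
print("estimated:", a_est)
print("warped pole angles / pi:", np.angle(np.roots(a_warped)) / np.pi)
```

In a real pipeline the same two steps would run frame by frame on windowed speech, and the warped coefficients would drive a synthesis filter excited by the LPC residual. Those stages are omitted here.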
Declaration
We, the authors, declare that this manuscript is original, has not been published before, and is not currently being considered for publication elsewhere.
We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed. We further confirm that the order of authors listed in the manuscript has been approved by all of us.
We further confirm that the UA-speech database used in the work covered in this manuscript was acquired from the protected ISLE data server, with access provided by Mark Hasegawa-Johnson, Professor, ECE, University of Illinois. The audio recordings of human patients were made with the ethical approval of all relevant bodies, and subjects who refused permission are not represented in the database distribution.
We understand that the Corresponding Author is the sole contact for the Editorial process. The Corresponding Author is responsible for communicating with the other authors about progress, submissions of revisions and final approval of proofs. We confirm that we have provided a current, correct email address which is accessible by the Corresponding Author.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Roy, A., Thakur, L., Vyas, G., Raj, G. (2019). Towards Improving the Intelligibility of Dysarthric Speech. In: Wang, J., Reddy, G., Prasad, V., Reddy, V. (eds) Soft Computing and Signal Processing. Advances in Intelligent Systems and Computing, vol 898. Springer, Singapore. https://doi.org/10.1007/978-981-13-3393-4_56
DOI: https://doi.org/10.1007/978-981-13-3393-4_56
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-3392-7
Online ISBN: 978-981-13-3393-4
eBook Packages: Intelligent Technologies and Robotics (R0)