Towards Improving the Intelligibility of Dysarthric Speech
Humans use many muscles to produce intelligible speech, including those of the lips, face and throat. Dysarthria is a speech disorder that arises when these muscles are weak as a result of brain damage. The primary characteristics of dysarthric speech are slurring and slowness, which can make it difficult to understand, depending on the severity of the condition. This paper proposes an approach to improve the intelligibility of dysarthric speech using a simple yet effective speech-transformation technique that warps the frequencies of linear predictive coding (LPC) poles and maps LPC coefficients. The technique was applied to dysarthric audio from the UA-speech database. Both objective and subjective measures are used to evaluate the transformed speech. The results point to a significant improvement in the intelligibility of the dysarthric speech. The method can be used to develop special voice-enabled search platforms for dysarthric patients and to support their rehabilitation through speech therapy based on auditory feedback.
Keywords: Dysarthria, LPC, Speech intelligibility, Frequency warping, Speech therapy, Speech enhancement, Auditory feedback
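The abstract names frequency warping of LPC poles but does not give the warping function, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a simple linear scaling of each pole angle by a factor `alpha`; the function names `lpc_coefficients` and `warp_pole_frequencies`, the parameter `alpha`, and the choice of the autocorrelation method for LPC analysis are all illustrative assumptions.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Estimate LPC coefficients [1, a1, ..., ap] with the autocorrelation method."""
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    # Toeplitz normal equations: R a = -r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))


def warp_pole_frequencies(a, alpha):
    """Scale the angle of every complex LPC pole by alpha (assumed linear warp).

    Pole radii are left unchanged, so a stable filter stays stable.
    Purely real poles are not moved, which keeps the polynomial real.
    """
    poles = np.roots(a)
    is_complex = np.abs(poles.imag) > 1e-9
    angles = np.angle(poles)
    # Conjugate pairs have angles +theta and -theta, so scaling and
    # symmetric clipping keep them conjugate.
    warped_angles = np.where(
        is_complex, np.clip(alpha * angles, -np.pi, np.pi), angles
    )
    warped_poles = np.abs(poles) * np.exp(1j * warped_angles)
    # Rebuild the denominator polynomial; drop numerical imaginary residue.
    return np.real(np.poly(warped_poles))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(256)
    # Synthetic stand-in for one windowed speech frame (not UA-speech data).
    frame = np.hamming(256) * (
        np.sin(2 * np.pi * 0.05 * t) + 0.01 * rng.standard_normal(256)
    )
    a = lpc_coefficients(frame, order=8)
    a_warped = warp_pole_frequencies(a, alpha=1.1)
    print("original:", np.round(a, 3))
    print("warped:  ", np.round(a_warped, 3))
```

In a full pipeline, the warped coefficients would define the all-pole synthesis filter for each frame, excited by the residual of the original frame. The coefficient-mapping step mentioned in the abstract is not covered by this sketch.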
We, the authors, declare that this manuscript is original, has not been published before, and is not currently being considered for publication elsewhere.
We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed. We further confirm that the order of authors listed in the manuscript has been approved by all of us.
We further confirm that the UA-speech database used in the work covered in this manuscript was acquired from the protected ISLE data server, with access provided by Mark Hasegawa-Johnson, Professor of ECE, University of Illinois. The audio recordings of human patients were made with the ethical approval of all relevant bodies, and subjects who refused permission are not represented in the database distribution.
We understand that the Corresponding Author is the sole contact for the Editorial process. The Corresponding Author is responsible for communicating with the other authors about progress, submissions of revisions and final approval of proofs. We confirm that we have provided a current, correct email address which is accessible by the Corresponding Author.