Towards Improving the Intelligibility of Dysarthric Speech

Roy, Arpan; Thakur, Lakshya; Vyas, Garima; Raj, Gaurav

doi:10.1007/978-981-13-3393-4_56

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 898)

Abstract

Humans use many muscles to produce intelligible speech, including those of the lips, face and throat. Dysarthria is a speech disorder that arises when these muscles are weak owing to brain damage. Its primary characteristics are slurred and slow speech, which can be difficult to understand depending on the severity of the condition. This paper proposes an approach to improving the intelligibility of dysarthric speech using a simple yet effective speech-transformation technique that involves warping the frequency of LPC poles and mapping linear predictive coding (LPC) coefficients. The technique was applied to dysarthric audio from the UA-speech database. Both objective and subjective measures were used to evaluate the transformed speech, and the results pointed towards a significant improvement in intelligibility. The method could be used to develop special voice-enabled search platforms for dysarthric patients and to support their rehabilitation through speech therapy based on auditory feedback.
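
To make the idea of warping LPC pole frequencies concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes an autocorrelation-method LPC analysis and a simple linear scaling of pole angles by a hypothetical factor `alpha`; the abstract does not specify the warping function, the LPC order, or how the coefficient mapping is learned, so the function names and parameters here are illustrative only. The coefficient-mapping step is not covered.

```python
import numpy as np


def lpc(frame, order):
    """Estimate LPC coefficients [1, a1, ..., ap] for one frame using the
    autocorrelation method and the Levinson-Durbin recursion."""
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a


def warp_pole_frequencies(a, alpha):
    """Scale the angle (frequency) of each complex LPC pole by `alpha`,
    keeping its radius (bandwidth) unchanged. Real poles are left as they
    are so that the warped polynomial keeps real coefficients.

    `alpha` is a hypothetical linear warping factor chosen for illustration.
    """
    poles = np.roots(a)
    radius = np.abs(poles)
    angle = np.angle(poles)
    is_complex = np.abs(poles.imag) > 1e-8
    # Conjugate pairs stay conjugate because both angles are scaled equally.
    new_angle = np.where(is_complex, np.clip(angle * alpha, -np.pi, np.pi), angle)
    warped = radius * np.exp(1j * new_angle)
    return np.real(np.poly(warped))


if __name__ == "__main__":
    # Synthetic all-pole filter with one resonance at 0.5 rad/sample.
    a_true = np.real(np.poly(0.9 * np.exp(1j * np.array([0.5, -0.5]))))
    a_warped = warp_pole_frequencies(a_true, alpha=1.2)
    print("original pole angles:", np.angle(np.roots(a_true)))
    print("warped pole angles:  ", np.angle(np.roots(a_warped)))
```

In a full system the warped coefficients would be used to re-filter the LPC residual of each frame; scaling only the pole angle shifts a resonance without changing its bandwidth, which is why the radius is kept fixed in this sketch.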

Declaration

We the authors declare that this manuscript is original, has not been published before and is not currently being considered for publication elsewhere.

We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed. We further confirm that the order of authors listed in the manuscript has been approved by all of us.

We further confirm that the UA-speech database used in the work covered in this manuscript was acquired from the protected ISLE data server, with access provided by Mark Hasegawa-Johnson, Professor, ECE, University of Illinois. The audio recordings of human patients were made with the ethical approval of all relevant bodies, and subjects who refused permission are not represented in the database distribution.

We understand that the Corresponding Author is the sole contact for the Editorial process. The Corresponding Author is responsible for communicating with the other authors about progress, submissions of revisions and final approval of proofs. We confirm that we have provided a current, correct email address which is accessible by the Corresponding Author.

Author information

Authors and Affiliations

Department of Electronics and Communication Engineering, ASET, Amity University Uttar Pradesh, Noida, UP, India
Arpan Roy & Lakshya Thakur
Computer Science Engineering, ASET, Amity University Uttar Pradesh, Noida, UP, India
Garima Vyas & Gaurav Raj

Corresponding author

Correspondence to Arpan Roy.

Editor information

Editors and Affiliations

Department of Computer Science and Software Engineering, Monmouth University, West Long Branch, NJ, USA
Jiacun Wang
Department of Information Technology, National Institute of Technology Karnataka, Surathkal, Mangaluru, Karnataka, India
G. Ram Mohana Reddy
Department of Computer Science and Engineering, JNTUH College of Engineering Hyderabad, Hyderabad, Telangana, India
V. Kamakshi Prasad
Department of Electronics and Communication Engineering, Malla Reddy College of Engineering & Technology, Secunderabad, Telangana, India
V. Sivakumar Reddy

About this paper

Cite this paper

Roy, A., Thakur, L., Vyas, G., Raj, G. (2019). Towards Improving the Intelligibility of Dysarthric Speech. In: Wang, J., Reddy, G., Prasad, V., Reddy, V. (eds) Soft Computing and Signal Processing. Advances in Intelligent Systems and Computing, vol 898. Springer, Singapore. https://doi.org/10.1007/978-981-13-3393-4_56

DOI: https://doi.org/10.1007/978-981-13-3393-4_56
Published: 14 February 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-3392-7
Online ISBN: 978-981-13-3393-4
eBook Packages: Intelligent Technologies and Robotics (R0)
