Towards Improving the Intelligibility of Dysarthric Speech

  • Conference paper
Soft Computing and Signal Processing

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 898)

Abstract

Humans use many muscles to produce intelligible speech, including those of the lips, face and throat. Dysarthria is a speech disorder that arises when these muscles are weak as a result of brain damage. The primary characteristics of dysarthric speech are slurring and slowness, which can make it difficult to understand, depending on the severity of the condition. This paper proposes an approach to improving the intelligibility of dysarthric speech using a simple yet effective speech-transformation technique that warps the frequencies of the linear predictive coding (LPC) poles and maps the LPC coefficients. The technique was applied to dysarthric audio from the UA-speech database. Both objective and subjective measures were used to evaluate the transformed speech. The results point towards a significant improvement in the intelligibility of the dysarthric speech. The method could be used to develop special voice-enabled search platforms for dysarthric patients and to support their rehabilitation through speech therapy based on auditory feedback.
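The abstract does not give the warping function or the coefficient mapping, so the following is only a minimal sketch of what warping the frequency of LPC poles can look like in general, not the authors' implementation. It assumes autocorrelation-method LPC and a simple linear scaling of each pole angle; the function names `lpc` and `warp_poles` and the parameter `alpha` are hypothetical.

```python
import numpy as np


def lpc(frame, order):
    """Autocorrelation-method LPC. Returns [1, a1, ..., ap] for A(z) = 1 + sum a_k z^-k."""
    n = len(frame)
    # Autocorrelation at lags 0..order (zero lag sits at index n - 1).
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    # Toeplitz normal equations R a = -r[1..p].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))


def warp_poles(a, alpha):
    """Scale the angle (frequency) of each complex pole by alpha, keeping its radius.

    Assumption: a linear warp of the pole angle. Real poles are left untouched so
    that the warped polynomial keeps real coefficients.
    """
    poles = np.roots(a)
    radius = np.abs(poles)
    angle = np.angle(poles)
    is_complex = np.abs(poles.imag) > 1e-8
    # Keep warped angles strictly inside (-pi, pi) so conjugate pairs stay paired.
    limit = 0.999 * np.pi
    new_angle = np.where(is_complex, np.clip(alpha * angle, -limit, limit), angle)
    warped = radius * np.exp(1j * new_angle)
    # Conjugate symmetry is preserved, so the imaginary part is only rounding noise.
    return np.real(np.poly(warped))


if __name__ == "__main__":
    # Synthetic stand-in for one speech frame (no real dysarthric audio is used here).
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(400) * np.hamming(400)
    a = lpc(frame, order=10)
    a_warped = warp_poles(a, alpha=1.1)
    print(a_warped.shape)
```

Because only the pole angles change, the pole radii (and therefore the stability of the all-pole filter) are unchanged; a value of `alpha` above 1 shifts resonances upward in frequency and a value below 1 shifts them downward.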

Declaration

We the authors declare that this manuscript is original, has not been published before and is not currently being considered for publication elsewhere.

We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed. We further confirm that the order of authors listed in the manuscript has been approved by all of us.

We further confirm that the UA-speech database used in the work covered in this manuscript was acquired from the protected ISLE data server, with access provided by Mark Hasegawa-Johnson, Professor, ECE, University of Illinois. The audio recordings of human patients were made with the ethical approval of all relevant bodies, and subjects who refused permission are not represented in the database distribution.

We understand that the Corresponding Author is the sole contact for the Editorial process. The Corresponding Author is responsible for communicating with the other authors about progress, submissions of revisions and final approval of proofs. We confirm that we have provided a current, correct email address which is accessible by the Corresponding Author.

Author information

Corresponding author

Correspondence to Arpan Roy.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Roy, A., Thakur, L., Vyas, G., Raj, G. (2019). Towards Improving the Intelligibility of Dysarthric Speech. In: Wang, J., Reddy, G., Prasad, V., Reddy, V. (eds) Soft Computing and Signal Processing. Advances in Intelligent Systems and Computing, vol 898. Springer, Singapore. https://doi.org/10.1007/978-981-13-3393-4_56
