Towards Improving the Intelligibility of Dysarthric Speech

  • Arpan Roy
  • Lakshya Thakur
  • Garima Vyas
  • Gaurav Raj
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 898)

Abstract

Humans use many muscles to produce intelligible speech, including those of the lips, face and throat. Dysarthria is a speech disorder that arises when these muscles are weakened by brain damage. Its primary characteristics are slurred and slow speech, which can be difficult to understand depending on the severity of the condition. This paper proposes an approach to improve the intelligibility of dysarthric speech using a simple yet effective speech-transformation technique: warping the frequencies of the linear predictive coding (LPC) poles and mapping the LPC coefficients. The technique was applied to dysarthric audio from the UA-speech database. Both objective and subjective measures were used to evaluate the transformed speech. The results point towards a significant improvement in the intelligibility of the dysarthric speech. The method can be used to develop special voice-enabled search platforms for dysarthric patients and to support their rehabilitation through speech therapy based on auditory feedback.
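
As a rough illustration of what warping the frequencies of LPC poles can look like, the sketch below estimates LPC coefficients for one speech frame with the autocorrelation method and then scales the angle of each complex pole. The abstract does not specify the warping function or the coefficient mapping, so the linear scaling factor `alpha`, the clipping at the Nyquist frequency, and the function names `lpc_coefficients` and `warp_lpc_poles` are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC (Levinson-Durbin).

    Returns [1, a1, ..., ap] for the inverse filter A(z) = 1 + sum a_k z^-k.
    """
    n = len(frame)
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def warp_lpc_poles(a, alpha):
    """Scale the angle (frequency) of each complex LPC pole by alpha.

    Assumption: a simple linear warp. Real poles are left untouched so the
    warped polynomial keeps real coefficients; pole radii are preserved.
    """
    poles = np.roots(a).astype(complex)
    is_complex = np.abs(poles.imag) > 1e-8
    angles = np.angle(poles)
    # Clip so that warped frequencies never exceed the Nyquist frequency.
    warped_angles = np.where(
        is_complex, np.clip(alpha * angles, -np.pi, np.pi), angles
    )
    warped = np.abs(poles) * np.exp(1j * warped_angles)
    return np.real(np.poly(warped))


if __name__ == "__main__":
    # Hypothetical frame: white noise standing in for a windowed speech frame.
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(400) * np.hamming(400)
    a = lpc_coefficients(frame, order=10)
    a_warped = warp_lpc_poles(a, alpha=1.1)
    print(a_warped)
```

Scaling a pole's angle shifts the corresponding resonance frequency by the same factor and leaves its radius unchanged. Resynthesis would then filter the LPC residual through the all-pole filter defined by the warped coefficients, a step this sketch leaves out.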

Keywords

Dysarthria, LPC, Speech intelligibility, Frequency warping, Speech therapy, Speech enhancement, Auditory feedback

Notes

Declaration

We, the authors, declare that this manuscript is original, has not been published before and is not currently being considered for publication elsewhere.

We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed. We further confirm that the order of authors listed in the manuscript has been approved by all of us.

We further confirm that the UA-speech database used in the work covered in this manuscript was acquired from the protected ISLE data server, with access provided by Mark Hasegawa-Johnson, Professor, ECE, University of Illinois. The audio recordings of human patients were made with the ethical approval of all relevant bodies, and subjects who refused permission are not represented in the database distribution.

We understand that the Corresponding Author is the sole contact for the Editorial process. The Corresponding Author is responsible for communicating with the other authors about progress, submissions of revisions and final approval of proofs. We confirm that we have provided a current, correct email address which is accessible by the Corresponding Author.

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Arpan Roy (1), corresponding author
  • Lakshya Thakur (1)
  • Garima Vyas (2)
  • Gaurav Raj (2)
  1. Department of Electronics and Communication Engineering, ASET, Amity University Uttar Pradesh, Noida, India
  2. Computer Science Engineering, ASET, Amity University Uttar Pradesh, Noida, India
