Automated speech signal analysis based on feature extraction and classification of spasmodic dysphonia: a performance comparison of different classifiers
Spasmodic dysphonia is a voice disorder caused by involuntary spasms of the muscles in the voice box. These spasms can lead to breathy, soundless voice breaks and a strangled voice by interrupting the opening of the vocal folds. There is no specific test for diagnosing spasmodic dysphonia. Its cause is unknown and there is no cure, but treatments can improve the quality of the voice. The objectives of the study are (i) to diagnose dysphonia and to compare continuous speech signals with the sustained phonation /a/ by extracting acoustic features; (ii) to extract the acoustic features by a semi-automated method using PRAAT software and by an automated method using the FFT algorithm; and (iii) to classify normal subjects and spasmodic dysphonia patients using different classifiers, namely the Levenberg-Marquardt back propagation algorithm, K-Nearest Neighbor (KNN) and Support Vector Machine (SVM), and to compare them on sensitivity and accuracy. Thirty normal and thirty abnormal subjects were considered in the study. The performance of the three classifiers was studied, and it was observed that SVM and KNN were 100% accurate, whereas the Levenberg-Marquardt back propagation network (BPN) produced an accuracy of about 96.7%. The voice samples of dysphonia patients showed variations from the normal speech samples. The automated analysis method was able to detect dysphonia and provided better results than the semi-automated method.
Keywords: Dysphonia, SVM, Back propagation, Speech signal, Acoustic features
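The classification and scoring step named in the abstract can be illustrated with a minimal sketch. It assumes each voice sample has already been reduced to a numeric feature vector; the two-dimensional feature values, the labels (0 = normal, 1 = dysphonic), the choice k = 3, and the function names are all hypothetical and are not taken from the study. The sketch shows a plain K-Nearest Neighbor vote and how sensitivity and accuracy would be computed from its predictions.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def sensitivity_and_accuracy(y_true, y_pred, positive=1):
    """Return (sensitivity, accuracy) for the given true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    correct = sum(t == p for t, p in pairs)
    return tp / (tp + fn), correct / len(pairs)

# Hypothetical, made-up feature vectors (not data from the study).
TRAIN_X = [(0.20, 0.30), (0.30, 0.20), (0.25, 0.35),   # normal
           (0.80, 0.90), (0.90, 0.70), (0.85, 0.80)]   # dysphonic
TRAIN_Y = [0, 0, 0, 1, 1, 1]

TEST_X = [(0.22, 0.28), (0.88, 0.82), (0.30, 0.30), (0.80, 0.75)]
TEST_Y = [0, 1, 0, 1]

predictions = [knn_predict(TRAIN_X, TRAIN_Y, x) for x in TEST_X]
sens, acc = sensitivity_and_accuracy(TEST_Y, predictions)
print(f"sensitivity={sens:.3f} accuracy={acc:.3f}")
```

Sensitivity here is the fraction of dysphonic samples that are correctly flagged, and accuracy is the fraction of all samples labelled correctly. The same scoring function could be applied to the output of any of the three classifiers.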