Automatic Speech Recognition Using Deep Neural Network
Automatic speech recognition recognises spoken words and converts them into machine-readable text. By converting spoken audio into text, this technology allows users to control digital devices by speaking instead of using conventional input such as keystrokes and buttons. The challenges in speech recognition include improving accuracy, coping with varying user responsiveness, and ensuring performance, reliability and fault tolerance. The quality of the audio signal affects the recognition accuracy rate. Delayed speech recognition is used to overcome the issues caused by user responsiveness, because the pronunciation of a word differs when it is used in different contexts. As the world moves rapidly towards digitisation, new technologies are being developed to make life easier. The Interactive Voice Response System, which allows a computer to interact with humans through their voices, is one example. We propose an Interactive Voice Response System for a railway reservation system. The proposed approach uses LSTM with CTC to recognise the spoken word. In testing, the methods used to build this model achieve better accuracy than other models.
Keywords: Automatic speech recognition, MFCC, LSTM, CTC
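The abstract names LSTM with CTC but does not describe how per-frame network outputs become a word. As a minimal illustration of that final step, the sketch below shows standard greedy CTC decoding: take the most likely label in each frame, collapse consecutive repeats, then drop the blank symbol. It is not the authors' implementation. The blank symbol `_`, the function names and the toy inputs are all assumptions made for the example.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol; the paper does not specify one


def argmax_labels(frame_scores, alphabet):
    """Pick the highest-scoring label for each frame.

    frame_scores: list of per-frame score lists, one score per alphabet entry
    (for example, the per-frame outputs of an LSTM acoustic model).
    """
    return [
        alphabet[max(range(len(row)), key=row.__getitem__)]
        for row in frame_scores
    ]


def ctc_greedy_decode(frame_labels, blank=BLANK):
    """Collapse consecutive repeated labels, then remove blanks."""
    collapsed = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in collapsed if label != blank)


if __name__ == "__main__":
    # Toy frame-level label sequence; blanks separate genuine repeats.
    print(ctc_greedy_decode(list("__tt_r_aa_ii_nn__")))
    print(ctc_greedy_decode(list("hh_e_l_ll_oo")))
```

The blank between the two `l` runs in the second call is what lets a doubled letter survive the collapse step. Without it, the repeats would merge into a single `l`.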
We would like to thank the management of SSN College of Engineering for funding the GPU system, which helped us to carry out the deep learning-related research work.