International Journal of Speech Technology, Volume 22, Issue 4, pp 959–969

A novel system for effective speech recognition based on artificial neural network and opposition artificial bee colony algorithm

  • Shilpi Shukla
  • Madhu Jain

Abstract

Speech recognition becomes challenging when the vocabulary contains many similar-sounding words. To address this challenge, an effective speech recognition system based on an artificial neural network (ANN) combined with an optimization technique is proposed. In this system, distinct words spoken by different people are taken as the input speech signals. Features of these signals are extracted using the amplitude modulation spectrogram and then fed to the ANN for training. The trained ANN is used to predict the isolated words during testing. In this work, the default structure of the ANN is redesigned using the Levenberg–Marquardt algorithm to obtain an optimal prediction rate. The number of hidden layers and the number of neurons in each hidden layer are further optimized using the opposition artificial bee colony optimization technique. The results show that the sensitivity, specificity, and accuracy of the proposed technique are 90.41%, 99.66%, and 99.36%, respectively, which is reported to be better than the existing methods it is compared with.
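The abstract does not spell out how the opposition step works, so the following is only a minimal sketch of generic opposition-based initialization as it is commonly described for bee colony style searches: each random candidate x in the range [lo, hi] is paired with its opposite lo + hi - x, and the better half of the combined pool is kept. The search space (two hidden-layer sizes between 1 and 20), the function names, and the `toy_error` fitness are all hypothetical stand-ins; in the proposed system the fitness would presumably come from training and evaluating the ANN.

```python
import random


def opposite(x, lo, hi):
    """Return the opposite point of x inside the box [lo, hi]: lo + hi - x per dimension."""
    return [l + h - v for v, l, h in zip(x, lo, hi)]


def opposition_init(fitness, lo, hi, n_sources, rng):
    """Draw n_sources random candidates, add their opposites,
    and keep the n_sources candidates with the lowest fitness."""
    randoms = [[rng.randint(l, h) for l, h in zip(lo, hi)] for _ in range(n_sources)]
    pool = randoms + [opposite(x, lo, hi) for x in randoms]
    return sorted(pool, key=fitness)[:n_sources]


def toy_error(candidate):
    """Hypothetical stand-in for the validation error of an ANN whose
    hidden-layer sizes are given by `candidate` (assumption, not from the paper)."""
    target = (12, 7)
    return sum((c - t) ** 2 for c, t in zip(candidate, target))


# Assumed search space: two hidden layers, 1 to 20 neurons each.
lo, hi = [1, 1], [20, 20]
population = opposition_init(toy_error, lo, hi, n_sources=5, rng=random.Random(0))
print(population)
```

Evaluating both a point and its opposite costs twice as many fitness calls at initialization, but it spreads the starting food sources across the search range, which is the usual motivation given for opposition-based variants.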

Keywords

Speech signal, Amplitude modulation spectrogram, Artificial neural network, Levenberg–Marquardt algorithm, Opposition artificial bee colony

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Electronics and Communication Engineering, Mahatma Gandhi Mission’s College of Engineering & Technology, Noida, India
  2. Department of Electronics and Communication Engineering, Jaypee Institute of Information Technology, Noida, India
