An Automatic Tamil Speech Recognition system by using Bidirectional Recurrent Neural Network with Self-Organizing Map

  • S. Lokesh
  • Priyan Malarvizhi Kumar
  • M. Ramya Devi
  • P. Parthasarathy
  • C. Gokulnath
S.I. : Emerging Intelligent Algorithms for Edge-of-Things Computing

Abstract

Speech recognition is one of the most fascinating fields in computer science. The accuracy of a speech recognition system may decrease because of noise present in the speech signal. Consequently, noise removal is an essential step in an automatic speech recognition (ASR) system. ASR is researched for various languages because every language has its own particular features. In particular, the need for an ASR system for the Tamil language has grown considerably over the last few years. In this work, a classification scheme based on a bidirectional recurrent neural network (BRNN) with a self-organizing map (SOM) is proposed for Tamil speech recognition. First, the input speech signal is preprocessed using a Savitzky–Golay filter in order to remove background noise and enhance the signal. Then, multivariate autoregressive (MAR) features are extracted, with a discrete cosine transformation (DCT) introduced to provide an efficient signal analysis. In addition, perceptual linear predictive (PLP) coefficients are extracted to improve the classification accuracy. Because the feature vector varies in size, an SOM is used to select a feature vector of the appropriate length. Finally, Tamil digits and words are classified using a BRNN classifier that takes the fixed-length feature vector from the SOM as input; the combined scheme is named BRNN-SOM. The experimental analysis shows that the proposed scheme achieves better results than the existing deep neural network–hidden Markov model algorithm in terms of signal-to-noise ratio, classification accuracy, and mean square error.
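
As a rough illustration of the preprocessing step and two of the evaluation metrics named above, the following sketch smooths a noisy signal with a Savitzky–Golay filter and reports signal-to-noise ratio and mean square error before and after. It is not the authors' implementation: the abstract does not give the window length or polynomial order, so the 5-point quadratic window, the edge handling, the synthetic sine wave standing in for speech, and the helper names (`savgol_smooth`, `snr_db`, `mse`, `demo`) are all assumptions made for the example.

```python
import math
import random

# 5-point quadratic/cubic Savitzky-Golay smoothing coefficients
# (-3, 12, 17, 12, -3) / 35. The window size is an assumption;
# the abstract does not state the filter settings.
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol_smooth(signal, coeffs=SG5):
    """Smooth `signal` by sliding symmetric Savitzky-Golay weights over it.

    Edges are padded by repeating the first and last sample
    (a simple choice made for this sketch).
    """
    half = len(coeffs) // 2
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [
        sum(c * padded[i + k] for k, c in enumerate(coeffs))
        for i in range(len(signal))
    ]


def mse(reference, estimate):
    """Mean square error between a reference and an estimate."""
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB, treating (reference - estimate) as noise."""
    signal_power = sum(r * r for r in reference) / len(reference)
    return 10.0 * math.log10(signal_power / mse(reference, estimate))


def demo(n=400, noise_std=0.2, seed=0):
    """Build a synthetic clean/noisy pair and smooth the noisy one."""
    rng = random.Random(seed)
    clean = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
    noisy = [x + rng.gauss(0.0, noise_std) for x in clean]
    return clean, noisy, savgol_smooth(noisy)


clean, noisy, smoothed = demo()
print(f"SNR before: {snr_db(clean, noisy):.2f} dB, after: {snr_db(clean, smoothed):.2f} dB")
print(f"MSE before: {mse(clean, noisy):.4f}, after: {mse(clean, smoothed):.4f}")
```

In practice a library routine such as SciPy's `scipy.signal.savgol_filter` would normally replace the hand-written convolution; the fixed coefficients are used here only to keep the sketch free of dependencies.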

Keywords

Automatic Tamil Speech Recognition; Preprocessing; Feature extraction; Classification; Bidirectional Recurrent Neural Network (BRNN); Self-Organizing Map (SOM); Savitzky–Golay Filter (SGF); Multivariate Autoregressive (MAR); Discrete Cosine Transformation (DCT); Perceptual Linear Predictive (PLP)

Notes

Compliance with ethical standards

Conflict of interest

This statement is to certify that all authors have seen and approved the manuscript being submitted. We warrant that the article is the authors’ original work. We warrant that the article has not received prior publication and is not under consideration for publication elsewhere. On behalf of all co-authors, the corresponding author shall bear full responsibility for the submission. The author(s) declare that there is no conflict of interest.

Copyright information

© The Natural Computing Applications Forum 2018

Authors and Affiliations

  • S. Lokesh (1)
  • Priyan Malarvizhi Kumar (2)
  • M. Ramya Devi (3)
  • P. Parthasarathy (2)
  • C. Gokulnath (2)

  1. Department of Computer Science and Engineering, Hindusthan Institute of Technology, Coimbatore, India
  2. VIT University, Vellore, India
  3. Department of Computer Science and Engineering, Hindusthan College of Engineering and Technology, Coimbatore, India
