Abstract
Speech recognition allows machines to convert incoming speech signals into text commands. This paper presents a brief survey of automatic speech recognition systems based on HTK, Julius, MATLAB, Sphinx, and Kaldi. Each of these systems is described, and their structure and performance are presented.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Barkani, F., Satori, H., Hamidi, M., Zealouk, O., Laaidi, N. (2020). Comparative Evaluation of Speech Recognition Systems Based on Different Toolkits. In: Bhateja, V., Satapathy, S., Satori, H. (eds) Embedded Systems and Artificial Intelligence. Advances in Intelligent Systems and Computing, vol 1076. Springer, Singapore. https://doi.org/10.1007/978-981-15-0947-6_4
DOI: https://doi.org/10.1007/978-981-15-0947-6_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-0946-9
Online ISBN: 978-981-15-0947-6
eBook Packages: Intelligent Technologies and Robotics (R0)