Comparative Evaluation of Speech Recognition Systems Based on Different Toolkits

Barkani, Fatima; Satori, Hassan; Hamidi, Mohamed; Zealouk, Ouissam; Laaidi, Naouar

doi:10.1007/978-981-15-0947-6_4


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1076)


Abstract

Speech recognition allows machines to convert incoming speech signals into text commands. This paper presents a brief survey of automatic speech recognition systems based on HTK, Julius, MATLAB, Sphinx and Kaldi. Each of these systems is described, and the structure and performance of the different systems are presented.



Author information

Authors and Affiliations

LIIAN Laboratory, Faculty of Sciences Dhar Mahraz, Sidi Mohammed Ben Abdellah University, Fez, Morocco
Fatima Barkani, Hassan Satori, Mohamed Hamidi, Ouissam Zealouk & Naouar Laaidi


Editor information

Editors and Affiliations

Department of Electronics and Communication Engineering, Shri Ramswaroop Memorial Group of Professional Colleges (SRMGPC), Lucknow, Uttar Pradesh, India
Vikrant Bhateja
School of Computer Engineering, Kalinga Institute of Industrial Technology (KIIT), Bhubaneswar, Odisha, India
Suresh Chandra Satapathy
Department of Computer Sciences, Faculty of Sciences Dhar Mahraz, Sidi Mohammed Ben Abdellah University, Fez, Morocco
Hassan Satori


Cite this paper

Barkani, F., Satori, H., Hamidi, M., Zealouk, O., Laaidi, N. (2020). Comparative Evaluation of Speech Recognition Systems Based on Different Toolkits. In: Bhateja, V., Satapathy, S., Satori, H. (eds) Embedded Systems and Artificial Intelligence. Advances in Intelligent Systems and Computing, vol 1076. Springer, Singapore. https://doi.org/10.1007/978-981-15-0947-6_4


DOI: https://doi.org/10.1007/978-981-15-0947-6_4
Published: 08 April 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-0946-9
Online ISBN: 978-981-15-0947-6
eBook Packages: Intelligent Technologies and Robotics (R0)
