Generation of GMM Weights by Dirichlet Distribution and Model Selection Using Information Criterion for Malayalam Speech Recognition

Krishna Ramachandran, Lekshmi; Elizabeth, Sherly

doi:10.1007/978-3-030-04021-5_11

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11278)

Included in the following conference series:

International Conference on Intelligent Human Computer Interaction

Abstract

Automatic Speech Recognition (ASR) is the computer-driven transcription of spoken language into human-readable text. This paper focuses on the development of an acoustic model for a medium-vocabulary, context-independent, isolated Malayalam speech recognizer based on Hidden Markov Models (HMMs). The emission probabilities of the syllable HMMs are estimated with Gaussian Mixture Models (GMMs), and Mel Frequency Cepstral Coefficients (MFCCs) are used as the features extracted from the input speech. The GMM mixture weights are generated from a Dirichlet distribution. The resulting GMMs are evaluated with several information criteria: the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), the corrected AIC (AICc), the Kullback Information Criterion (KIC), the corrected KIC (KICc), and the approximated KICc (AKICc). On a medium-vocabulary, speaker-dependent, isolated Malayalam speech corpus, a single Gaussian gives an accuracy of 90.91% and a Word Error Rate (WER) of 11.9%. Word accuracy and WER are also calculated from experiments with multiple Gaussian mixture components. With five mixture components, a better word accuracy of 95.24% and a WER of 4.76% are attained, and this choice is verified using the information criteria.
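As a rough illustration of the two ideas named in the abstract (drawing GMM mixture weights from a Dirichlet distribution, and scoring a fitted mixture with information criteria), the following is a minimal NumPy sketch. It is not the authors' implementation: the function names, the symmetric concentration value, and the diagonal-covariance assumption are all hypothetical choices made for the example. Only AIC, BIC and AICc are shown; the KIC variants are omitted here.

```python
import numpy as np


def dirichlet_weights(n_components, concentration=1.0, seed=0):
    """Draw one set of mixture weights from a symmetric Dirichlet.

    The symmetric concentration parameter is an assumption for this sketch;
    the abstract does not state which parameters were used.
    """
    rng = np.random.default_rng(seed)
    alpha = np.full(n_components, concentration)
    return rng.dirichlet(alpha)  # non-negative and sums to 1


def gmm_param_count(n_components, dim):
    """Free parameters of a diagonal-covariance GMM (assumed covariance type)."""
    # (M - 1) free weights + M*d means + M*d variances
    return (n_components - 1) + 2 * n_components * dim


def diag_gmm_log_likelihood(X, weights, means, variances):
    """Total log-likelihood of X (n, d) under a diagonal-covariance GMM."""
    diff = X[:, None, :] - means[None, :, :]              # (n, M, d)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)              # Mahalanobis term
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)  # normaliser, shape (M,)
    )
    log_weighted = np.log(weights) + log_comp              # (n, M)
    # log-sum-exp over components for numerical stability
    m = log_weighted.max(axis=1, keepdims=True)
    per_frame = m[:, 0] + np.log(np.sum(np.exp(log_weighted - m), axis=1))
    return float(np.sum(per_frame))


def information_criteria(log_lik, k, n):
    """AIC, BIC and small-sample corrected AIC for k parameters, n samples."""
    aic = -2.0 * log_lik + 2.0 * k
    bic = -2.0 * log_lik + k * np.log(n)
    aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)
    return {"AIC": aic, "BIC": bic, "AICc": aicc}


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))          # synthetic stand-in for MFCC frames
    for M in (1, 3, 5):
        w = dirichlet_weights(M)
        means = rng.normal(size=(M, 3))    # untrained, illustrative parameters
        variances = np.ones((M, 3))
        ll = diag_gmm_log_likelihood(X, w, means, variances)
        print(M, information_criteria(ll, gmm_param_count(M, 3), len(X)))
```

In practice the Dirichlet draw would only serve as an initial or candidate set of weights, and the other parameters would be trained before the criteria are compared. For each criterion, the mixture size with the lowest value is preferred.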

Supported by the Kerala State Council for Science, Technology and Environment (KSCSTE).

L. Krishna Ramachandran—Research Scholar

S. Elizabeth—Professor

Acknowledgment

This research is supported by the Kerala State Council for Science, Technology and Environment (KSCSTE). I thank KSCSTE for funding the project under the Back-to-lab scheme.

Author information

Authors and Affiliations

Bharathiar University, Coimbatore, Tamil Nadu, India
Lekshmi Krishna Ramachandran
Indian Institute of Information Technology and Management-Kerala, Thiruvananthapuram, India
Lekshmi Krishna Ramachandran & Sherly Elizabeth

Corresponding author

Correspondence to Lekshmi Krishna Ramachandran.

Editor information

Editors and Affiliations

Indian Institute of Information Technology, Allahabad, India
Uma Shanker Tiwary

About this paper

Cite this paper

Krishna Ramachandran, L., Elizabeth, S. (2018). Generation of GMM Weights by Dirichlet Distribution and Model Selection Using Information Criterion for Malayalam Speech Recognition. In: Tiwary, U. (eds) Intelligent Human Computer Interaction. IHCI 2018. Lecture Notes in Computer Science, vol 11278. Springer, Cham. https://doi.org/10.1007/978-3-030-04021-5_11

DOI: https://doi.org/10.1007/978-3-030-04021-5_11
Published: 10 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04020-8
Online ISBN: 978-3-030-04021-5
eBook Packages: Computer Science, Computer Science (R0)