Generation of GMM Weights by Dirichlet Distribution and Model Selection Using Information Criterion for Malayalam Speech Recognition

  • Lekshmi Krishna Ramachandran
  • Sherly Elizabeth
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11278)

Abstract

Automatic Speech Recognition (ASR) is the computer-driven transcription of spoken language into human-readable text. This paper focuses on the development of an acoustic model for a medium-vocabulary, context-independent, isolated Malayalam speech recognizer using Hidden Markov Models (HMMs). In this work, the emission probabilities of the syllable HMMs are estimated with Gaussian Mixture Models (GMMs). Mel Frequency Cepstral Coefficients (MFCCs) are used for feature extraction from the input speech. The mixture weights of the GMMs are generated using a Dirichlet distribution. The resulting GMMs are then evaluated with several information criteria: the Akaike Information Criterion (AIC), the Bayes Information Criterion (BIC), the corrected AIC (AICc), the Kullback Information Criterion (KIC), the corrected KIC (KICc) and the approximated KIC (AKICc). On the medium-vocabulary, speaker-dependent, isolated Malayalam speech corpus, a single Gaussian gives an accuracy of 90.91% and a Word Error Rate (WER) of 11.9%. The word accuracy and WER of the system are also calculated from experiments conducted with multivariate Gaussians. With five Gaussian mixtures, a better word accuracy of 95.24% and a WER of 4.76% are attained, and this result is verified using the information criteria.
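
The two ideas named in the abstract, drawing GMM mixture weights from a Dirichlet distribution and scoring the resulting model with information criteria, can be illustrated with a minimal Python sketch. This is not the authors' implementation (the references suggest HTK was used); the function names, the symmetric concentration parameter alpha, the diagonal-covariance assumption and the 13-dimensional synthetic "MFCC-like" data are all assumptions made for illustration. Only AIC, BIC and AICc are shown, in their standard forms; the KIC variants are omitted.

```python
import numpy as np


def dirichlet_weights(n_components, alpha=1.0, seed=0):
    """Draw one set of mixture weights from a symmetric Dirichlet(alpha).

    The draw is non-negative and sums to 1, so it is a valid weight vector.
    `alpha` and `seed` are illustrative choices, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    return rng.dirichlet(alpha * np.ones(n_components))


def diag_gmm_log_likelihood(X, weights, means, variances):
    """Total log-likelihood of X (n, d) under a diagonal-covariance GMM."""
    diff = X[:, None, :] - means[None, :, :]                      # (n, M, d)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (M,)
    log_comp = log_norm[None, :] - 0.5 * np.sum(
        diff ** 2 / variances[None, :, :], axis=2
    )                                                             # (n, M)
    log_weighted = np.log(weights)[None, :] + log_comp
    # Log-sum-exp over components for numerical stability.
    m = log_weighted.max(axis=1, keepdims=True)
    per_frame = m[:, 0] + np.log(np.sum(np.exp(log_weighted - m), axis=1))
    return float(np.sum(per_frame))


def n_free_params(n_components, dim):
    """Free parameters: (M - 1) weights + M*d means + M*d variances."""
    return (n_components - 1) + 2 * n_components * dim


def information_criteria(log_lik, k, n):
    """Standard AIC, BIC and small-sample corrected AIC (lower is better)."""
    aic = -2.0 * log_lik + 2.0 * k
    bic = -2.0 * log_lik + k * np.log(n)
    aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)
    return {"AIC": aic, "BIC": bic, "AICc": aicc}


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 13))  # synthetic stand-in for MFCC frames
    for M in range(1, 6):
        w = dirichlet_weights(M, seed=M)
        # Crude, untrained parameters: means are random frames, variances
        # are the global per-dimension variance. A real system would
        # refine all of these with EM before comparing models.
        means = X[rng.choice(len(X), size=M, replace=False)]
        variances = np.tile(X.var(axis=0), (M, 1))
        ll = diag_gmm_log_likelihood(X, w, means, variances)
        print(M, information_criteria(ll, n_free_params(M, 13), len(X)))
```

Under each criterion, the candidate number of mixtures with the lowest score would be preferred; the penalty terms grow with the parameter count, which is what discourages adding components that do not improve the fit enough.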

Keywords

Acoustic model; Akaike information criterion; Bayes information criterion; HMM; GMM; Dirichlet distribution; ASR; MFCC; Kullback information criterion; Bias correction of Kullback information criterion; Approximation of Kullback information criterion; Word error rate

Notes

Acknowledgment

This research is supported by the Kerala State Council for Science, Technology and Environment (KSCSTE). I thank KSCSTE for funding the project under the Back-to-lab scheme.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Bharathiar University, Coimbatore, India
  2. Indian Institute of Information Technology and Management-Kerala, Thiruvananthapuram, India