A Quinphone-Based Context-Dependent Acoustic Modeling for LVCSR

  • Priyanka Sahu
  • Mohit Dua
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 555)

Abstract

Automatic speech recognition (ASR) converts a speech signal into text accurately and efficiently. Typically, the speech signal is taken as input and processed at the front end to extract features, which are then scored at the back end using a Gaussian mixture model (GMM). The number of GMM mixtures must be chosen carefully, and the right choice depends on the size of the dataset. For small vocabularies, triphone-based acoustic modeling gives good results, but for large vocabularies, quinphone (quadraphone)-based acoustic modeling performs better. This paper compares the performance of context-independent and context-dependent acoustic modeling with the aim of reducing the error rate.
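To make the distinction between context-independent and context-dependent units concrete, the following minimal sketch expands a phone sequence into monophone, triphone, and quinphone labels. It is an illustration only, not the authors' implementation: the function name `context_units`, the `left-centre+right` label format, and the `sil` padding symbol are all assumptions introduced here.

```python
from typing import List


def context_units(phones: List[str], width: int, pad: str = "sil") -> List[str]:
    """Label each phone with `width` neighbouring phones on each side.

    width=0 yields context-independent monophones, width=1 triphones,
    and width=2 quinphones. The label format and the padding symbol
    are hypothetical choices made for this example.
    """
    if width < 0:
        raise ValueError("width must be non-negative")
    # Pad both ends so the first and last phones also have full context.
    padded = [pad] * width + list(phones) + [pad] * width
    units = []
    for i in range(width, width + len(phones)):
        centre = padded[i]
        if width == 0:
            units.append(centre)
            continue
        left = padded[i - width:i]
        right = padded[i + 1:i + 1 + width]
        units.append(f"{'_'.join(left)}-{centre}+{'_'.join(right)}")
    return units


phones = ["k", "ae", "t"]
print(context_units(phones, 0))  # context-independent monophones
print(context_units(phones, 1))  # triphones: one neighbour per side
print(context_units(phones, 2))  # quinphones: two neighbours per side
```

With a phone set of size N, there are at most N^3 distinct triphones but N^5 distinct quinphones, which is one reason wider contexts generally call for more training data.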

Keywords

Speech recognition; Speech modeling; Feature extraction techniques; Gaussian mixture model; LVCSR

Copyright information

© Springer Nature Singapore Pte Ltd. 2017

Authors and Affiliations

  1. National Institute of Technology Kurukshetra, Kurukshetra, India
