Higher Accuracy of Hindi Speech Recognition Due to Online Speaker Adaptation

Sivaraman, Ganesh; Mehta, Swapnil; Nabar, Neeraj; Samudravijaya, K.

doi:10.1007/978-3-642-20209-4_33

Part of the book series: Communications in Computer and Information Science (CCIS, volume 145)

Abstract

Speaker adaptation is a technique used to improve the recognition accuracy of Automatic Speech Recognition (ASR) systems. Here, we report a study of the impact of online speaker adaptation on the performance of a speaker-independent, continuous speech recognition system for the Hindi language. Speaker adaptation is performed using the Maximum Likelihood Linear Regression (MLLR) transformation approach. The ASR system was trained using narrowband speech. The efficacy of speaker adaptation is studied using an unrelated speech database. MLLR transform based speaker adaptation is found to significantly improve the accuracy of the Hindi ASR system, by 3%.
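
As an illustration of the MLLR idea named in the abstract, the sketch below estimates a single global mean transform W = [b A] and applies it to Gaussian means as mu_adapted = A mu + b. It is not the authors' implementation. It assumes that all Gaussians share one covariance (so the maximum-likelihood estimate reduces to a weighted least-squares problem), that frame-level Gaussian posteriors are already available, and that one transform is shared by all Gaussians. The function names (solve_linear, estimate_global_mllr, adapt_means) are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def estimate_global_mllr(means, gammas, obs):
    """Estimate W (dim x (dim+1)) from adaptation data.

    means  : list of M mean vectors (speaker-independent model)
    gammas : list of T rows, each with M Gaussian posteriors for that frame
    obs    : list of T observation vectors from the new speaker
    Assumption: every Gaussian has the same covariance, so it cancels out.
    """
    dim = len(means[0])
    size = dim + 1
    ext = [[1.0] + list(mu) for mu in means]  # extended means [1, mu]
    G = [[0.0] * size for _ in range(size)]   # sum of gamma * xi xi^T
    Z = [[0.0] * size for _ in range(dim)]    # sum of gamma * o xi^T
    for o_t, g_t in zip(obs, gammas):
        for xi, g in zip(ext, g_t):
            if g == 0.0:
                continue
            for j in range(size):
                for k in range(size):
                    G[j][k] += g * xi[j] * xi[k]
                for i in range(dim):
                    Z[i][j] += g * o_t[i] * xi[j]
    # G is symmetric, so each row w_i of W solves G w_i = z_i.
    return [solve_linear(G, Z[i]) for i in range(dim)]


def adapt_means(means, W):
    """Apply mu_adapted = A mu + b, where W = [b A]."""
    dim = len(means[0])
    return [
        [W[i][0] + sum(W[i][j + 1] * mu[j] for j in range(dim))
         for i in range(dim)]
        for mu in means
    ]
```

In an online setting, the posteriors would typically come from a first recognition pass over the new speaker's speech, and the adapted means would then be used for subsequent decoding. Practical systems also account for per-Gaussian covariances and often use several regression classes rather than one global transform.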

Author information

Authors and Affiliations

School of Technology & Computer Science, Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai, India
Ganesh Sivaraman, Swapnil Mehta, Neeraj Nabar & K. Samudravijaya

Editor information

Editors and Affiliations

SVKM’s NMIMS, MPSTME, Mukesh Patel School of Technology Management and Engineering, Bhakti Vedant Swami Marg, JVPD, 400056, Vile Parle (W), Mumbai, Maharashtra, India
Ketan Shah
SVKM’s NMIMS, MPSTME, Mukesh Patel School of Technology Management and Engineering, Bhakti Vedant Swami Marg, JVPD, 400056, Vile Parle (W), Mumbai, Maharashtra, India
V. R. Lakshmi Gorty
SVKM’s NMIMS, MPSTME, Mukesh Patel School of Technology Management and Engineering, Bhakti Vedant Swami Marg, JVPD, 400056, Vile Parle (W), Mumbai, Maharashtra, India
Ajay Phirke

About this paper

Cite this paper

Sivaraman, G., Mehta, S., Nabar, N., Samudravijaya, K. (2011). Higher Accuracy of Hindi Speech Recognition Due to Online Speaker Adaptation. In: Shah, K., Lakshmi Gorty, V.R., Phirke, A. (eds) Technology Systems and Management. Communications in Computer and Information Science, vol 145. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20209-4_33

DOI: https://doi.org/10.1007/978-3-642-20209-4_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20208-7
Online ISBN: 978-3-642-20209-4
eBook Packages: Computer Science, Computer Science (R0)
