Abstract
This paper investigates Indian English from the point of view of a speech recognition problem. We propose a novel approach to building an automatic speech recognition (ASR) system for Indian English using PocketSphinx. The system was trained on a database of English words spoken by Indian speakers in three different accents, using both continuous and semi-continuous models. We compare the performance in each case; in the best case, accuracy approaches 98%. Based on this study, we modified the original PocketSphinx Android application to incorporate our results and present it as an SMS-sending application for Indian English. We are extending this approach to identify ways of training a speech recognition system to recognize a much wider variety of Indian accents with higher accuracy.
© 2016 Springer India
About this paper
Cite this paper
Mandal, P., Ojha, G., Shukla, A., Agrawal, S.S. (2016). Accoustic Modeling for Development of Accented Indian English ASR. In: Dash, S., Bhaskar, M., Panigrahi, B., Das, S. (eds) Artificial Intelligence and Evolutionary Computations in Engineering Systems. Advances in Intelligent Systems and Computing, vol 394. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2656-7_16
DOI: https://doi.org/10.1007/978-81-322-2656-7_16
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2654-3
Online ISBN: 978-81-322-2656-7
eBook Packages: Engineering, Engineering (R0)