Acoustic Modeling for Development of Accented Indian English ASR
This paper investigates Indian English as a speech recognition problem. We propose a novel approach to building an automatic speech recognition (ASR) system for Indian English using PocketSphinx. The system was trained on a database of English words spoken by Indians in three different accents, using continuous as well as semi-continuous models. We compared the performance in each case; in the best case, accuracy approaches 98%. Based on this study, we modified the original PocketSphinx Android application to incorporate our results and present it as an Indian English-based SMS sending application. We are extending this approach to identify ways of training a speech recognition system to recognize a much wider variety of Indian accents with significantly higher accuracy.
Keywords: Automatic speech recognition, Indian English, Discrete HMMs