Accoustic Modeling for Development of Accented Indian English ASR

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 394)

Abstract

This paper investigates Indian English as a speech recognition problem. A novel approach to building an Automatic Speech Recognition (ASR) system for Indian English using PocketSphinx is proposed. The system was trained on a database of English words spoken by Indians in three different accents, using both continuous and semi-continuous models. We compare the performance in each case; the best configuration comes close to 98% accuracy. Based on this study, we modified the original PocketSphinx Android application to incorporate our results and present it as an Indian English-based SMS sending application. We are continuing this work to identify ways of training a speech recognition system to recognize a much wider variety of Indian accents with higher accuracy.

Corresponding author

Correspondence to Gaurav Ojha.

Copyright information

© 2016 Springer India

About this paper

Cite this paper

Mandal, P., Ojha, G., Shukla, A., Agrawal, S.S. (2016). Accoustic Modeling for Development of Accented Indian English ASR. In: Dash, S., Bhaskar, M., Panigrahi, B., Das, S. (eds) Artificial Intelligence and Evolutionary Computations in Engineering Systems. Advances in Intelligent Systems and Computing, vol 394. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2656-7_16

  • DOI: https://doi.org/10.1007/978-81-322-2656-7_16

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-2654-3

  • Online ISBN: 978-81-322-2656-7

  • eBook Packages: Engineering, Engineering (R0)
