Abstract
This paper describes the real-time challenges of designing a telephonic Automatic Speech Recognition (ASR) system. Telephonic speech data are collected automatically from all geographical regions of West Bengal to cover the major dialectal variations of spoken Bangla. All incoming calls are handled by an Asterisk server, which acts as the computer telephony interface (CTI). The system asks a set of queries, and the users' spoken responses are stored and manually transcribed for ASR system training. When the telephonic ASR is in use, users' voice queries first pass through the Signal Analysis and Decision (SAD) module; depending on its decision, the speech signal may then enter the back-end ASR engine, and the relevant information is automatically delivered to the user. In a real-time scenario, telephonic speech contains channel drops, silence or no-speech events, truncated speech, noisy signals, etc., along with the desired speech event. This paper presents techniques that handle such unwanted signals in telephonic speech to a certain extent and provide a signal close to the desired speech to the ASR system. The performance of the real-time telephonic ASR system increased by 8.91 % after the SAD module was implemented.
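The abstract does not specify how the SAD module reaches its decision. As an illustration only, the gating idea can be sketched with a short-time energy check, which is a common generic technique and not necessarily the authors' method. The function names, the 20 ms frame length at an assumed 8 kHz sampling rate, and every threshold below are assumptions chosen for the example.

```python
import math


def frame_energies_db(samples, frame_len=160):
    """Short-time energy (dB) of consecutive non-overlapping frames.

    `samples` is assumed to hold floats in [-1.0, 1.0]; 160 samples
    correspond to 20 ms at an assumed 8 kHz telephone sampling rate.
    """
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        mean_sq = sum(x * x for x in frame) / frame_len
        # Small offset avoids log10(0) on digitally silent frames.
        energies.append(10.0 * math.log10(mean_sq + 1e-12))
    return energies


def sad_decision(samples, frame_len=160, silence_db=-40.0,
                 min_speech_frames=5, edge_frames=2):
    """Hypothetical gate: decide whether a recording goes to the ASR engine.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    energies = frame_energies_db(samples, frame_len)
    active = [e > silence_db for e in energies]

    # Silence, no-speech event, or channel drop: too few energetic frames.
    if sum(active) < min_speech_frames:
        return "reject: no speech"

    # Speech touching either edge of the recording suggests truncation.
    if any(active[:edge_frames]) or any(active[-edge_frames:]):
        return "reject: possibly truncated"

    return "accept"
```

Under these assumptions, only recordings labelled "accept" would be forwarded to the back-end ASR engine; a rejected call could instead trigger a re-prompt to the user. A deployed module would also need a noise check, which this sketch leaves out.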
© 2013 Springer India
About this paper
Cite this paper
Basu, J., Bepari, M.S., Roy, R., Khan, S. (2013). Real Time Challenges to Handle the Telephonic Speech Recognition System. In: S, M., Kumar, S. (eds) Proceedings of the Fourth International Conference on Signal and Image Processing 2012 (ICSIP 2012). Lecture Notes in Electrical Engineering, vol 222. Springer, India. https://doi.org/10.1007/978-81-322-1000-9_38
DOI: https://doi.org/10.1007/978-81-322-1000-9_38
Publisher Name: Springer, India
Print ISBN: 978-81-322-0999-7
Online ISBN: 978-81-322-1000-9
eBook Packages: Engineering, Engineering (R0)