Abstract
This study proposes an unsupervised noise reduction scheme that improves the performance of voice-based information retrieval tasks in mobile environments. Many types of noise can interfere with speech processing, which makes noise reduction an essential technique in this field. In mobile environments in particular, noise reduction must be designed with the speech coding system and the client-server architecture in mind. We propose a noise reduction scheme based on adaptive comb filtering and investigate the direct use of several codec parameters during the filtering process. Specifically, we modify the conventional comb filter using line spectral pair parameters. To verify the effectiveness of the proposed approach, we conducted speech recognition experiments on the Aurora2 database. The approach achieved better recognition performance than conventional techniques under various noise conditions.
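For readers unfamiliar with comb filtering, the following is a minimal sketch of a conventional pitch-adaptive comb filter, not the modified filter proposed in this study: it uses no codec or line spectral pair parameters. The function names, the autocorrelation-based pitch estimate, the 60–400 Hz search range, and the fixed weights (0.25, 0.5, 0.25) are all assumptions chosen for illustration.

```python
import numpy as np


def estimate_pitch_period(frame, fs, f_min=60.0, f_max=400.0):
    """Estimate the pitch period (in samples) from the autocorrelation peak.

    Hypothetical helper: the search range f_min..f_max is an assumed
    range for voiced speech, not a value taken from the article.
    """
    frame = frame - np.mean(frame)
    # Keep only non-negative lags of the full autocorrelation.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(fs / f_max)
    lag_max = min(int(fs / f_min), len(ac) - 1)
    return lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))


def comb_filter(frame, period):
    """Apply a three-tap FIR comb filter tuned to the given pitch period.

    y[n] = 0.25*x[n-T] + 0.5*x[n] + 0.25*x[n+T], with zeros assumed
    outside the frame. The response 0.5 + 0.5*cos(w*T) is 1 at the
    pitch harmonics and 0 midway between them.
    """
    padded = np.pad(frame, period)           # zero-pad T samples on each side
    delayed = padded[:len(frame)]            # x[n - T]
    advanced = padded[2 * period:]           # x[n + T]
    return 0.25 * delayed + 0.5 * frame + 0.25 * advanced
```

In an adaptive setting the period would be re-estimated for each voiced frame, so that the passbands of the filter follow the harmonics of the speech while energy between the harmonics, where mostly noise resides, is attenuated.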
Acknowledgments
This research was supported by the NAP (National Agenda Project) of the Korea Research Council of Fundamental Science and Technology, and by the Converging Research Center Program through the Ministry of Science, ICT and Future Planning, Korea (2013K000358).
Cite this article
Park, JS., Jang, GJ., Kim, JH. et al. Unsupervised noise reduction scheme for voice-based information retrieval in mobile environments. Multimed Tools Appl 75, 4981–4996 (2016). https://doi.org/10.1007/s11042-013-1788-y
DOI: https://doi.org/10.1007/s11042-013-1788-y