Unsupervised noise reduction scheme for voice-based information retrieval in mobile environments

Multimedia Tools and Applications

Abstract

This study proposes an unsupervised noise reduction scheme that improves the performance of voice-based information retrieval tasks in mobile environments. Various types of noise can interfere with speech processing tasks, so noise reduction has become an essential technique in this field. In mobile environments in particular, noise reduction must be designed with the speech coding system and the client-server architecture in mind. We propose an effective noise reduction scheme that employs the adaptive comb filtering technique, and we investigate a way of using several codec parameters directly during the filtering process. Specifically, we modify the conventional comb filter using line spectral pair parameters. To verify the efficiency of the proposed approach, we conducted speech recognition experiments using the Aurora2 database. Our approach provided superior recognition performance under various noise conditions compared to conventional techniques.
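As a rough illustration of the comb filtering idea mentioned in the abstract, the sketch below applies a fixed, non-adaptive comb filter to a synthetic harmonic signal. It is not the authors' method: the pitch period is assumed to be known and constant, no codec or line spectral pair parameters are used, and the names `comb_filter` and `snr_db`, the tap weights, the sampling setup, and the noise level are all hypothetical choices made for this example.

```python
import math
import random


def comb_filter(signal, pitch_period, weights=(0.25, 0.5, 0.25)):
    """Average each sample with samples one pitch period before and after it.

    A periodic (harmonic) component repeats at these offsets and is preserved,
    while uncorrelated noise is partially averaged out.
    """
    half = len(weights) // 2
    n = len(signal)
    out = []
    for t in range(n):
        acc = 0.0
        norm = 0.0
        for k, w in enumerate(weights):
            idx = t + (k - half) * pitch_period
            if 0 <= idx < n:  # skip taps that fall outside the signal
                acc += w * signal[idx]
                norm += w
        out.append(acc / norm)  # renormalise so edge samples keep unit gain
    return out


def snr_db(clean, observed):
    """Signal-to-noise ratio of `observed` relative to `clean`, in dB."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((c - x) ** 2 for c, x in zip(clean, observed))
    return 10.0 * math.log10(signal_power / noise_power)


random.seed(0)
period = 80  # assumed pitch period in samples (hypothetical value)

# Synthetic "voiced" signal: a fundamental plus its second harmonic.
clean = [
    math.sin(2 * math.pi * t / period) + 0.5 * math.sin(4 * math.pi * t / period)
    for t in range(1600)
]
noisy = [c + random.gauss(0.0, 0.5) for c in clean]  # additive white noise
enhanced = comb_filter(noisy, period)

print("SNR before filtering (dB):", round(snr_db(clean, noisy), 2))
print("SNR after filtering (dB): ", round(snr_db(clean, enhanced), 2))
```

An adaptive comb filter would additionally track the pitch period and adjust the tap weights frame by frame. The fixed version above only shows why averaging across pitch periods favours the harmonic part of the signal over uncorrelated noise.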

Acknowledgments

This research was supported by the NAP (National Agenda Project) of the Korea Research Council of Fundamental Science and Technology, and by the Converging Research Center Program through the Ministry of Science, ICT and Future Planning, Korea (2013K000358).

Author information

Corresponding author

Correspondence to Gil-Jin Jang.

About this article

Cite this article

Park, JS., Jang, GJ., Kim, JH. et al. Unsupervised noise reduction scheme for voice-based information retrieval in mobile environments. Multimed Tools Appl 75, 4981–4996 (2016). https://doi.org/10.1007/s11042-013-1788-y
