Unsupervised noise reduction scheme for voice-based information retrieval in mobile environments

Park, Jeong-Sik; Jang, Gil-Jin; Kim, Ji-Hwan; Yeo, Sang-Soo

doi:10.1007/s11042-013-1788-y

Published: 05 December 2013

Volume 75, pages 4981–4996, (2016)

Multimedia Tools and Applications


Abstract

This study proposes an unsupervised noise reduction scheme that improves the performance of voice-based information retrieval tasks in mobile environments. Various types of noise can interfere with speech processing tasks, so noise reduction has become an essential technique in this field. In mobile environments in particular, noise reduction must be designed carefully to account for the speech coding system and the client-server architecture. We propose an effective noise reduction scheme that employs the adaptive comb filtering technique. We also investigate a way of using several codec parameters directly during the filtering process; specifically, we modify the conventional comb filter using line spectral pair parameters. To verify the efficiency of the proposed noise reduction approach, we conducted speech recognition experiments using the Aurora2 database. Our approach provided superior recognition performance under various noise conditions compared to conventional techniques.
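As a rough illustration of the comb filtering idea mentioned in the abstract, the sketch below applies a generic symmetric FIR comb filter that averages samples spaced one pitch period apart, which keeps components periodic in that period and attenuates others. This is not the authors' method: the function name `comb_filter`, the tap weights, and the fixed `period` argument are assumptions made for illustration. An adaptive comb filter would re-estimate the period frame by frame, and the line spectral pair modification described in the abstract is not shown.

```python
def comb_filter(x, period, taps=(0.25, 0.5, 0.25)):
    """Generic symmetric FIR comb filter (illustrative, not the paper's design).

    y[n] = sum_k taps[k] * x[n + (k - center) * period]
    Samples that would fall outside the signal are treated as zero.
    """
    center = len(taps) // 2
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, weight in enumerate(taps):
            idx = n + (k - center) * period
            if 0 <= idx < len(x):
                acc += weight * x[idx]
        y.append(acc)
    return y


# Hypothetical toy signal: a "voiced" component with a pitch period of 4 samples,
# plus an interfering square wave that flips sign every 4 samples.
clean = [1.0, 0.0, -1.0, 0.0] * 6
noise = [0.5 if (n // 4) % 2 == 0 else -0.5 for n in range(len(clean))]
noisy = [c + v for c, v in zip(clean, noise)]

enhanced = comb_filter(noisy, period=4)
```

Away from the signal edges, the three taps sum to one for any component that repeats every `period` samples, so the periodic part passes unchanged, while the interference in this toy example cancels because it changes sign from one period to the next. Real speech is only approximately periodic, so in practice the attenuation is partial and depends on the accuracy of the pitch estimate.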

Acknowledgments

This research was supported by the National Agenda Project (NAP) of the Korea Research Council of Fundamental Science and Technology, and by the Converging Research Center Program through the Ministry of Science, ICT and Future Planning, Korea (2013K000358).

Author information

Authors and Affiliations

Department of Intelligent Robot Engineering, Mokwon University, Daejeon, Republic of Korea
Jeong-Sik Park
School of Electrical and Computer Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, Republic of Korea
Gil-Jin Jang
Department of Computer Science and Engineering, Sogang University, Seoul, Republic of Korea
Ji-Hwan Kim
Division of Computer Engineering, Mokwon University, Daejeon, Republic of Korea
Sang-Soo Yeo

Corresponding author

Correspondence to Gil-Jin Jang.

About this article

Cite this article

Park, JS., Jang, GJ., Kim, JH. et al. Unsupervised noise reduction scheme for voice-based information retrieval in mobile environments. Multimed Tools Appl 75, 4981–4996 (2016). https://doi.org/10.1007/s11042-013-1788-y

Issue Date: May 2016
