Speech Enhancement and Segregation Based on Human Auditory Mechanisms

Conference paper in: Enabling Society with Information Technology

Abstract

Humans can perceive specific desired sounds without difficulty, even in noisy environments. This useful ability, which many animals share, is known as the ‘cocktail party effect’. We believe that by modeling this mechanism we can build tools for speech enhancement and segregation, and address other problems in speech recognition and analysis.
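As background for the enhancement problem the abstract describes, the sketch below shows classical magnitude spectral subtraction, one standard single-channel noise-reduction baseline. It is an illustrative example only, not the auditory-model-based method proposed in this paper; the parameter values (frame length, over-subtraction factor, spectral floor) and the assumption of noise-only leading frames are choices made for the sketch.

```python
import numpy as np

def spectral_subtraction(noisy, frame_len=512, hop=256, noise_frames=10,
                         over_sub=2.0, floor=0.05):
    """Basic magnitude spectral subtraction (illustrative baseline).

    Assumes the first `noise_frames` frames of `noisy` contain noise only,
    and uses their average magnitude spectrum as the noise estimate.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(noisy) - frame_len) // hop

    # Short-time analysis: one windowed frame per row.
    frames = np.stack([noisy[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectra = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spectra), np.angle(spectra)

    # Noise magnitude estimated from the leading (assumed speech-free) frames.
    noise_mag = mag[:noise_frames].mean(axis=0)

    # Over-subtract the noise estimate, then floor to limit musical noise.
    clean_mag = np.maximum(mag - over_sub * noise_mag, floor * mag)

    # Inverse transform with the noisy phase, then windowed overlap-add.
    clean_frames = np.fft.irfft(clean_mag * np.exp(1j * phase),
                                n=frame_len, axis=1)
    out = np.zeros(len(noisy))
    norm = np.zeros(len(noisy))
    for i in range(n_frames):
        out[i * hop:i * hop + frame_len] += clean_frames[i] * window
        norm[i * hop:i * hop + frame_len] += window ** 2
    return out / np.maximum(norm, 1e-8)

# Toy usage: a 440 Hz tone in white noise, preceded by noise-only samples.
fs = 16000
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
signal = np.concatenate([np.zeros(4096), np.sin(2 * np.pi * 440 * t)])
noisy = signal + 0.3 * rng.standard_normal(len(signal))
enhanced = spectral_subtraction(noisy)
```

A single-channel baseline like this makes no use of binaural or auditory-grouping cues; approaches built on auditory scene analysis aim to do better precisely where such stationary-noise assumptions break down.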

Copyright information

© 2002 Springer Japan

About this paper

Cite this paper

Akagi, M., Mizumachi, M., Ishimoto, Y., Unoki, M. (2002). Speech Enhancement and Segregation Based on Human Auditory Mechanisms. In: Jin, Q., Li, J., Zhang, N., Cheng, J., Yu, C., Noguchi, S. (eds) Enabling Society with Information Technology. Springer, Tokyo. https://doi.org/10.1007/978-4-431-66979-1_18

  • DOI: https://doi.org/10.1007/978-4-431-66979-1_18

  • Publisher Name: Springer, Tokyo

  • Print ISBN: 978-4-431-66981-4

  • Online ISBN: 978-4-431-66979-1
