Abstract
Humans can attend to a specific desired sound without difficulty, even in noisy environments. This useful ability, which many animals share, is known as the 'cocktail party effect'. We believe that by modeling this mechanism we can produce tools for speech enhancement and segregation, and for other problems in speech recognition and analysis.
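As a point of reference for what "speech enhancement" involves computationally, the sketch below shows classic single-channel spectral subtraction: subtract an estimated noise magnitude spectrum from the noisy spectrum and resynthesize with the noisy phase. This is a minimal illustrative sketch of the general technique, not the authors' method; the function name, the spectral `floor` parameter, and the toy signal are hypothetical choices made here.

```python
# Minimal spectral-subtraction sketch (illustrative only).
# Assumes additive, roughly stationary noise whose magnitude spectrum
# can be estimated from a speech-free segment.
import numpy as np

def spectral_subtraction(noisy, noise_mag, floor=0.01):
    """Subtract an estimated noise magnitude spectrum from one noisy frame.

    noisy:     1-D time-domain frame (already windowed)
    noise_mag: estimated noise magnitude spectrum (same FFT length)
    floor:     spectral floor to avoid negative magnitudes
    """
    spec = np.fft.rfft(noisy)
    mag = np.abs(spec)
    phase = np.angle(spec)
    # Subtract the noise estimate; clamp at a small fraction of the
    # noisy magnitude so no bin goes negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    # Resynthesize using the noisy phase (standard in spectral subtraction).
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(noisy))

# Toy usage: a sinusoid standing in for speech, plus white noise.
rng = np.random.default_rng(0)
n = 512
t = np.arange(n)
speech = np.sin(2 * np.pi * 0.05 * t)
noisy = speech + 0.3 * rng.standard_normal(n)
# Noise estimate taken from an independent noise-only segment.
noise_mag = np.abs(np.fft.rfft(0.3 * rng.standard_normal(n)))
enhanced = spectral_subtraction(noisy, noise_mag)
```

The phase is left untouched because the ear is far less sensitive to phase distortion than to magnitude distortion, which is why magnitude-only subtraction is workable despite its known musical-noise artifacts.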
© 2002 Springer Japan
Cite this paper
Akagi, M., Mizumachi, M., Ishimoto, Y., Unoki, M. (2002). Speech Enhancement and Segregation Based on Human Auditory Mechanisms. In: Jin, Q., Li, J., Zhang, N., Cheng, J., Yu, C., Noguchi, S. (eds) Enabling Society with Information Technology. Springer, Tokyo. https://doi.org/10.1007/978-4-431-66979-1_18
DOI: https://doi.org/10.1007/978-4-431-66979-1_18
Publisher Name: Springer, Tokyo
Print ISBN: 978-4-431-66981-4
Online ISBN: 978-4-431-66979-1