Abstract
This paper presents a new algorithm for general audio classification that follows the auditory scene analysis approach and is inspired by recent results in cognitive psychology and audition. The algorithm combines an information-theoretic approach with supervised pattern recognition models of environmental sounds, namely Hidden Markov Models, and with modern missing-feature techniques. It has been tested on a set of environmental sounds and shows increased classification accuracy for a selected environmental sound source in mixtures of sounds corrupted by noise or by the mixing process.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Janku, L. (2004). Artificial Perception: Auditory Decomposition of Mixtures of Environmental Sounds – Combining Information Theoretical and Supervised Pattern Recognition Approaches. In: Van Emde Boas, P., Pokorný, J., Bieliková, M., Štuller, J. (eds) SOFSEM 2004: Theory and Practice of Computer Science. SOFSEM 2004. Lecture Notes in Computer Science, vol 2932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24618-3_19
DOI: https://doi.org/10.1007/978-3-540-24618-3_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20779-5
Online ISBN: 978-3-540-24618-3
eBook Packages: Springer Book Archive