Artificial Perception: Auditory Decomposition of Mixtures of Environmental Sounds – Combining Information Theoretical and Supervised Pattern Recognition Approaches

Janku, Ladislava

doi:10.1007/978-3-540-24618-3_19

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2932)

Included in the following conference series:

International Conference on Current Trends in Theory and Practice of Computer Science

Abstract

This paper presents a new algorithm for general audio classification that follows the auditory scene analysis approach and is inspired by recent results in cognitive psychology and audition. The algorithm combines an information-theoretic approach with supervised pattern recognition models of environmental sounds, namely Hidden Markov Models, and with modern missing-feature techniques. It was tested on a set of environmental sounds, and it increased the classification accuracy for a selected environmental sound source in mixtures of sounds corrupted by noise or by the mixing process.
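The abstract names two of its ingredients, Hidden Markov Models and missing-feature techniques, without giving details. The following is only a minimal sketch of how those two ideas are commonly combined, not the paper's implementation: each sound class is a diagonal-Gaussian HMM, and feature dimensions flagged as unreliable are marginalized out of the emission likelihood. The information-theoretic stage is omitted entirely. All names (`make_hmm`, `classify`, the class labels "A" and "B") and all numeric values are hypothetical.

```python
import numpy as np

def log_emission(x, mask, means, variances):
    """Per-state log-density of one frame under diagonal Gaussians.
    Dimensions where mask is False are treated as missing and skipped
    (marginalization over the unreliable features)."""
    diff = x[None, :] - means                       # (n_states, n_features)
    per_dim = -0.5 * (np.log(2 * np.pi * variances) + diff ** 2 / variances)
    return np.where(mask[None, :], per_dim, 0.0).sum(axis=1)

def log_likelihood(frames, masks, hmm):
    """Forward algorithm in the log domain for one HMM."""
    log_pi, log_A, means, variances = hmm
    alpha = log_pi + log_emission(frames[0], masks[0], means, variances)
    for x, m in zip(frames[1:], masks[1:]):
        # Sum over previous states, then add the current emission term.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0)
        alpha = alpha + log_emission(x, m, means, variances)
    return np.logaddexp.reduce(alpha)

def classify(frames, masks, models):
    """Return the label of the model with the highest log-likelihood."""
    scores = {name: log_likelihood(frames, masks, hmm)
              for name, hmm in models.items()}
    return max(scores, key=scores.get), scores

def make_hmm(means):
    """Hypothetical toy model: uniform start and transition
    probabilities, unit variances."""
    means = np.asarray(means, dtype=float)
    n = len(means)
    log_pi = np.log(np.full(n, 1.0 / n))
    log_A = np.log(np.full((n, n), 1.0 / n))
    return log_pi, log_A, means, np.ones_like(means)

# Two toy sound classes with made-up feature means.
models = {
    "A": make_hmm([[0, 0, 0], [1, 1, 1]]),
    "B": make_hmm([[3, 3, 20], [4, 4, 20]]),
}

# Frames from class "A" whose third feature is dominated by an interferer.
frames = np.array([[0.1, -0.2, 20.0], [0.9, 1.1, 20.0]])
all_reliable = np.ones_like(frames, dtype=bool)
third_missing = np.array([[True, True, False], [True, True, False]])

label_plain, _ = classify(frames, all_reliable, models)
label_masked, _ = classify(frames, third_missing, models)
print(label_plain, label_masked)
```

In this toy setup the corrupted third feature pulls the unmasked decision toward class "B", while masking it lets the two reliable features decide in favour of class "A". How reliability masks are estimated in practice is a separate problem that this sketch does not address.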

Author information

Authors and Affiliations

Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic
Ladislava Janku

Editor information

Editors and Affiliations

Faculty of Sciences, ILLC - Department of Mathematics and Computer Science, University of Amsterdam, Plantage Muidergracht 24, 1018 TV, Amsterdam, The Netherlands
Peter Van Emde Boas
Faculty of Mathematics and Physics, Charles University, Prague
Jaroslav Pokorný
Institute of Informatics and Software Engineering, Faculty of Informatics and Information Technologies, Slovak University of Technology, Ilkovičova 3, 842 16, Bratislava
Mária Bieliková
Institute of Computer Science, Academy of Sciences of the Czech Republic, Pod Vodárenskou věží 2, 182 07, Prague 8, Czech Republic
Július Štuller

About this paper

Cite this paper

Janku, L. (2004). Artificial Perception: Auditory Decomposition of Mixtures of Environmental Sounds – Combining Information Theoretical and Supervised Pattern Recognition Approaches. In: Van Emde Boas, P., Pokorný, J., Bieliková, M., Štuller, J. (eds) SOFSEM 2004: Theory and Practice of Computer Science. SOFSEM 2004. Lecture Notes in Computer Science, vol 2932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24618-3_19

DOI: https://doi.org/10.1007/978-3-540-24618-3_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20779-5
Online ISBN: 978-3-540-24618-3
eBook Packages: Springer Book Archive
