
Artificial Perception: Auditory Decomposition of Mixtures of Environmental Sounds – Combining Information Theoretical and Supervised Pattern Recognition Approaches

  • Conference paper
SOFSEM 2004: Theory and Practice of Computer Science (SOFSEM 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2932)

Abstract

This paper presents a new algorithm for general audio classification that follows the auditory scene analysis approach and is inspired by recent results in cognitive psychology and audition. The algorithm combines an information-theoretic approach with supervised pattern recognition models of environmental sounds, namely Hidden Markov Models, and with modern missing-feature techniques. It was tested on a set of environmental sounds and is shown to increase the classification accuracy for a selected environmental sound source in mixtures of sounds corrupted by noise or by the mixing process.
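The abstract names two of its building blocks, Hidden Markov Models and missing-feature techniques, without giving details. The sketch below is not the paper's algorithm; it only illustrates one common way the two can be combined, under stated assumptions: each sound class is a small HMM with diagonal-Gaussian emissions, a per-frame reliability mask is already available, and unreliable feature dimensions are marginalised (skipped) when scoring. The class names, model parameters, and feature values are hypothetical toy data, and the information-theoretic decomposition stage is not shown.

```python
import math


def log_gauss_masked(x, mean, var, mask):
    """Log-density of a diagonal Gaussian over the reliable dimensions only.

    Dimensions with mask False are marginalised out, i.e. they add nothing.
    """
    total = 0.0
    for xi, mi, vi, ok in zip(x, mean, var, mask):
        if ok:
            total += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return total


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    top = max(values)
    if top == -math.inf:
        return -math.inf
    return top + math.log(sum(math.exp(v - top) for v in values))


def hmm_log_likelihood(frames, masks, hmm):
    """Forward algorithm in the log domain with masked emissions."""
    n = len(hmm["log_start"])
    alpha = [
        hmm["log_start"][s]
        + log_gauss_masked(frames[0], hmm["means"][s], hmm["vars"][s], masks[0])
        for s in range(n)
    ]
    for x, m in zip(frames[1:], masks[1:]):
        alpha = [
            logsumexp([alpha[p] + hmm["log_trans"][p][s] for p in range(n)])
            + log_gauss_masked(x, hmm["means"][s], hmm["vars"][s], m)
            for s in range(n)
        ]
    return logsumexp(alpha)


def classify(frames, masks, models):
    """Return the name of the model with the highest log-likelihood."""
    return max(models, key=lambda name: hmm_log_likelihood(frames, masks, models[name]))


def make_model(means):
    """Hypothetical two-state HMM with unit variances (toy parameters)."""
    return {
        "log_start": [math.log(0.5), math.log(0.5)],
        "log_trans": [
            [math.log(0.9), math.log(0.1)],
            [math.log(0.1), math.log(0.9)],
        ],
        "means": means,
        "vars": [[1.0, 1.0], [1.0, 1.0]],
    }


# Hypothetical class models over a two-dimensional feature vector.
MODELS = {
    "rain": make_model([[0.0, 0.0], [1.0, 1.0]]),
    "engine": make_model([[5.0, 5.0], [6.0, 6.0]]),
}

# Toy mixture: dimension 0 still looks like "rain", while dimension 1 is
# dominated by the interfering source and is therefore flagged unreliable.
mixed_frames = [[0.2, 5.5], [0.8, 5.8], [1.1, 6.2]]
reliable_dim0 = [[True, False]] * len(mixed_frames)

if __name__ == "__main__":
    print(classify(mixed_frames, reliable_dim0, MODELS))
```

Skipping a masked dimension is the same as integrating it out of a diagonal Gaussian, which is why the masked term simply contributes zero to the log-likelihood. Where the reliability mask comes from is left open here.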

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Janku, L. (2004). Artificial Perception: Auditory Decomposition of Mixtures of Environmental Sounds – Combining Information Theoretical and Supervised Pattern Recognition Approaches. In: Van Emde Boas, P., Pokorný, J., Bieliková, M., Štuller, J. (eds) SOFSEM 2004: Theory and Practice of Computer Science. SOFSEM 2004. Lecture Notes in Computer Science, vol 2932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24618-3_19

  • DOI: https://doi.org/10.1007/978-3-540-24618-3_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20779-5

  • Online ISBN: 978-3-540-24618-3

  • eBook Packages: Springer Book Archive
