Generalized Recognition of Sound Events: Approaches and Applications

Potamitis, Ilyas; Ganchev, Todor

doi:10.1007/978-3-540-78502-6_3


Part of the book series: Studies in Computational Intelligence (SCI, volume 120)


Summary

This chapter surveys contemporary approaches to automatic sound recognition and discusses the benefits stemming from real-world applications of this technology. We identify the common aspects and subtle differences among these diverse application areas and review state-of-the-art systems. In this context, we anticipate considerable room for knowledge transfer between the different subfields of sound classification, which seem to evolve independently while reaching different levels of maturity. Particular emphasis is given to lessons learned from the speech recognition paradigm: speech recognition and speaker recognition were among the first applications of sound classification to reach commercial products on a large scale. Special attention is paid to emerging applications, such as environmental monitoring, bioacoustic identification, and applications to music, which have already started to alter our everyday life.


References

Deng, L., O’Shaughnessy, D., Speech Processing: A Dynamic and Optimization-Oriented Approach, Marcel Dekker, New York, 2003.
Google Scholar
Garces, M., Hetzer, C., Merrifield, M., Willis, M., Aucan, J., Observations of surf infrasound in Hawai’I, In Geophysical Research Letters, pp. 2264-2267,2003.
Google Scholar
Auckland, D.W., McGrail, A.J., Smith, C.D., Varlow, B.R., Zhao, J., Zhu, D., The application of ultrasound to the inspection of insulation, In Proceedings of the IEEE 5th International Conference on Conduction and Breakdown in Solid Dielectrics, pp. 590-594, 1995.
Google Scholar
Höge, H., Draxler, C., Van den Heuvel, H., Johansen, F.T., Sanders, E., Tropf, H.S., SpeechDat multilingual speech databases for teleservices: across the finish line, In Proceedings of the Eurospeech’99, Budapest, vol. 6, pp. 2699-2702, 1999.
Google Scholar
Benyassine, A., Shlomot, E., Su, H.-Y., ITU recommendation G.729 Annex B: a silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications, In IEEE Communications Magazine, pp. 64-73, 1997.
Google Scholar
Sohn, J., Kim, N.S., Sung, W., A statistical model-based voice activity detec-tion, In IEEE Signal Processing Letters, vol. 6, pp. 1-3, 1999.
Article Google Scholar
Cho, Y.D., Kondoz, A., Analysis and improvement of a statistical model-based voice activity detector, In IEEE Signal Processing Letters, vol. 8, pp. 276-278,2001.
Article Google Scholar
Chollet, G., Automatic Speech and Speaker Recognition: Overview, Current Issues and Perspectives, In Keller, E. (Ed.), Fundamentals of Speech Synthesis and Speech Recognition. Basic Concepts, State of the Art and Future Chal-lenges. Chichester, Wiley, pp. 129-148, 1994.
Google Scholar
Zue, V., Cole, R., Ward, W., Speech Recognition, In Cole, R.A., Mariani, J., Uszkoreit, H., Zaenen, A., Zue, V. (Eds.), Survey of the State of the Art in Hu-man Language Technology, Cambridge, Cambridge University Press, pp. 4-10, 1997.
Google Scholar
Reynolds, D.A., Rose, R.C., Robust text-independent speaker identification using Gaussian mixture speaker models, In IEEE Transactions on Speech and Audio Processing, vol. 3, no. 1, pp. 72-83, January 1995.
Article Google Scholar
Furui, S., Speaker Recognition, In Cole, R. (Ed.), Survey of the State of the Art in Human Language Technology, Chapter 1.7, Oregon Health & Science U., 1996.
Google Scholar
Gish, H., Schmidt, M., Text-idependent speaker identification, In IEEE Signal Processing Magazine, vol. 11, no. 4, pp.18-32, October 1994.
Article Google Scholar
Zervas, P., Mporas, I., Fakotakis, N., Kokkinakis, G., Evaluating intonational features for emotion recognition from speech, In International Journal of Ar-tificial Intelligence Tools, 2007.
Google Scholar
Kwon, O., Chan, K., Hao, J., Lee, T., Emotion recognition by speech signals, In Proceedings of the Eurospeech’03, Geneva, pp. 125-128, 2003.
Google Scholar
Muthusamy, Y., Barnard, E., Cole, R., Reviewing automatic language recog-nition, In IEEE Signal Processing Magazine, pp. 33-41, October 1994.
Google Scholar
Hansen, J., Arslan, L., Foreign accent classification using source genera-tor based prosodic features, In Proceedings of the ICASSP’95, Detroit, MI, pp. 836-839, 1995.
Google Scholar
Hansen, J.H.L., Gavidia-Ceballos, L., Kaiser, J.F., A nonlinear based speech feature analysis method with application to vocal fold pathology assessment, In IEEE Transactions on Biomedical Engineering, vol. 45, no. 3, pp. 300-313, March 1998.
Article Google Scholar
Gavidia-Ceballos, L., Hansen, J.H.L., direct speech feature estimation using an iterative EM algorithm for vocal cancer detection, In IEEE Transactions on Biomedical Engineering, vol. 43, no. 4, pp. 373-383, April 1996.
Article Google Scholar
Tzanetakis, G., Cook, P., Musical Genre classification of audio signals, In IEEE Transactions on Speech and Audio Processing, vol. 10, no. 5, July 2002.
Article Google Scholar
Gouyon, F., Dixon, S., Pampalk, E., Widmer, G., Evaluating rhythmic descrip-tors for musical genre classification, In Proceedings of the AES 25th Interna-tional Conference, London, United Kingdom, June 17-19, 2004.
Google Scholar
FitzGerald, D., Coyle, E., Lawlor, B., Sub-band independent subspace analysis for drum transcription, In Proceedings of the DAFX’02, pp. 65-69, 2002.
Google Scholar
Klapuri, A., Davy, M., (Eds.), Signal Processing Methods for Music Transcrip-tion, Springer, Berlin Heidelberg New York, 2006.
Google Scholar
Widmer, G. (Ed.), Special Issue on Machine Learning in Music, In Machine Learning, vol. 65, no. 2-3, December 2006.
Google Scholar
Eggink, J., Brown, G.J., Instrument recognition in accompanied sonatas and concertos, In Proceedings of the ICASSP’04, Montreal, Canada, pp. 217-220, May 2004.
Google Scholar
Livshin, A.A., Rodet, X., Musical instrument identification in continuous recordings, In Proceedings of the DAFX’04, Naples, Italy, October 5-8, 2004.
Google Scholar
Peeters, G., Automatic classification of large musical instrument databases using hierarchical classifiers with inertia ratio maximization, In Proceedings of the AES 115th convention, New York, USA, October 10-13, 2003.
Google Scholar
Eggink, J., Brown, G.J., A missing feature approach to instrument identifi-cation in polyphonic music, In Proceedings of the ICASSP’03, Hong Kong, pp. 553-556, April 2003.
Google Scholar
Liu, M., Wan, C., Feature selection for automatic classification of musical in-strument sounds, In Proceedings of the 1st ACM/IEEE-CS Joint conference on Digital libraries, pp. 247-248, 2001.
Google Scholar
Essid, S., Richard, G., David, B., Efficient musical instrument recognition on solo performance music using basic features, In Proceedings of the AES 25th International Conference, London, UK, June 2004.
Google Scholar
Herrera, P., Yeterian, A., Gouyon, F., Automatic classification of drum sounds: a comparison of feature selection methods and classification techniques, In Proceedings of Second International Conference on Music and Artificial Intelligence, Edinburgh, Scotland, 2002.
Google Scholar
Eronen, A., Musical instrument recognition using ICA-based transform of fea-tures and discriminatively trained HMMs, In Proceedings of the Seventh Inter-national Symposium on Signal Processing and it’s Applications, pp. 133-136, July 2003.
Google Scholar
Eronen, A., Klapuri, A., Musical instrument recognition using cepstral coeffi- cients and temporal features, In Proceedings of the ICASSP’00, pp. 753-756, 2000.
Google Scholar
Brown, J.C., Houix, O., McAdams, S., Feature dependence in the automatic identification of musical woodwind instruments, In Journal of the Acoustical Society of America, vol. 109, no. 3, pp. 1064-1072, March 2000.
Article Google Scholar
Herrera, P., Peeters, G., Dubnov, S., Automatic classification of musical in-strument sounds, New Music Research, vol. 32, no. 1, 2003.
Google Scholar
Peeters, G., Rodet, X., Automatically selecting signal descriptors for sound classification. In Proceedings of the ICMC’02, Goteborg, Sweden, September 2002.
Google Scholar
Wold, T., Blum, D., Wheaton, J., Content-based classification, search, and retrieval of audio, In Proceedings of the IEEE Multimedia, vol.3, no.3, pp. 2736, 1996.
Google Scholar
Slaney, M., Mixtures of probability experts for audio retrieval and indexing, In Proceedings of the IEEE International Conference on Multimedia and Expo, Lausanne, Switzerland, vol. 1, pp. 345-348, August 2002.
Google Scholar
Berenzweig, A., Ellis, D.P.W., Lawrence, S., Anchor space for classification and similarity measurement of music, In Proceedings of the IEEE International Conference on Multimedia and Expo, vol. 1, pp. 29-32, 2003.
Google Scholar
Drosopoulos, S., Claridge M., Insect sounds and communication: physiology, behaviour, ecology, and evolution, Contemporary Topics in Entomology, CRC Press, 2005.
Google Scholar
Helweg, D.A., Automatic detection and species identification of blue and fin whale calls, In Bioacoustics, vol. 13, p. 96, 2002.
Google Scholar
Hennig, R.M., Acoustic feature extraction by cross-correlation in crickets, In Journal of Comparative Physiology. A, Neuroethology, Sensory, Neural, and Behavioral Physiology, vol. 189, pp. 589-598, 2003.
Article Google Scholar
Oba, T., Application of automated bioacoustic identification in environmental education and assessment, In Anais da Academia Brasileira de Cincias, vol. 76, pp. 446-451, 2004.
Google Scholar
Potamitis, I., Ganchev, T., Fakotakis, N., Automatic acoustic identifica- tion of insects inspired by the speaker recognition paradigm, In Proceedings of the Interspeech-ICSLP’06, Pittsburg PA, USA, paper 1505-Wed3CaP.13, September 17-21, 2006.
Google Scholar
Skowronski, M., Harris, J., Acoustic detection and classification of microchi-roptera using machine learning: Lessons learned from automatic speech recog-nition, In Journal of the Acoustical Society of America, vol. 119, pp. 1817-1833, 2006.
Article Google Scholar
Alexander, R., Sound production and associated behavior in insects, In The Ohio Journal of Science, vol. 57, no. 2, pp. 101-113, 1957.
Google Scholar
Bennett-Clark, H., Resonators in insect sound production: how insects produce loud pure-tone songs, In Journal of Experimental Biology, vol. 202, pp. 3347-3357,1999. 3 Generalized Recognition of Sound Events: Approaches and Applications73
Google Scholar
Martin, K., Sound-source recognition: a theory and computational model, Ph.D. Thesis, MIT, Media Lab, 1999.
Google Scholar
Ashiya, T., Hagiwara, M., Nakagawa, M., IOSES: An indoor observation sys-tem based on environmental sounds recognition using a neural network, In Transactions of the Institute of Electrical Engineers of Japan, vol. 116-C, no. 3, pp. 341-349, 1996.
Google Scholar
Cowling, M., Sitte, R., Comparison of techniques for environmental sound recognition, In Pattern Recognition Letters, vo1. 24, no. 15, pp. 2895-2907, 2003.
Article Google Scholar
Goldhor, R.S., Recognition of environmental sounds, In Proceedings of the ICASSP93, vol. 1, pp. 149-152, 1993.
Google Scholar
Arrigoni, J.E., An evaluation of amphibian monitoring approaches in the maya forest, Chapter 3: An assessment of the vocalization survey method for mon-itoring anuran populations in the Maya Forest, Master thesis, pp. 21-42, February, 2003.
Google Scholar
Lee, C.-H., Chou, C.-H., Han, C.-C., Huang, R.-Z., Automatic recognition of animal vocalizations using averaged MFCC and linear discriminant analysis, In Pattern Recognition Letters, vol. 27, pp. 93-101, 2006.
Article Google Scholar
Mitrovic, D., Zeppelzauer, M., Discrimination and Retrieval of Animal Sounds, In Proceedings of the IEEE Multimedia Modelling Conference, Beijing, China, pp. 339-343, 2006.
Google Scholar
Gaston, K., O’Neill, M.A., Automated species identification - why not? In Philosophical Transactions-Royal Society of London. Biological Sciences, vol. 359, no. 1444, pp. 655-667, 2004.
Article Google Scholar
Watson, A.T., O’Neill, M.A., Kitching, I.J., A qualitative study investigating automated identification of living macrolepidoptera using the Digital Auto-mated Identification SYstem (DAISY), In Systematics and Biodiversity, vol. 1, no. 1, 2003.
Google Scholar
Chesmore, E., Application of time domain signal coding and artificial neural networks to passive acoustical identification of animals, In Applied Acoustics, vol. 62, pp. 1359-1374, 2001.
Article Google Scholar
Dietrich, C., Temporal sensor fusion for the classification of bioacoustic time se-ries, PhD thesis, University of Ulm, Department of Neural Information Process-ing, 2004.
Google Scholar
Guo, Y.B., Ammula, S.C., Real-time acoustic emission monitoring for surface damage in hard machining, In International Journal of Machine Tools and Manufacture, vol. 45, pp. 1622-1627, 2005.
Google Scholar
SrinivasaPai, P., Ramakrishna Rao, P.K., Acoustic emission analysis for tool wear monitoring in face milling, In International Journal Production Research, vol. 40, no. 5, pp. 1081-1093, 2002.
Article Google Scholar
Dornfeld, D.A., Manufacturing process monitoring and analysis using acoustic emission, In Journal Acoustic Emission, vol. 4, no. 2-3, pp. 123-126, 1985.
Google Scholar
Dimla, D.E., Jr., Lister, P.M., Leighton, N.J., Neural network solutions to the tool condition monitoring problem in metal cutting. A critical review of methods, In International Journal of Machine Tools Manufacturing, vol. 37, no. 9, pp. 1219-1240, 1997.
Article Google Scholar
Diniz, A.E., Liu, J.J., Dornfeld, D.A., Correlating tool life, tool wear and sur-face roughness by monitoring acoustic emission in turning, In Wear, vol. 152, pp. 395-407, 1992.
Article Google Scholar
Diei, E.N., Dornfeld, D.A., Acoustic emission sensing of tool wear in face milling, In Transactions of ASME, Journal of Engineering for Industry, vol. 109, pp. 234-240, 1987.
Article Google Scholar
Kannatey-Asibu, E., Jr., Dornfeld, D.A., Quantitative relationships for acoustic emission from orthogonal metal cutting, In Transactions of ASME, Journal of Engineering for Industry, vol. 103, pp. 330-339, 1981.
Article Google Scholar
Carolan, T.A., Kidd, S.R., Hand, D.P., Wilcox, S.J., Wilkinson, P., Barton, J.S., Jones, J.D.C., Reuben, R.L., Acoustic emission monitoring of tool wear during the face milling of steels and aluminium alloys using a fiber optic sensor energy analysis, In Proceedings of the Institution of Mechanical Engineers, 211(B), pp. 299-309, 1997.
Article Google Scholar
Iwata, K., Moriwaki, T., An application of acoustic emission measurements to in process sensing of tool wear, In Annals of the CIRP, vol. 25, no. 1, pp. 21-26, 1977.
Google Scholar
Sampath, A., Vajpayee, S., Tool health monitoring using acoustic emission, In International Journal of Production Research, vol. 25, no. 5, pp. 703-719, 1987.
Article Google Scholar
Lister, P.M., Barrow, G., Tool condition monitoring systems, In Proceedings of the 26th International Machine Tool Design and Research Conference, pp. 271-288,1986.
Google Scholar
Inasaki, I., Application of acoustic emission sensor for monitoring machining processes, In Ultrasonics, vol. 36, pp. 273-281, 1998.
Article Google Scholar
Eronen, A.J., Peltonen, V.T., Tuomi, J.T., Klapuri, A.P., Fagerlund, S., Sorsa, T., Lorho, G., Huopaniemi, J., Audio-Based Context Recognition, In IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 1, January 2006.
Google Scholar
Gellersen, H.-W., Schmidt, A., Beigl, M., Adding some smartness to devices and everyday things, In Proceedings of the Third IEEE Workshop on Mobile Computing Systems and Applications, pp. 3-10, 2005.
Google Scholar
Vemuri, S., Schmandt, C., Bender, W., Tellex, S., Lassey, B., An audio-based personal memory aid, In Proceedings of the 6th International Conference Ubiq-uitous Computing, Ubicomp’04, pp. 400-417, 2004.
Google Scholar
Chu, S., Narayanan, S., Jay Kuo, C.-C., Content analysis for acoustic en-vironment classification in mobile robots, In Proceedings of AAAI 2006 Fall Symposium, Aurally Informed Performance: Integrating Machine Listening and Auditory Presentation in Robotic Systems, Arlington, VA, October 2006.
Google Scholar
Clarkson, B., Sawhney, N., Pentland, A., Auditory context awareness via wear-able computing, In Proceedings of the Workshop on Perceptual User Interfaces, November 1998.
Google Scholar
Képesi, M., Weruaga, L., Adaptive chirp-based time-frequency analysis of speech signals, In Speech Communication, vol. 48, no. 5, pp. 474-492, 2006.
Article Google Scholar
Gopalan, K., Speech modification by selective fourier-bessel series expansion of speech signals, In IEEE Pacific Rim Conference on Communications, Com-puters and Signal Processing, pp. 388-392, 1999.
Google Scholar
Irino, T., Patterson, R.D., Stabilised wavelet Mellin transform: An auditory strategy for normalising sound-source size, In Proceedings of the Eurospeech ’99, Budapest, pp. 1899-1902, Hungary, 1999.
Google Scholar
Wolfe, P.J., Godsill, S.J., Ng, W.-J., Bayesian variable selection and regularisa-tion for time-frequency surface estimation, In Journal of The Royal Statistical Society Series B, Royal Statistical Society, vol. 66, no. 3, pp. 575-589, 2004.
Article MATH MathSciNet Google Scholar
Hong, L., Rosca, J., Balan, R., Bayesian single channel speech enhancement exploiting sparseness in the ICA domain, In Proceedings of the EUSIPCO 2004, Vienna, Austria, September 2004.
Google Scholar
Mossing, J.C., Tuthill, T.A., Reduced interference distributions for the detec-tion andclassification of outside sound source acoustic emissions, In Proceedings of the ICASSP’96, vol. 5, pp. 2758-2761, 1996.
Google Scholar
Tzanetakis, G., Essl, G., Cook, P.R., Audio analysis using the discrete wavelet transform, In Proceedings of WSES International Conference, Acoustics and Music: Theory and Applications (AMTA), Skiathos, Greece, 2001.
Google Scholar
Purat, M., Noll, P., Audio coding with a dynamic wavelet packet decomposition based on frequency-varying modulated lapped transforms, In Proceedings of the ICASSP’96, vol. 2, pp. 1021-1024, 1996.
Google Scholar
Davis, S.B., Mermelstein, P., Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences, In IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.28, no.4, pp. 357-366, 1980.
Article Google Scholar
Kim, H., Moreau, N., Sikora, T., Audio classification based on MPEG-7 spec-tral basis representations, In IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, pp. 716-725, 2004.
Article Google Scholar
Allamanche, E., Herre, J., Hellmuth, O., Fröba, B., Kastner, T., Cremer, M., Content-based identification of audio material using MPEG-7 low-level de-scription. In Proceedings of the International Conference on Music Information Retrieval, 2001.
Google Scholar
Quackenbush, S., Lindsay, A., Overview of MPEG-7 audio, In IEEE Transac-tions on Circuits Systems for Video Technology, vol. 11, pp. 725-729, 2001.
Article Google Scholar
Peeters, G., McAdams, S., Herrera, P. Instrument sound description in the context of MPEG-7, In Proceedings of the International Conference on Music and Computers (ICMC), Berlin, Germany, 2000.
Google Scholar
Kim, H., Sikora, T., Comparison of MPEG-7 audio spectrum projection fea-tures and MFCC applied to speaker recognition, sound classification and audio segmentation, In Proceedings of the ICASSP’04, vol. 5, pp. 925-928, 2004.
Google Scholar
Casey, M., MPEG-7 sound recognition tools, In IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, pp. 737-747, 2001.
Article Google Scholar
Xiong, Z., Radhakrishnan, R., Divakaran, A., Huang, S.T., Comparing MFCC and MPEG-7 audio features for feature extraction, Maximum Likelihood HMM and Entropic Prior HMM for sports audio classification, In Proceedings of the International Conference on Multimedia and Expo, vol. 3, pp. 397-400, 2003.
Google Scholar
Haeb-Umbach, R., Ney, H., Linear discrimination analysis for improved large vocabulary continuous speech recognition, In Proceedings of the ICASSP’92, pp. 113-116, 1992.
Google Scholar
Tokuhira, M., Ariki, Y., Effectiveness of KL-transformation in spectral delta expansion, In Proceedings of the Eurospeech’99, vol. 1, pp. 359-362, 1999.
Google Scholar
Saul, L.K., Rahim, M.G., Maximum likelihood and minimum classification error factor analysis for automatic speech recognition, In IEEE Transactions on Speech and Audio Processing, vol. 8, no. 2, pp. 115-125, March 2000.
Article Google Scholar
Casey, M.A., Reduced-rank spectra and minimum-entropy priors as consis-tent and reliable cues for generalized sound recognition, In Proceedings of the Workshop on Consistent and Reliable Acoustic Cues for Sound Analysis, Eurospeech’01, Aalborg, Denmark, 2001.
Google Scholar
Lee, T.-W., Jang, G.-J., The statistical structure of male and female speech signals, In Proceedings of the ICASSP’01, vol. 1, pp. 105-108, May 2001.
Google Scholar
Eisele, T., Haeb-Umbach, R., Langmann, D., A comparative study of linear feature transformation techniques for automatic speech recognition, In Pro-ceedings of the ICSLP’96, pp. 252-255, 1996.
Google Scholar
Battle, E., Nadeu, C., Fonollosa, J., Feature decorrelation methods in speech recognition. A comparative study, In Proceedings of the ICSLP’98, pp. 951-954,1998.
Google Scholar
Bayes, T., An essay towards solving a problem in the doctrine of chances, In Philosophical Transactions of the Royal Society of London, vol. 53, pp. 370-418,1763.
Article Google Scholar
Fisher, R.A., The use of multiple measurements in taxonomic problems, In Annals of Eugenics, vol. 7, pp. 179-188, 1936.
Google Scholar
Specht, D.F., Generation of polynomial discriminant functions for pattern recognition, In IEEE Transactions on Electronic Computers, vol. 16, pp. 308-319,1967.
Article MATH Google Scholar
Lang, K.J., Hinton, G.E., A time delay neural network architecture for speech recognition, Technical Report CMU-cs-88-152, Carnegie Mellon University, Pittsburgh PA, 1988.
Google Scholar
Jordan, M.I., Serial order: A parallel distributed processing approach, Institute for Cognitive Science, Report 8604, University of California, San Diego, 1986.
Google Scholar
Elman, J.L., Finding structure in time, In Cognitive Science, vol. 14, pp. 179-211,1990.
Article Google Scholar
Rosenblatt, F., The perceptron: a probabilistic model for information stor- age and organization in the brain, In Psychological Review, vol. 65, pp. 386-408,1958.
Article MathSciNet Google Scholar
Vapnik, V.N., The Nature of Statistical Learning Theory, Springer, 1995.
Google Scholar
Specht, D.F., Probabilistic neural networks for classification, mapping, or as- sociative memory, In Proceedings of the IEEE Conference on Neural Networks, San Diego, vol. 1, pp. 525-532, July 1988.
Article Google Scholar
Hansen, L.P., Large sample properties of generalized method of moments esti- mation, In Econometrica, vol. 50, pp. 1029-1054, 1982.
Article MATH Google Scholar
Baum, L.E., Petrie, T., Statistical inference for probabilistic functions of finite state markov chains, In Annals of Mathematical Statistics, vol. 37, pp. 1554-1563,1966.
Article MATH MathSciNet Google Scholar
Cover, T., Hart, P., Nearest neighbour pattern classification, In IEEE Trans- actions on Information Theory, vol. 13, pp. 21-27, 1967.
Article MATH Google Scholar
Kohonen, T., Learning vector quantization for pattern recognition, Technical Report TKK-F-A601, Helsinki University of Technology, 1986.
Google Scholar
Powell, M.J.D., Radial basis Functions for Multivariable Interpolation: A Re- view, In Mason, J., Cox, M. (Eds.), Algorithms for Approximation, Oxford, Clarendon Press, pp. 143-167, 1987.
Google Scholar
Bengio, S., Mariethoz. J., Learning the decision function for speaker verifica- tion, Technical Report, IDIAP Research Report 00-40, IDIAP, January 2001.
Google Scholar
Bourlard, H.A., Morgan, N., Connectionist speech recognition: A hybrid ap-proach, Kluwer, 1994.
Google Scholar
Neto, J., Almeida, L., Hochberg, M., Martins, C., Nunes, L., Renals, S., Robinson, T., Speaker adaptation for hybrid HMM/ANN continuous speech recognition system, In Proceedings of the Eurospeech’95, pp. 2171-2174, 1995.
Google Scholar
Bengio, Y., Frasconi, P., Input-output HMM’s for sequence processing, In IEEE Transactions on Neural Networks, vol. 7, no. 5, pp. 1231-1249, 1996.
Article Google Scholar
Setlur, A.R., Sukkar R.A., Jacob J., Correcting recognition errors via discrim-inative utterance verification, In Proceedings of ICSLP’96, Philadelphia, USA, vol. 2, pp. 602-605, 1996.
Google Scholar
Ganchev, T., Tasoulis, D.K., Vrahatis, M.N., Fakotakis, N., Locally recur- rent probabilistic neural network for text-independent speaker verification, In Proceedings of the Eurospeech’03, Geneva, Switzerland, vol. 3, pp. 1673-1676, September 1-4, 2003.
Google Scholar
Ganchev, T., Tasoulis, D.K., Vrahatis, M.N., Fakotakis, N., Generalized lo- cally recurrent probabilistic neural networks for text-independent speaker verification, In Proceedings of the ICASSP’04, Montreal, Quebec, Canada, vol. 1, pp. 41-44, May 17-21, 2004.
Google Scholar
Ganchev, T., Tasoulis, D.K., Vrahatis, M.N., Fakotakis N., Generalized locally recurrent probabilistic neural networks with application to text-independent speaker verification, In Neurocomputing, vol. 70, no. 7-9, pp. 1424-1438, 2007.
Article Google Scholar
Liu, M., Wan, C., A study on content-based classification and retrieval of audio database, In International Database Engineering and Applications Symposium (IDEAS ’01), ISSN:1098-8068, p. 339, 2001.
Google Scholar
Guo, X., Yan, Y., Xiao, Y.S., Xiao, S.-C., Heart sound recognition algorithm based on pnn for evaluating cardiac contractility change trend, In Journal of Biomedical Engineering, vol. 23, no. 5, 2006.
Google Scholar
Barry, S.J., Dane1, A.D., Morice, A.H., Walmsley, A.D., The automatic recog- nition and counting of cough, In Cough, vol. 2, no. 8, 2006.
Google Scholar
Chordia, P., Segmentation and recognition of tabla strokes, In Proceedings of the 6th International Conference on Music Information Retrieval, London, UK, 11-15 September, 2005.
Google Scholar
Bolat, B., Kucuk, U., Musical sound recognition by active learning PNN, In Lecture Notes in Computer Science, vol. 4105/2006, Multimedia Content Representation, Classification and Security, ISSN:0302-9743, Springer, Berlin Heidelberg New York, 2006.
Google Scholar
Kraft, F., Malkin, R., Schaaf, T., Waibel, A., Temporal ICA for classification of acoustic events in a kitchen environment, In Proceedings of the Interspeech’05, Lisbon, Portugal, 2005.
Google Scholar
Ravindran, S., Anderson, D.V., Audio classification and scene recognition for hearing aids, In IEEE International Symposium on Circuits and Systems, ISCAS’05, vol. 2, pp. 860-863, 2005.
Article Google Scholar
Temko, A., Nadeu, C., Classification of acoustic events using SVM-based clustering schemes, In Pattern Recognition, ISSN:0031-3203, vol. 39, no. 4, pp. 682-694, April 2006.
Article MATH Google Scholar
Dufaux, A., Besacier, L., Ansorge, M., Pellandini, F., Automatic sound detec- tion and recognition for noisy environment, In Proceedings of the EUSIPCO 2000, Tampere, Finland, 2000.
Google Scholar
Yella, S., Gupta, N.K., Dougherty, M., Pattern recognition approach for the automatic classification of data from impact acoustics, In Proceedings of the AISC’2006, Palma De Mallorca, Spain, pp. 144-149, August 28-30, 2006.
Google Scholar
Chu, S., Narayanan, S., Jay Kuo, C.-C., Matarić, M.J., Where am I? Scene recognition for mobile robots using audio features, In Proceedings of the ICME’06, pp. 885-888, 2006.
Google Scholar
Essid, S., Classification of audio signals: machine recognition of musical instru- ments, Seminars, CNRS-LTCI, 2006.
Google Scholar
Casey, M., General sound classification and similarity in MPEG-7, In Organised Sound, vol. 6, no. 2, pp. 153-164, 2001.
Article MathSciNet Google Scholar
Casey, M., MPEG-7 sound recognition tools, In IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, no. 6, pp. 737-747, 2001.
Article Google Scholar
Peltonen, V., Tuomi, J., Klapuri, A., Huopaniemi, J., Sorsa, T., Computational auditory scene recognition, In Proceedings of the ICASSP’02, vol. 2, pp. 1941-1944,2002.
Google Scholar
Sitte, R., Willets, L., Non-speech environmental sound identification for surveil- lance using self-organizing-maps, In Proceeding of the SPPRA 2007, Innsbruck, Austria, February 14-16, 2007.
Google Scholar
Harlow, C., Wang, Y., Acoustic accident detection system, In Journal In- telligent Transportation Systems, Taylor & Francis, ISSN:1024-8072, vol. 7, pp. 43-56, January 2002.
MATH Google Scholar
Yella, S., Gupta, N.K., Dougherty, M., Condition monitoring using pattern recognition techniques on data from acoustic emissions, In Proceedings of the ICMLA’06, pp. 3-9, 2006.
Google Scholar
Toyoda, Y., Huang, J., Ding, S., Liu, Y., Environmental sound recognition by the instantaneous spectrum combined with the time pattern of power, In Proceedings of the 2nd IASTED International Conference on Neural Networks and Computational Intelligence, NCI 2004, pp. 169-172, 2004.
Google Scholar
Coath, M., Denham, S.L., Robust sound classification through the representa-tion of similarity using response fields derived from stimuli during early expe-rience, In Biological Cybernetics, vol. 93, no. 1, pp. 22-30, July, 2005.
Article Google Scholar
Li, Y., Dorai, C., SVM-based audio classification for instructional video analy-sis, In Proceedings of the ICASSP’04, Montreal, Canada, vol. 5, pp. 897-900,2004.
Google Scholar
Lin, C.-C., Chen, S.-H., Truong, T.-K., Chang, Y., Audio classification and categorization based on wavelets and support vector machine, In IEEE Trans-actions on Speech and Audio Processing, vol. 13, no. 5, September 2005.
Google Scholar
Chen, L., Gunduz, S., Ozsu, M.T., Mixed type audio classification with support vector machine, In IEEE International Conference on Multimedia and Expo, ICME’06, pp. 781-784, July 2006.
Google Scholar
McLachlan, G.J., Krishnan, T., The EM algorithm and extensions, Wiley Se- ries in Probability and Statistics, New York, Wiley, 1997.
Google Scholar
Hartigan, J.A., Wong, M.A., A k-means clustering algorithm, In Applied Sta- tistics, vol. 28, no. 1, pp. 100-108, 1979.
Article MATH Google Scholar
Meisel, W., Computer-Oriented Approaches To Pattern Recognition, Academic Press, New York, 1972.
MATH Google Scholar
Cain, B.J., Improved probabilistic neural network and its performance relative to the other models, In Proceedings of the SPIE, Applications of Artificial Neural Networks, vol. 1294, pp. 354-365, 1990.
Google Scholar
Musavi, M., Kalantri, K., Ahmed, W., Improving the performance of proba- bilistic neural networks, In Proceedings of IEEE International Joint Conference on Neural Networks, Baltimore, MD, USA, vol. 1, pp. 595-600, June 7-11, 1992.
Google Scholar
Abe, S., Support Vector Machines for Pattern Classification, Springer, Berlin Heidelberg New York, London, 2005.
Google Scholar
Hansen, L.K., Salamon, P., Neural Network Ensembles, In IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 10, pp. 993-1001, October 1990.
Ho, T.K., Hull, J.J., Srihari, S.N., Decision combination in multiple classifier systems, In IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, no. 1, pp. 66-75, January 1994.
Breiman, L., Bagging predictors, In Machine Learning, vol. 24, pp. 123-140, 1996.
Dietterich, T., An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization, In Machine Learning, pp. 1-22, 1998.
Kittler, J., Hatef, M., Duin, R., Matas, J., On combining classifiers, In IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 3, pp. 226-239, 1998.
Alkoot, F.M., Kittler, J., Experimental evaluation of expert fusion strategies, In Pattern Recognition Letters, vol. 20, no. 11, pp. 11-13, 1999.
Kittler, J., Alkoot, F.M., Sum versus vote fusion in multiple classifier systems, In IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 1, pp. 110-115, 2003.
Xu, L., Krzyzak, A., Suen, C.Y., Methods of combining multiple classifiers and their applications to handwriting recognition, In IEEE Transactions on Systems, Man, and Cybernetics, vol. 22, no. 3, pp. 418-435, 1992.
Jordan, M.I., Jacobs, R.A., Hierarchical mixtures of experts and the EM algorithm, In Neural Computation, no. 6, pp. 181-214, 1994.
Hinton, G.E., Sallans, B., Ghahramani, Z., A Hierarchical Community of Experts, In Jordan, M.I. (Ed.), Learning in Graphical Models, Kluwer, pp. 479-494, 1998.
Dietterich, T., Ensemble Methods in Machine Learning, In Kittler, J., Roli, F. (Eds.), Multiple Classifier Systems, pp. 1-15, 2000.
Ganchev, T., Tsopanoglou, A., Fakotakis, N., Kokkinakis, G., Probabilistic neural networks combined with GMMs for speaker recognition over telephone channels, In Proceedings of the DSP2002, Santorini, Greece, vol. 2, pp. 1081-1084, July 1-3, 2002.
Potamitis, I., Ganchev, T., Fakotakis, N., Automatic acoustic identification of crickets and cicadas, In Proceedings of the ISSPA’07, February 12-15, 2007.
Bishop, C., Pattern Recognition and Machine Learning, Springer, Berlin Heidelberg New York, 2006.

Author information

Authors and Affiliations

Department of Music Technology and Acoustics, Technological Educational Institute of Crete, Greece
Ilyas Potamitis
Department of Electrical and Computer Engineering, University of Patras, Greece
Todor Ganchev

Editor information

Editors and Affiliations

Department of Informatics, University of Piraeus, Karaoli-Dimitriou Str. 80, 185 34, Piraeus, Greece
George A. Tsihrintzis
School of Electrical & Information Engineering, University of South Australia KES Centre, Mawson Lakes Campus, Adelaide, SA, 5095, Australia
Lakhmi C. Jain

About this chapter

Cite this chapter

Potamitis, I., Ganchev, T. (2008). Generalized Recognition of Sound Events: Approaches and Applications. In: Tsihrintzis, G.A., Jain, L.C. (eds) Multimedia Services in Intelligent Environments. Studies in Computational Intelligence, vol 120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78502-6_3

DOI: https://doi.org/10.1007/978-3-540-78502-6_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78491-3
Online ISBN: 978-3-540-78502-6
eBook Packages: Engineering, Engineering (R0)