Environmental Sounds Classification Based on Visual Features

  • Sameh Souli
  • Zied Lachiri
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7042)


This paper presents a method for classifying environmental sounds in the visual domain by exploiting scale and translation invariance. We present a new approach that extracts visual features from sound spectrograms, and we apply support vector machines (SVMs) to the classification task. The proposed method treats sound spectrograms as texture images and extracts their time-frequency structures using a translation-invariant wavelet transform and a patch transform, alternated with local-maximum and global-maximum operations to pursue scale and translation invariance. We illustrate the performance of this method on an audio database composed of 10 sound classes. The recognition rate obtained with the One-Against-One multiclass decomposition method reaches 91.82 %.
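
The One-Against-One decomposition mentioned in the abstract trains one binary classifier per pair of classes and lets them vote, which gives 45 classifiers for 10 classes. The following is a minimal sketch of that voting scheme only, not the authors' implementation: a toy nearest-centroid rule stands in for the binary SVM, and the 2-D feature vectors and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def train_centroid_pair(samples, class_a, class_b):
    """Train a toy binary classifier for one pair of classes.

    Stand-in for a binary SVM (assumption, not from the paper): it stores
    the mean feature vector of each class and predicts the nearer one.
    """
    def mean(vectors):
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    centroid_a = mean(samples[class_a])
    centroid_b = mean(samples[class_b])

    def predict(x):
        dist_a = sum((xi - ci) ** 2 for xi, ci in zip(x, centroid_a))
        dist_b = sum((xi - ci) ** 2 for xi, ci in zip(x, centroid_b))
        return class_a if dist_a <= dist_b else class_b

    return predict


def train_one_against_one(samples):
    """Build one binary classifier per unordered pair of classes."""
    return [train_centroid_pair(samples, a, b)
            for a, b in combinations(sorted(samples), 2)]


def predict_one_against_one(classifiers, x):
    """Each pairwise classifier casts one vote; the most-voted class wins."""
    votes = Counter(classifier(x) for classifier in classifiers)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 10 classes, two 2-D feature vectors per class,
# clustered around the point (class_index, class_index).
toy_samples = {
    k: [[k - 0.1, k + 0.1], [k + 0.1, k - 0.1]] for k in range(10)
}
classifiers = train_one_against_one(toy_samples)
print(len(classifiers))  # 10 * 9 / 2 = 45 pairwise classifiers
print(predict_one_against_one(classifiers, [3.2, 2.9]))  # -> 3
```

In a real system each pairwise classifier would be a binary SVM trained on the visual features extracted from the spectrograms; only the pairing and voting logic carries over from this sketch.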


Environmental sounds, Visual features, Translation-invariant wavelet transform, Spectrogram, SVM, Multiclass



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Sameh Souli (1)
  • Zied Lachiri (1, 2)
  1. Signal, Image and Pattern Recognition Research Unit, Dept. of Genie Electrique, ENIT, Le Belvédère, Tunisia
  2. Dept. of Physique and Instrumentation, INSAT, Centre Urbain, Tunisia
