Content-Based Audio Classification and Retrieval Using Segmentation, Feature Extraction and Neural Network Approach

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 924)

Abstract

The volume of audio data on public networks such as the Internet grows substantially every day, which makes that data increasingly difficult to access. Hence, there is a need for efficient indexing and annotation mechanisms. The non-stationarity and discontinuity present in audio signals make segmentation and classification more difficult. Another challenge is extracting and selecting the optimal features from an audio signal. Application areas of audio classification and retrieval systems include speaker recognition, gender classification, music genre classification and environmental sound classification. This paper proposes a machine learning- and neural network-based approach that performs audio pre-processing, segmentation, feature extraction, classification and retrieval of audio signals from a dataset. We propose a novel approach to classification and retrieval using FPNN, which combines the characteristics of fuzzy logic and PNN. We found that the FPNN classifier gives better accuracy, F1-score and Kappa coefficient values than the SVM, k-NN and PNN classifiers.
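
The three evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes macro-averaged F1 and Cohen's kappa, which is one common reading of "F1-score" and "Kappa coefficient"; the abstract does not say which variants the authors used. The function name `evaluate` and the labels `"speech"` and `"music"` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Return (accuracy, macro_f1, cohen_kappa) for two equal-length label lists.

    Assumption: macro-averaged F1 and Cohen's kappa; the paper's exact
    definitions are not given in the abstract.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    n = len(y_true)
    labels = sorted(set(y_true) | set(y_pred))
    pairs = Counter(zip(y_true, y_pred))   # (true, predicted) -> count
    true_counts = Counter(y_true)          # per-class support
    pred_counts = Counter(y_pred)          # per-class number of predictions

    # Accuracy: fraction of samples whose predicted label matches the true one.
    accuracy = sum(pairs[(c, c)] for c in labels) / n

    # Per-class F1 = 2*TP / (2*TP + FP + FN) = 2*TP / (support + predictions).
    f1_scores = [
        2 * pairs[(c, c)] / (true_counts[c] + pred_counts[c]) for c in labels
    ]
    macro_f1 = sum(f1_scores) / len(labels)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = sum(true_counts[c] * pred_counts[c] for c in labels) / (n * n)
    if expected == 1.0:
        kappa = float("nan")  # undefined when only one class ever occurs
    else:
        kappa = (accuracy - expected) / (1.0 - expected)

    return accuracy, macro_f1, kappa


if __name__ == "__main__":
    # Hypothetical labels for four audio clips.
    truth = ["speech", "speech", "music", "music"]
    guess = ["speech", "music", "music", "music"]
    print(evaluate(truth, guess))
```

Kappa is a useful complement to accuracy when classes are imbalanced, because it discounts the agreement a classifier would reach by chance.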

Author information

Corresponding author

Correspondence to Nilesh M. Patil.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Patil, N.M., Nemade, M.U. (2019). Content-Based Audio Classification and Retrieval Using Segmentation, Feature Extraction and Neural Network Approach. In: Bhatia, S., Tiwari, S., Mishra, K., Trivedi, M. (eds) Advances in Computer Communication and Computational Sciences. Advances in Intelligent Systems and Computing, vol 924. Springer, Singapore. https://doi.org/10.1007/978-981-13-6861-5_23
