Content-Based Audio Classification and Retrieval Using Segmentation, Feature Extraction and Neural Network Approach

Patil, Nilesh M.; Nemade, Milind U.

doi:10.1007/978-981-13-6861-5_23

Conference paper
First Online: 22 May 2019

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 924)

Abstract

The volume of audio data on public networks such as the Internet grows daily, which makes that data increasingly difficult to access. Hence, there is a need for efficient indexing and annotation mechanisms. The non-stationarity and discontinuity present in audio signals increase the difficulty of segmenting and classifying them. Another challenge is extracting and selecting the optimal features from an audio signal. Application areas of audio classification and retrieval systems include speaker recognition, gender classification, music genre classification and environmental sound classification. This paper proposes a machine learning- and neural network-based approach that performs audio pre-processing, segmentation, feature extraction, classification and retrieval of audio signals from a dataset. We propose a novel approach to classification and retrieval using an FPNN, which combines the characteristics of fuzzy logic and a PNN. We found that the FPNN classifier gives better accuracy, F1-score and Kappa coefficient values than the SVM, k-NN and PNN classifiers.
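The three evaluation measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the class labels and the toy predictions are hypothetical, the F1-score is macro-averaged over classes as an assumption (the abstract does not state the averaging scheme), and the Kappa coefficient is taken to be Cohen's kappa.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of clips whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cohen_kappa(y_true, y_pred):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal label frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return 1.0  # degenerate case: a single label on both sides
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical labels for six audio clips.
y_true = ["speech", "speech", "music", "music", "noise", "noise"]
y_pred = ["speech", "music", "music", "music", "noise", "speech"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
print(f"kappa    = {cohen_kappa(y_true, y_pred):.3f}")
```

Reporting kappa alongside accuracy is useful when classes are imbalanced, because kappa discounts the agreement a classifier would reach by chance from the label frequencies alone.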

Author information

Authors and Affiliations

Computer Engineering, Pacific Academy of Higher Education and Research University, Udaipur, India
Nilesh M. Patil
Fr. CRCE, Mumbai, India
Nilesh M. Patil
Electronics Engineering Department, K. J. Somaiya Institute of Engineering and Information Technology, Sion, Mumbai, India
Milind U. Nemade

Corresponding author

Correspondence to Nilesh M. Patil.

Editor information

Editors and Affiliations

Department of Mathematics and Computer Science, University of Missouri–St. Louis, St. Louis, MO, USA
Sanjiv K. Bhatia
CSED, ABES Engineering College, Ghaziabad, Uttar Pradesh, India
Shailesh Tiwari
Department of Computer Science and Engineering, Motilal Nehru National Institute of Technology, Allahabad, Uttar Pradesh, India
Krishn K. Mishra
Department of Information Technology, Rajkiya Engineering College, Azamgarh, Uttar Pradesh, India
Munesh C. Trivedi

Cite this paper

Patil, N.M., Nemade, M.U. (2019). Content-Based Audio Classification and Retrieval Using Segmentation, Feature Extraction and Neural Network Approach. In: Bhatia, S., Tiwari, S., Mishra, K., Trivedi, M. (eds) Advances in Computer Communication and Computational Sciences. Advances in Intelligent Systems and Computing, vol 924. Springer, Singapore. https://doi.org/10.1007/978-981-13-6861-5_23

DOI: https://doi.org/10.1007/978-981-13-6861-5_23
Published: 22 May 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6860-8
Online ISBN: 978-981-13-6861-5
eBook Packages: Intelligent Technologies and Robotics (R0)
