
Sādhanā (2018) 43:154

Segmentation of continuous audio recordings of Carnatic music concerts into items for archival

  • Sarala Padi
  • Hema A Murthy

Abstract

Concert recordings of Carnatic music are often continuous and unsegmented. At present, these recordings are manually segmented into items for making CDs. The objective of this paper is to develop algorithms that segment continuous concert recordings into items using applause as a cue. Owing to the ‘here and now’ nature of applause, the number of applauses exceeds the number of items in the concert, so a concert is fragmented into many segments. In the first part of the paper, applause locations are identified using time- and spectral-domain features, namely, short-time energy, zero-crossing rate, spectral flux and spectral entropy. In the second part, inter-applause segments are merged if they belong to the same item. The main component of every item in a concert is a composition. A composition is characterised by an ensemble of vocal (or main instrument), violin (optional) and percussion. Inter-applause segments are classified into four categories, namely, vocal solo, violin solo, composition and thaniavarthanam, using tonic-normalised cent filter-bank cepstral coefficients. Adjacent composition segments are merged into a single item if they belong to the same melody. Meta-data corresponding to the concert in terms of items, available from listeners, are matched to the segmented audio. The applauses are further classified based on strength using Cumulative Sum, and the locations of the top three highlights of every concert are documented. The performance of the proposed approaches to applause identification, inter-applause classification and mapping of items is evaluated on 50 live recordings of Carnatic music concerts. The applause identification accuracy is 99%, the inter- and intra-item classification accuracy is 93%, and the mapping accuracy is 95%.
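The four frame-level features named above (short-time energy, zero-crossing rate, spectral flux and spectral entropy) can be sketched as follows. This is an illustrative outline only, not the authors' implementation: the frame size, hop, window and thresholds are hypothetical, and the simple threshold rule stands in for whatever decision logic the paper actually uses.

```python
import numpy as np

def frame_features(x, frame_len=1024, hop=512):
    """Per-frame short-time energy, zero-crossing rate, spectral flux
    and spectral entropy for a mono signal x (illustrative values)."""
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    feats, prev_mag = [], None
    win = np.hanning(frame_len)
    for i in range(n_frames):
        frame = x[i * hop : i * hop + frame_len]
        energy = np.sum(frame ** 2) / frame_len
        # Fraction of adjacent samples whose sign differs.
        zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0
        mag = np.abs(np.fft.rfft(frame * win))
        p = mag / (np.sum(mag) + 1e-12)          # normalise to a distribution
        entropy = -np.sum(p * np.log2(p + 1e-12))  # flat spectrum -> high entropy
        flux = 0.0 if prev_mag is None else np.sum((mag - prev_mag) ** 2)
        prev_mag = mag
        feats.append((energy, zcr, flux, entropy))
    return np.array(feats)

def looks_like_applause(feat, zcr_thresh=0.3, entropy_thresh=7.0):
    # Applause is noise-like: high zero-crossing rate and a near-flat
    # spectrum (high spectral entropy). Thresholds here are hypothetical.
    _energy, zcr, _flux, entropy = feat
    return zcr > zcr_thresh and entropy > entropy_thresh
```

On noise-like frames (applause) the zero-crossing rate and spectral entropy are high, while pitched musical frames concentrate spectral energy in a few harmonics, which is the intuition behind using these features as an applause cue.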

Keywords

Cent filter-bank cepstral coefficients; segmentation of concerts; applause detection; classification of music segments 

Acknowledgements

This research was partly funded by the European Research Council under the European Union’s Seventh Framework Program, as part of the CompMusic project (ERC grant agreement 267583). We would like to thank Mr R K Ramakrishnan for arranging these concerts by Srividya Janakiraman and also seeking permission from the artists for the CompMusic project. We also thank Vidwan T M Krishna for giving permission to use his 26 personal recordings for applause analysis.


Copyright information

© Indian Academy of Sciences 2018

Authors and Affiliations

  1. Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, India