Abstract
This paper presents a novel method for the automatic detection of singing voice in polyphonic music recordings, which involves extracting harmonic sounds from the audio mixture and classifying them. Once separated, sounds can be better characterized by computing features that are otherwise obscured in the mixture. A set of descriptors of the typical pitch fluctuations of the singing voice is proposed and combined with classical spectral timbre features. The evaluation shows the usefulness of the proposed pitch features and indicates that the approach is a promising alternative for tackling the problem, in particular for polyphonies that are not too dense, where the singing voice can be correctly tracked. As an outcome of this work, an automatic singing voice separation system is obtained, with encouraging results.
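To make the idea of pitch fluctuation descriptors concrete, the following is a minimal sketch, not the descriptors defined in the paper. It assumes a fundamental frequency contour has already been tracked for one separated harmonic sound, and it summarizes how strongly and how fast that contour oscillates. The function name, the 3 to 10 Hz modulation band, and the three output descriptors are all hypothetical choices made for illustration.

```python
import numpy as np


def pitch_fluctuation_descriptors(f0_hz, frame_rate_hz, band_hz=(3.0, 10.0)):
    """Illustrative pitch fluctuation descriptors for one voiced f0 contour.

    f0_hz: positive fundamental frequency values, one per analysis frame.
    frame_rate_hz: number of f0 frames per second.
    band_hz: modulation band of interest (an assumption, not from the paper).
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    # Express the contour in cents around its mean so the extent is musical.
    cents = 1200.0 * np.log2(f0_hz / np.mean(f0_hz))
    cents = cents - np.mean(cents)

    # Magnitude spectrum of the windowed contour (modulation spectrum).
    spectrum = np.abs(np.fft.rfft(cents * np.hanning(len(cents))))
    freqs = np.fft.rfftfreq(len(cents), d=1.0 / frame_rate_hz)
    in_band = (freqs >= band_hz[0]) & (freqs <= band_hz[1])

    total_energy = np.sum(spectrum[1:] ** 2)  # skip the DC bin
    if total_energy > 1e-12 and in_band.any():
        band_ratio = np.sum(spectrum[in_band] ** 2) / total_energy
        rate_hz = freqs[in_band][np.argmax(spectrum[in_band])]
    else:
        # Flat contour or band not covered: no measurable modulation.
        band_ratio, rate_hz = 0.0, 0.0

    return {
        "extent_cents": float(np.std(cents)),
        "modulation_rate_hz": float(rate_hz),
        "band_energy_ratio": float(band_ratio),
    }


if __name__ == "__main__":
    # Synthetic contour: 220 Hz tone with a 6 Hz, 50-cent vibrato, 100 frames/s.
    t = np.arange(100) / 100.0
    vibrato = 220.0 * 2.0 ** (50.0 / 1200.0 * np.sin(2.0 * np.pi * 6.0 * t))
    print(pitch_fluctuation_descriptors(vibrato, frame_rate_hz=100.0))
```

In a classifier along the lines the abstract describes, values like these would be concatenated with spectral timbre features computed on the same separated sound; how the paper actually defines and combines its features is not specified here.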
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rocamora, M., Pardo, A. (2012). Separation and Classification of Harmonic Sounds for Singing Voice Detection. In: Alvarez, L., Mejail, M., Gomez, L., Jacobo, J. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2012. Lecture Notes in Computer Science, vol 7441. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33275-3_87
DOI: https://doi.org/10.1007/978-3-642-33275-3_87
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33274-6
Online ISBN: 978-3-642-33275-3
eBook Packages: Computer Science, Computer Science (R0)