Abstract
Talking and singing seem disparate, but there is a range of human utterances that fall between them, such as poetry, chanting, and rap music. This paper presents research into the differentiation between talking and singing, the development of feature-based analysis tools to explore the continuum between them, and an evaluation of human perception of this continuum as compared to these analysis tools. Preliminary background is presented to acquaint the reader with some of the science used in the algorithm development. A corpus of sounds was collected to study the differences between singing and talking, and the procedures and results of this collection are presented. A set of features is developed to differentiate between talking and singing, and to investigate the intermediate vocalizations between the two. The results of these features are examined and evaluated. The perception of speech is heavily influenced by pitch, which in the English language carries no lexical information but can carry higher-level semiotic information and can contribute to disambiguation.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gerhard, D. (2005). Multiresolution Pitch Analysis of Talking, Singing, and the Continuum Between. In: Ślęzak, D., Yao, J., Peters, J.F., Ziarko, W., Hu, X. (eds) Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. RSFDGrC 2005. Lecture Notes in Computer Science, vol 3642. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11548706_31
DOI: https://doi.org/10.1007/11548706_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28660-8
Online ISBN: 978-3-540-31824-8
eBook Packages: Computer Science, Computer Science (R0)