Multiresolution Pitch Analysis of Talking, Singing, and the Continuum Between

Gerhard, David

doi:10.1007/11548706_31

David Gerhard

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3642)

Abstract

Talking and singing seem disparate, but there is a range of human utterances that fall between them, such as poetry, chanting, and rap music. This paper presents research into differentiating between talking and singing, developing feature-based analysis tools to explore the continuum between them, and evaluating human perception of this continuum as compared to these analysis tools. Preliminary background is presented to acquaint the reader with some of the science used in the algorithm development. A corpus of sounds was collected to study the differences between singing and talking, and the procedures and results of this collection are presented. A set of features is developed to differentiate between talking and singing, and to investigate the intermediate vocalizations between the two. The results of these features are examined and evaluated. The perception of speech is heavily influenced by pitch, which in the English language carries no lexicographic information but can carry higher-level semiotic information and can contribute to disambiguation.

Author information

Authors and Affiliations

Department of Computer Science, University of Regina, Regina, Saskatchewan, S4S 0A2, Canada
David Gerhard

Editor information

Editors and Affiliations

Department of Computer Science, University of Regina, Regina, SK, S4S 0A2 Canada, Polish-Japanese Institute of Information Technology, Koszykowa 86, 02-008 Warsaw, P.O. Box, Poland
Dominik Ślęzak
Department of Computer Science, University of Regina, S4S 0A2, Regina, Saskatchewan, Canada
JingTao Yao & Wojciech Ziarko
Department of Electrical and Computer Engineering, University of Manitoba, R3T 5V6, Winnipeg, Manitoba, Canada
James F. Peters
College of Computer and Information Engineering, Hehan University, Henan, China
Xiaohua Hu

About this paper

Cite this paper

Gerhard, D. (2005). Multiresolution Pitch Analysis of Talking, Singing, and the Continuum Between. In: Ślęzak, D., Yao, J., Peters, J.F., Ziarko, W., Hu, X. (eds) Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. RSFDGrC 2005. Lecture Notes in Computer Science, vol 3642. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11548706_31

DOI: https://doi.org/10.1007/11548706_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28660-8
Online ISBN: 978-3-540-31824-8
eBook Packages: Computer Science, Computer Science (R0)
