Prosodic Features’ Criterion for Hebrew
Prosody provides important information about intention and meaning, and carries clues regarding dialogue turns, phrase emphasis, and even the physiological or emotional condition of the speaker. Prosody has been researched extensively by linguists and speech scientists; however, little attention has been given to formulating and ranking the acoustic features that represent prosodic information. This paper aims to define a simple methodology for testing whether a feature conveys prosodic information. This way, we can compare different features and rate them as prosodic or content related (in this paper, the word “content” refers to the verbal information of the utterance). We explore many features using a Hebrew dataset specially designed for validating prosodic features, and as the first step of our research we chose two prosody classes: neutral and question. We apply our methodology successfully and find that prosodic features are indeed invariant to the content of the utterance, while correlating with prosodic manifestations. We validate our methodology by showing that our ranking of prosodic features yields results similar to those of classification-based feature selection.
Keywords: Prosody, Prosodic features, Hebrew database
The authors thank Ella Erlich, Ruth Aloni-Lavi and Noga Hellman for their help with the Hebrew dataset.