Abstract
This paper proposes a new methodology for automatically comparing the speech rhythm structure of two utterances. Eleven parameters were automatically extracted from 44 pairs of audio files, yielding 11-dimensional difference vectors. The parameters include speech rate, duration-related stress group rate, prominence and prosodic boundary strength, f0 peak rate, and the coupling strength between underlying syllable and stress group oscillators. The 11-parameter difference vectors were used to infer the perceptual differences identified by a group of 10 listeners who judged the same 44 pairs of audio files. The results indicate that duration-related prominence or prosodic boundary rate and speech rate, taken together, predict up to 71% of the response variance. To a lesser extent, prominence/boundary strength mean and non-prominent VV unit rate predict up to 60% of the response variance when combined with prominence or prosodic boundary rate.
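The pipeline summarized in the abstract (per-pair difference vectors, then a model of listener responses scored by explained variance) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how the differences are computed or which statistical model is used, so the absolute difference and the ordinary least-squares fit here are guesses, the function names `difference_vectors` and `explained_variance` are hypothetical, and all numbers are synthetic placeholders rather than the paper's data.

```python
import numpy as np


def difference_vectors(params_a, params_b):
    """Return one difference vector per utterance pair.

    Assumption: the absolute difference is used; the abstract only
    says "difference vectors" without giving the formula.
    """
    a = np.asarray(params_a, dtype=float)
    b = np.asarray(params_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("both utterance sets need the same shape")
    return np.abs(a - b)


def explained_variance(X, y):
    """R^2 of an ordinary least-squares fit of y on X, with intercept.

    Assumption: a linear model; the abstract does not name the model.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


# Synthetic placeholder data with the sizes given in the abstract:
# 44 utterance pairs, 11 rhythm parameters per utterance.
rng = np.random.default_rng(0)
n_pairs, n_params = 44, 11
params_a = rng.normal(size=(n_pairs, n_params))
params_b = rng.normal(size=(n_pairs, n_params))
diffs = difference_vectors(params_a, params_b)

# Hypothetical mean listener scores, driven here by columns 0 and 1
# purely so the example has some structure to recover.
noise = rng.normal(scale=0.5, size=n_pairs)
scores = 2.0 * diffs[:, 0] + 1.0 * diffs[:, 1] + noise

r2_two = explained_variance(diffs[:, [0, 1]], scores)
r2_all = explained_variance(diffs, scores)
print(f"R^2 with two predictors: {r2_two:.2f}")
print(f"R^2 with all eleven predictors: {r2_all:.2f}")
```

Comparing the fit on a two-column subset with the fit on all eleven columns mirrors the abstract's framing, where a small set of parameters accounts for most of the response variance. With only 44 pairs, the in-sample R² of the eleven-predictor fit is optimistic, so an adjusted or cross-validated figure would be a fairer comparison.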
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Barbosa, P.A., da Silva, W. (2012). A New Methodology for Comparing Speech Rhythm Structure between Utterances: Beyond Typological Approaches. In: Caseli, H., Villavicencio, A., Teixeira, A., Perdigão, F. (eds) Computational Processing of the Portuguese Language. PROPOR 2012. Lecture Notes in Computer Science, vol 7243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28885-2_37
DOI: https://doi.org/10.1007/978-3-642-28885-2_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28884-5
Online ISBN: 978-3-642-28885-2
eBook Packages: Computer Science, Computer Science (R0)