A New Methodology for Comparing Speech Rhythm Structure between Utterances: Beyond Typological Approaches

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7243)

Abstract

This paper proposes a new methodology for automatically comparing the speech rhythm structure of two utterances. Eleven parameters were automatically extracted from 44 pairs of audio files, yielding 11-dimensional difference vectors. The parameters include speech rate, duration-related stress group rate, prominence and prosodic boundary strength, f0 peak rate, and the coupling strength between underlying syllable and stress group oscillators. The 11-parameter difference vectors were used to infer the perceptual differences identified by a group of 10 listeners who judged the same 44 pairs of audio files. The results indicate that duration-related prominence or prosodic boundary rate and speech rate, taken together, predict up to 71% of the response variance. To a lesser extent, prominence/boundary strength mean and non-prominent VV unit rate predict up to 60% of the response variance when combined with prominence or prosodic boundary rate.
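The pipeline summarized in the abstract (per-pair difference vectors used to predict listener responses, with fit quality reported as explained variance) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not state whether differences are signed or absolute, nor which regression model was used, so the sketch takes absolute differences and fits ordinary least squares. The data are synthetic random numbers, not the paper's measurements, and the names `difference_vectors` and `r_squared` are hypothetical.

```python
import numpy as np


def difference_vectors(params_a, params_b):
    """Per-parameter absolute difference for each utterance pair.

    Assumption: absolute differences; the abstract does not specify this.
    """
    a = np.asarray(params_a, dtype=float)
    b = np.asarray(params_b, dtype=float)
    return np.abs(a - b)


def r_squared(X, y):
    """Fit ordinary least squares with an intercept and return R^2,
    the proportion of response variance explained by the predictors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data: 44 pairs of utterances, 11 parameters each.
rng = np.random.default_rng(0)
n_pairs, n_params = 44, 11
utt_a = rng.normal(size=(n_pairs, n_params))
utt_b = rng.normal(size=(n_pairs, n_params))
diffs = difference_vectors(utt_a, utt_b)

# Synthetic "listener scores" driven by two of the parameters plus noise.
scores = 2.0 * diffs[:, 0] + 1.0 * diffs[:, 1] + rng.normal(scale=0.5, size=n_pairs)

r2_two = r_squared(diffs[:, [0, 1]], scores)  # two-predictor model
r2_all = r_squared(diffs, scores)             # all eleven predictors
print(f"R^2 (two predictors): {r2_two:.2f}")
print(f"R^2 (all predictors): {r2_all:.2f}")
```

Comparing `r2_two` with `r2_all` mirrors the kind of statement made in the abstract, where a small subset of parameters accounts for most of the explained variance. With real data, the rows of `diffs` would come from the extracted rhythm parameters and `scores` from the listeners' judgments.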

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Barbosa, P.A., da Silva, W. (2012). A New Methodology for Comparing Speech Rhythm Structure between Utterances: Beyond Typological Approaches. In: Caseli, H., Villavicencio, A., Teixeira, A., Perdigão, F. (eds) Computational Processing of the Portuguese Language. PROPOR 2012. Lecture Notes in Computer Science, vol 7243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28885-2_37

  • DOI: https://doi.org/10.1007/978-3-642-28885-2_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28884-5

  • Online ISBN: 978-3-642-28885-2

  • eBook Packages: Computer Science, Computer Science (R0)
