
Mining Intonation Corpora Using Knowledge Driven Sequential Clustering

  • Conference paper
Advances in Artificial Intelligence - IBERAMIA-SBIA 2006 (IBERAMIA 2006, SBIA 2006)

Abstract

This work presents a mining methodology designed to cope with the data scarcity problems typical of intonation corpora, which arise from the high variability of prosodic information. The methodology adapts a basic agglomerative clustering technique, guided by a set of domain constraints. The peculiarities of the text-to-speech intonation modelling problem are taken into account to fix the initial configuration of the clustering and the criteria for merging classes and for stopping their splitting. Data scarcity makes it necessary to apply a sequential selection mechanism over prosodic features in order to obtain the initial set of classes in the clustering. A search strategy for selecting the best class among a set of alternatives is proposed, which yields prediction models useful for accurate synthetic intonation. Visualizing the final classes by means of a modified decision tree provides graphical cues about contrastable prosodic information in the intonation corpus.
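The abstract does not give the merge or stopping criteria, so the following is only a minimal sketch of what constraint-guided agglomerative merging might look like, not the authors' method. It assumes that each class holds a list of numeric contour vectors, that closeness is measured by Euclidean distance between class centroids, and that merging stops once every class reaches a minimum population. The names `constrained_agglomerative`, `may_merge`, `min_size` and the prosodic labels are hypothetical.

```python
from itertools import combinations
from math import dist


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def constrained_agglomerative(classes, may_merge, min_size):
    """Merge under-populated classes until each has at least `min_size` samples.

    classes   -- dict mapping a frozenset of labels to a list of vectors
    may_merge -- assumed domain constraint: (labels_a, labels_b) -> bool
    min_size  -- assumed stopping criterion on class population
    """
    classes = dict(classes)  # work on a copy, leave the input untouched
    while True:
        # Stop when no class is under-populated.
        small = [k for k, v in classes.items() if len(v) < min_size]
        if not small:
            break
        # Candidate merges involve a small class and satisfy the constraint.
        candidates = [
            (dist(centroid(classes[a]), centroid(classes[b])), a, b)
            for a, b in combinations(classes, 2)
            if (a in small or b in small) and may_merge(a, b)
        ]
        if not candidates:
            break  # the constraints forbid every remaining merge
        # Merge the pair whose centroids are closest.
        _, a, b = min(candidates, key=lambda c: c[0])
        classes[a | b] = classes.pop(a) + classes.pop(b)
    return classes


# Hypothetical initial classes, one per prosodic feature value.
initial = {
    frozenset({"stressed-final"}): [[1.0, 2.0], [1.2, 2.1], [0.9, 1.9]],
    frozenset({"stressed-initial"}): [[1.1, 2.0]],
    frozenset({"unstressed-final"}): [[5.0, 5.0], [5.1, 4.9], [4.9, 5.2]],
}
merged = constrained_agglomerative(initial, lambda a, b: True, min_size=2)
```

In this toy run the single-sample class is absorbed by its nearest neighbour, while a `may_merge` that always returns `False` would leave the three classes unchanged.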

This work has been partially sponsored by the Spanish Government (MCYT project TIC2003-08382-C05-03) and by the Consejería de Educación (JCYL project VA053A05).




Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Escudero-Mancebo, D., Cardeñoso-Payo, V. (2006). Mining Intonation Corpora Using Knowledge Driven Sequential Clustering. In: Sichman, J.S., Coelho, H., Rezende, S.O. (eds) Advances in Artificial Intelligence - IBERAMIA-SBIA 2006. IBERAMIA 2006, SBIA 2006. Lecture Notes in Computer Science, vol 4140. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11874850_40


  • DOI: https://doi.org/10.1007/11874850_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45462-5

  • Online ISBN: 978-3-540-45464-9

  • eBook Packages: Computer Science, Computer Science (R0)
