Prosodic and Phonetic Features for Speaking Styles Classification and Detection

Veiga, Arlindo; Celorico, Dirce; Proença, Jorge; Candeias, Sara; Perdigão, Fernando

doi:10.1007/978-3-642-35292-8_10

Part of the book series: Communications in Computer and Information Science (CCIS, volume 328)

Abstract

This study presents an approach to automatically classifying and detecting speaking styles. Detecting speaking styles is useful for segmenting multimedia data into consistent parts and has important applications, such as identifying speech segments for training acoustic models for speech recognition. The database used in this work consists of daily news broadcasts from Portuguese television, in which two main speaking styles are evident: read speech from voice-overs and anchors, and spontaneous speech from interviews and commentaries. Using a combination of phonetic and prosodic features, we can separate these two speaking styles with good accuracy (93.7% for read speech, 69.5% for spontaneous speech). Classification is performed in two steps: the first separates speech segments from non-speech audio segments, and the second classifies the speaking style as read or spontaneous. The phonetic and prosodic features provide alternative sources of information, which improves performance on the classification and detection task.
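The two-step scheme described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the three-dimensional feature vectors (energy, pitch variability, pause ratio), the toy training data, the nearest-centroid classifier, and the names `NearestCentroid`, `TwoStepStyleDetector`, and `per_class_accuracy` are all hypothetical. The sketch only shows how a speech/non-speech stage can gate a read/spontaneous stage, and how per-style accuracies like those quoted above might be computed.

```python
from statistics import mean


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Toy classifier (an assumption): pick the label of the closest class centroid."""

    def fit(self, X, y):
        self.centroids = {
            lab: centroid([x for x, l in zip(X, y) if l == lab])
            for lab in sorted(set(y))
        }
        return self

    def predict_one(self, x):
        return min(self.centroids, key=lambda lab: sq_dist(x, self.centroids[lab]))


class TwoStepStyleDetector:
    """Step 1: speech vs. non-speech. Step 2: read vs. spontaneous (speech only)."""

    def __init__(self):
        self.speech_clf = NearestCentroid()
        self.style_clf = NearestCentroid()

    def fit(self, X, y):
        # Labels in y are "non-speech", "read", or "spontaneous".
        coarse = ["non-speech" if l == "non-speech" else "speech" for l in y]
        self.speech_clf.fit(X, coarse)
        # The style classifier is trained on speech segments only.
        speech = [(x, l) for x, l in zip(X, y) if l != "non-speech"]
        self.style_clf.fit([x for x, _ in speech], [l for _, l in speech])
        return self

    def predict_one(self, x):
        if self.speech_clf.predict_one(x) == "non-speech":
            return "non-speech"
        return self.style_clf.predict_one(x)


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly labelled segments, computed separately per true class."""
    acc = {}
    for lab in sorted(set(y_true)):
        idx = [i for i, l in enumerate(y_true) if l == lab]
        acc[lab] = sum(y_pred[i] == lab for i in idx) / len(idx)
    return acc


# Made-up feature vectors: [energy, pitch variability, pause ratio].
TOY_X = [
    [0.1, 0.1, 0.9], [0.2, 0.0, 0.8],  # non-speech
    [0.8, 0.3, 0.1], [0.9, 0.2, 0.2],  # read
    [0.7, 0.8, 0.4], [0.8, 0.9, 0.5],  # spontaneous
]
TOY_Y = ["non-speech", "non-speech", "read", "read", "spontaneous", "spontaneous"]

detector = TwoStepStyleDetector().fit(TOY_X, TOY_Y)
predictions = [detector.predict_one(x) for x in TOY_X]
print(per_class_accuracy(TOY_Y, predictions))
```

Reporting accuracy per class, rather than as a single overall figure, makes an imbalance between the two styles visible, which is how the abstract presents its results.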

Author information

Authors and Affiliations

Electrical and Computer Eng. Department, University of Coimbra, Portugal
Arlindo Veiga & Fernando Perdigão
Instituto de Telecomunicações - Pole of Coimbra, Coimbra, Portugal
Arlindo Veiga, Dirce Celorico, Jorge Proença, Sara Candeias & Fernando Perdigão

Editor information

Editors and Affiliations

Escuela Politecnica Superior, Universidad Autonoma de Madrid, C/ Francisco, Tomas y Valiente 11, 28049, Madrid, Spain
Doroteo Torre Toledano
Centro Politécnico Superior, Edificio Ada Byron, C/ María de Luna nº 1, 50018, Zaragoza, Spain
Alfonso Ortega Giménez
Universidade de Aveiro, Campus Universitário Aveiro, 3810-193, Aveiro, Portugal
António Teixeira
Escuela Politecnica Superior, Universidad Autonoma de Madrid, C/ Francisco, Tomas y Valiente 11, 28049, Madrid, Spain
Joaquín González Rodríguez
E.T.S.I.Telecomunicacion, Universidad Politécnica de Madrid, Ciudad Universitaria s/n, 28040, Madrid, Spain
Luis Hernández Gómez & Rubén San Segundo Hernández
Escuela Politecnica Superior, Universidad Autonoma de Madrid, C/ Francisco, Tomas y Valiente 11, 28049, Madrid, Spain
Daniel Ramos Castro

Cite this paper

Veiga, A., Celorico, D., Proença, J., Candeias, S., Perdigão, F. (2012). Prosodic and Phonetic Features for Speaking Styles Classification and Detection. In: Torre Toledano, D., et al. Advances in Speech and Language Technologies for Iberian Languages. Communications in Computer and Information Science, vol 328. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35292-8_10

DOI: https://doi.org/10.1007/978-3-642-35292-8_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35291-1
Online ISBN: 978-3-642-35292-8
eBook Packages: Computer Science, Computer Science (R0)
