Gender Detection in Running Speech from Glottal and Vocal Tract Correlates

Muñoz-Mulas, Cristina; Martínez-Olalla, Rafael; Gómez-Vilda, Pedro; Álvarez-Marquina, Agustín; Mazaira-Fernández, Luis Miguel

doi:10.1007/978-3-642-38847-7_4

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7911)

Included in the following conference series:

International Conference on Nonlinear Speech Processing

Abstract

Gender detection from running speech is an important step toward improving the efficiency of tasks such as speech and speaker recognition. Traditionally, gender detection has relied on fundamental frequency (f0) and on cepstral features derived from voiced segments of speech. The methodology presented here discards f0 as a feature because its estimation is complicated, or even impossible in unvoiced fragments, and because it is unreliable in emotional or strongly prosodic speech. The approach instead obtains uncorrelated glottal and vocal tract components, which are parameterized as mel-frequency coefficients. K-fold cross-validation using quadratic discriminant analysis (QDA) and Gaussian mixture model (GMM) classifiers showed detection rates as high as 99.77% on a gender-balanced database of running speech from 340 speakers.
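
The evaluation scheme named in the abstract (a Gaussian classifier scored by K-fold cross-validation) can be illustrated with a minimal sketch. It does not reproduce the authors' pipeline: the glottal and vocal tract separation and the mel-frequency parameterization are not shown, the feature vectors are synthetic, and all function names (`fit_gaussian`, `k_fold_accuracy`, `make_synthetic`) are hypothetical. As a further simplifying assumption, each class is modelled by one diagonal-covariance Gaussian with equal priors, which gives a quadratic decision boundary and corresponds to a one-component GMM per class rather than a full-covariance QDA.

```python
import math
import random


def fit_gaussian(rows):
    """Return per-dimension (means, variances) for a list of feature vectors."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[d] for r in rows) / n for d in range(dims)]
    # Floor the variance so a constant feature cannot cause division by zero.
    variances = [
        max(sum((r[d] - means[d]) ** 2 for r in rows) / n, 1e-9)
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(x, model):
    """Log-density of x under a diagonal-covariance Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )


def predict(x, models):
    """Pick the class whose Gaussian gives x the highest log-likelihood.

    Equal class priors are assumed (the abstract describes a balanced set).
    """
    return max(models, key=lambda label: log_likelihood(x, models[label]))


def k_fold_accuracy(features, labels, k=5, seed=0):
    """Pooled accuracy over k folds; every sample is tested exactly once."""
    order = list(range(len(features)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    correct = 0
    for held_out in folds:
        test_set = set(held_out)
        models = {}
        for label in set(labels):
            train_rows = [features[i] for i in order
                          if i not in test_set and labels[i] == label]
            models[label] = fit_gaussian(train_rows)
        correct += sum(predict(features[i], models) == labels[i]
                       for i in held_out)
    return correct / len(features)


def make_synthetic(n_per_class=170, dims=4, seed=1):
    """Synthetic stand-in features; only the sample count mirrors the paper."""
    rng = random.Random(seed)
    features, labels = [], []
    for label, centre, spread in (("female", 1.5, 1.0), ("male", -1.5, 0.6)):
        for _ in range(n_per_class):
            features.append([rng.gauss(centre, spread) for _ in range(dims)])
            labels.append(label)
    return features, labels


if __name__ == "__main__":
    feats, labs = make_synthetic()
    print("pooled 5-fold accuracy:", k_fold_accuracy(feats, labs, k=5))
```

The class-specific variances are what make the boundary quadratic; sharing one variance vector across both classes would reduce the model to a linear discriminant. Any accuracy obtained on the synthetic data says nothing about the figures reported in the abstract.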

Author information

Authors and Affiliations

Neuromorphic Speech Processing Lab, Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Campus de Montegancedo, s/n, 28223, Pozuelo de Alarcón, Madrid
Cristina Muñoz-Mulas, Rafael Martínez-Olalla, Pedro Gómez-Vilda, Agustín Álvarez-Marquina & Luis Miguel Mazaira-Fernández

Editor information

Editors and Affiliations

TCTS Lab, University of Mons, 31, Boulevard Dolez, 7000, Mons, Belgium
Thomas Drugman
TCTS Lab, University of Mons, 31, Boulevard Dolez, 7000, Mons, Belgium
Thierry Dutoit

About this paper

Cite this paper

Muñoz-Mulas, C., Martínez-Olalla, R., Gómez-Vilda, P., Álvarez-Marquina, A., Mazaira-Fernández, L.M. (2013). Gender Detection in Running Speech from Glottal and Vocal Tract Correlates. In: Drugman, T., Dutoit, T. (eds) Advances in Nonlinear Speech Processing. NOLISP 2013. Lecture Notes in Computer Science, vol 7911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38847-7_4

DOI: https://doi.org/10.1007/978-3-642-38847-7_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38846-0
Online ISBN: 978-3-642-38847-7
eBook Packages: Computer Science, Computer Science (R0)
