Detecting Speaker-Discriminative Spectral Content in Wideband for Automatic Speaker Recognition

Fernández Gallardo, Laura

doi:10.1007/978-981-287-727-7_6

Part of the book series: T-Labs Series in Telecommunication Services (TLABS)

Abstract

It has been widely reported that the information on speaker individuality in the voice is not equally distributed across the speech spectrum, and that this is attributable to the occurrence of different phoneme events (e.g. [115, 156, 170]). Based on this finding, a variety of methods have been developed to extract the most useful information from the speech signal for further modelling; however, most of them are limited to clean microphone speech or to narrowband (NB) telephone speech. For wideband (WB) transmitted speech, the usefulness of the frequency range beyond the NB cut-off frequencies has not yet been determined. Besides, the commonly adopted MFCC features might not be appropriate for taking full advantage of the WB signal in speaker verification, since they were developed for speech recognition and from signals band-limited to 5 kHz [51]. This work reveals some of the causes of the benefit of WB over NB, considering clean and degraded speech. It attempts to provide some guidance on speaker verification system configuration by identifying speaker-discriminative information in frequency bands beyond NB and encouraging its use. First, a sub-band analysis employing transmitted speech segments is presented, and the effects of channel degradations on different frequency sub-bands are determined. Next, the speaker verification performances obtained from speech signals of 0–4, 4–8, and 0–8 kHz, and from transmitted speech, are compared, employing different sets of cepstral features extracted using linearly- and mel-spaced filterbanks (LFCCs and MFCCs). Finally, effective phoneme classes in WB are determined and identified as an important contribution to the superiority of WB over NB.
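The contrast between linearly- and mel-spaced filterbanks mentioned in the abstract can be illustrated with a minimal sketch. It is not the chapter's feature extraction: the mel formula used here is the common 2595·log10(1 + f/700) variant, and the filter count of 24, the function names, and the band edges are assumptions chosen only for illustration. The sketch counts how many filter centre frequencies fall into the 0–4 kHz and 4–8 kHz bands for each spacing.

```python
import numpy as np

def hz_to_mel(f_hz):
    # Common mel-scale formula (an assumption; the chapter may use another variant).
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def filter_centres(f_low, f_high, n_filters, scale):
    """Centre frequencies in Hz of n_filters triangular filters on [f_low, f_high]."""
    if scale == "mel":
        # Equally spaced on the mel axis, then mapped back to Hz.
        edges = mel_to_hz(np.linspace(hz_to_mel(f_low), hz_to_mel(f_high), n_filters + 2))
    elif scale == "linear":
        # Equally spaced directly on the Hz axis.
        edges = np.linspace(f_low, f_high, n_filters + 2)
    else:
        raise ValueError("scale must be 'mel' or 'linear'")
    # The first and last points are band edges, not filter centres.
    return edges[1:-1]

def count_in_band(centres, lo, hi):
    """Number of centre frequencies with lo <= f < hi."""
    return int(np.sum((centres >= lo) & (centres < hi)))

N_FILTERS = 24  # hypothetical filter count, not taken from the chapter

mel_c = filter_centres(0.0, 8000.0, N_FILTERS, "mel")
lin_c = filter_centres(0.0, 8000.0, N_FILTERS, "linear")

mel_low, mel_high = count_in_band(mel_c, 0, 4000), count_in_band(mel_c, 4000, 8000)
lin_low, lin_high = count_in_band(lin_c, 0, 4000), count_in_band(lin_c, 4000, 8000)

print("mel-spaced    0-4 kHz:", mel_low, " 4-8 kHz:", mel_high)
print("linear-spaced 0-4 kHz:", lin_low, " 4-8 kHz:", lin_high)
```

Under these assumptions, the linear spacing places half of its filters above 4 kHz, whereas the mel spacing places fewer there, which is one way to see why the choice of filterbank matters when the band beyond the NB cut-off is of interest.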

Notes

1. Available from https://sites.google.com/site/nikobrummer/focal, last accessed 19th August 2014.
2. An example of these files can be downloaded from https://catalog.ldc.upenn.edu/desc/addenda/LDC93S1.phn, last accessed 28th August 2014.
3. Accessible at https://catalog.ldc.upenn.edu/docs/LDC93S1/PHONCODE.TXT, last accessed 28th August 2014.
4. https://sites.google.com/site/dgromeroweb/software/, last accessed 15th July 2014.
5. This result appears counter-intuitive. The lack of phonemes was expected to affect performance more severely with WB-transmitted speech. The author has triple-checked that the software operates in the same manner as for the clean and the NB-transmitted data experiments, with correct input files. Further research would be needed to find a satisfactory explanation.

Author information

Authors and Affiliations

University of Canberra, Canberra, ACT, Australia
Laura Fernández Gallardo

Corresponding author

Correspondence to Laura Fernández Gallardo.

About this chapter

Cite this chapter

Fernández Gallardo, L. (2016). Detecting Speaker-Discriminative Spectral Content in Wideband for Automatic Speaker Recognition. In: Human and Automatic Speaker Recognition over Telecommunication Channels. T-Labs Series in Telecommunication Services. Springer, Singapore. https://doi.org/10.1007/978-981-287-727-7_6

DOI: https://doi.org/10.1007/978-981-287-727-7_6
Published: 18 August 2015
Publisher Name: Springer, Singapore
Print ISBN: 978-981-287-726-0
Online ISBN: 978-981-287-727-7
eBook Packages: Engineering, Engineering (R0)