Abstract
This paper contributes to the growing literature confirming the effectiveness of subband processing for speaker recognition. Specifically, we investigate speaker identification from noisy test speech modelled using linear prediction and hidden Markov models (HMMs). After filtering the wideband signal into subbands, the output time trajectory of each subband is represented by 12 pseudo-cepstral coefficients, which are used to train and test individual HMMs. During recognition, the HMM outputs are combined to produce an overall score for each test utterance. We find that, for particular numbers of filters, subband processing outperforms traditional wideband techniques.
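The recombination step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which combination rule is used, so the code below assumes the per-subband HMM outputs are log-likelihoods that are simply summed for each speaker (equivalent to multiplying likelihoods under an independence assumption). The function names, speaker labels, and score values are hypothetical and serve only as illustration.

```python
from typing import Dict, List


def combine_subband_scores(log_likelihoods: Dict[str, List[float]]) -> Dict[str, float]:
    """Sum the per-subband log-likelihoods for each candidate speaker.

    Assumption: summation (a product rule in the likelihood domain) stands in
    for whatever combination rule the paper actually uses.
    """
    return {speaker: sum(scores) for speaker, scores in log_likelihoods.items()}


def identify_speaker(log_likelihoods: Dict[str, List[float]]) -> str:
    """Return the speaker whose combined score is highest."""
    combined = combine_subband_scores(log_likelihoods)
    return max(combined, key=combined.get)


# Hypothetical log-likelihoods from four subband HMMs per enrolled speaker.
scores = {
    "speaker_a": [-120.0, -95.5, -110.2, -101.3],
    "speaker_b": [-118.4, -99.0, -125.7, -108.9],
}
best = identify_speaker(scores)
print(best)
```

Note that speaker_b scores higher in the first subband alone, yet the combined score favours speaker_a; this is the kind of case where pooling evidence across subbands changes the decision.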
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Higgins, J.E., Damper, R.I. (2001). An HMM-Based Subband Processing Approach to Speaker Identification. In: Bigun, J., Smeraldi, F. (eds) Audio- and Video-Based Biometric Person Authentication. AVBPA 2001. Lecture Notes in Computer Science, vol 2091. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45344-X_24
DOI: https://doi.org/10.1007/3-540-45344-X_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42216-7
Online ISBN: 978-3-540-45344-4
eBook Packages: Springer Book Archive