Abstract
Spectral subband centroids (SSC) have been used as an additional feature to cepstral coefficients in speech and speaker recognition. SSCs are computed as the centroid frequencies of subbands and they capture the dominant frequencies of the short-term spectrum. In the baseline SSC method, the subband filters are pre-specified. To allow better adaptation to formant movements and other dynamic phenomena, we propose to adapt the subband filter boundaries on a frame-by-frame basis using a globally optimal scalar quantization scheme. The method has only one control parameter, the number of subbands. Speaker verification results on the NIST 2001 task indicate that the selection of the parameter is not critical and that the method does not require additional feature normalization.
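To make the idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a frame's power spectrum is given as a list of bin powers with matching bin frequencies. The function names (`subband_centroids`, `adaptive_edges`) are hypothetical. The distortion criterion, a spectrum-weighted squared deviation from the subband centroid, is an assumption, since the abstract does not state it. The adaptive boundaries are found with a plain O(M·N²) dynamic program, used here in place of the faster matrix-searching technique.

```python
from typing import List, Sequence


def subband_centroids(power: Sequence[float], freqs: Sequence[float],
                      edges: Sequence[int]) -> List[float]:
    """Centroid frequency of each subband; band m covers bins edges[m]..edges[m+1]-1."""
    centroids = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        weight = sum(power[lo:hi])
        if weight > 0:
            c = sum(f * p for f, p in zip(freqs[lo:hi], power[lo:hi])) / weight
        else:
            # Band without energy: fall back to its midpoint (an assumption).
            c = 0.5 * (freqs[lo] + freqs[hi - 1])
        centroids.append(c)
    return centroids


def adaptive_edges(power: Sequence[float], freqs: Sequence[float],
                   n_bands: int) -> List[int]:
    """Split the bins into n_bands contiguous, non-empty subbands that minimize
    the total spectrum-weighted squared deviation from each subband centroid."""
    n = len(power)
    if not 1 <= n_bands <= n:
        raise ValueError("n_bands must be between 1 and the number of bins")

    # Prefix sums of p, p*f and p*f^2 give each band's cost in O(1).
    w, wf, wff = [0.0], [0.0], [0.0]
    for p, f in zip(power, freqs):
        w.append(w[-1] + p)
        wf.append(wf[-1] + p * f)
        wff.append(wff[-1] + p * f * f)

    def cost(i: int, j: int) -> float:
        """Weighted squared error of bins i..j-1 around their centroid."""
        total = w[j] - w[i]
        if total <= 0:
            return 0.0
        s = wf[j] - wf[i]
        return max(0.0, (wff[j] - wff[i]) - s * s / total)

    inf = float("inf")
    # best[m][j]: minimum cost of covering the first j bins with m bands.
    best = [[inf] * (n + 1) for _ in range(n_bands + 1)]
    back = [[0] * (n + 1) for _ in range(n_bands + 1)]
    best[0][0] = 0.0
    for m in range(1, n_bands + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1][i] + cost(i, j)
                if c < best[m][j]:
                    best[m][j] = c
                    back[m][j] = i

    # Trace the optimal boundaries back from the last bin.
    edges = [n]
    j = n
    for m in range(n_bands, 0, -1):
        j = back[m][j]
        edges.append(j)
    return edges[::-1]


# Toy spectrum with two peaks, near 200 Hz and 700 Hz.
power = [0, 1, 4, 1, 0, 0, 1, 4, 1, 0]
freqs = [100.0 * k for k in range(10)]
fixed = subband_centroids(power, freqs, [0, 5, 10])   # pre-specified bands
edges = adaptive_edges(power, freqs, 2)               # per-frame bands
adapted = subband_centroids(power, freqs, edges)
```

In this sketch the only free parameter is `n_bands`, mirroring the single control parameter described above. For each frame, the boundaries are recomputed from that frame's spectrum before the centroids are taken.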
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kinnunen, T., Zhang, B., Zhu, J., Wang, Y. (2007). Speaker Verification with Adaptive Spectral Subband Centroids. In: Lee, SW., Li, S.Z. (eds) Advances in Biometrics. ICB 2007. Lecture Notes in Computer Science, vol 4642. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74549-5_7
DOI: https://doi.org/10.1007/978-3-540-74549-5_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74548-8
Online ISBN: 978-3-540-74549-5
eBook Packages: Computer Science (R0)