Abstract
In this paper, we demonstrate the effectiveness of superimposed features for template-matching-based speaker recognition using sparse representations. The principle behind our hypothesis is that if each test template lies approximately in the linear span of the training templates of the genuine class, then so does any linear combination of test templates. We introduce the notion of superimposed features for the first time. In initial trials on the TIMIT database, superimposed features reduce the complexity cost by 80 %, with a very minor decrease of 0.67 % in identification rate and a minor increase of 0.85 % in equal error rate (EER).
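The span argument above can be illustrated with a small synthetic sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are random vectors rather than TIMIT features, each speaker is given its own mutually orthogonal subspace, superimposition is taken to be a plain sum of consecutive test vectors, the helper names (`omp`, `classify`, `superimpose`) are hypothetical, and the group size of 5 is chosen only because it cuts the number of sparse-coding problems by 80 %. The sparse solver is a basic orthogonal matching pursuit with minimum-residual classification.

```python
import numpy as np


def omp(D, y, k):
    """Basic orthogonal matching pursuit: approximate y with k columns of D (k >= 1)."""
    residual = y.copy()
    support = []
    sol = np.zeros(0)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = 0.0  # never reselect a column
        support.append(int(np.argmax(corr)))
        # Least-squares refit on all columns selected so far.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    x = np.zeros(D.shape[1])
    x[support] = sol
    return x


def classify(D, labels, y, k):
    """Return the class whose coefficients reconstruct y with the smallest residual."""
    x = omp(D, y, k)
    classes = np.unique(labels)
    residuals = [
        np.linalg.norm(y - D[:, labels == c] @ x[labels == c]) for c in classes
    ]
    return classes[int(np.argmin(residuals))]


def superimpose(test_vectors, group_size):
    """Sum consecutive groups of columns into one superimposed vector per group."""
    dim, n = test_vectors.shape
    usable = (n // group_size) * group_size  # drop an incomplete trailing group
    grouped = test_vectors[:, :usable].reshape(dim, -1, group_size)
    return grouped.sum(axis=2)


# Synthetic setup (assumed): 3 speakers, each confined to its own 4-dim subspace.
rng = np.random.default_rng(0)
dim, n_speakers, sub_dim, n_train = 12, 3, 4, 8
Q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
bases = [Q[:, s * sub_dim:(s + 1) * sub_dim] for s in range(n_speakers)]

# Dictionary of unit-norm training templates, grouped by speaker.
D = np.hstack([B @ rng.standard_normal((sub_dim, n_train)) for B in bases])
D /= np.linalg.norm(D, axis=0)
labels = np.repeat(np.arange(n_speakers), n_train)

# Ten test vectors from speaker 1, superimposed in groups of five.
true_speaker = 1
test_vectors = bases[true_speaker] @ rng.standard_normal((sub_dim, 10))
superimposed = superimpose(test_vectors, group_size=5)

# Two sparse-coding problems are solved instead of ten.
predictions = [int(classify(D, labels, y, k=3)) for y in superimposed.T]
print(superimposed.shape, predictions)
```

Because each superimposed vector is itself a linear combination of vectors from the genuine speaker's subspace, it is classified the same way as an individual test vector would be, while the number of sparse solves drops from ten to two. Real features only approximately satisfy the span assumption, which is consistent with the small accuracy loss reported in the abstract.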
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gaur, Y., Madhavi, M.C., Patil, H.A. (2013). Speaker Recognition Using Sparse Representation via Superimposed Features. In: Maji, P., Ghosh, A., Murty, M.N., Ghosh, K., Pal, S.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2013. Lecture Notes in Computer Science, vol 8251. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45062-4_19
DOI: https://doi.org/10.1007/978-3-642-45062-4_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45061-7
Online ISBN: 978-3-642-45062-4
eBook Packages: Computer Science, Computer Science (R0)