An Adaptive Multibiometric System for Uncertain Audio Condition

Ramli, Dzati Athiar; Samad, Salina Abdul; Hussain, Aini

doi:10.1007/978-90-481-8776-8_15

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 60)

Abstract

Speaker verification systems perform very well in clean, noise-free conditions, but their reliability drops severely in noisy environments. In this study, we propose a novel approach that introduces a Support Vector Machine (SVM) as an indicator system for audio reliability estimation. This approach directly validates the quality of the incoming (claimant) speech signal so as to adaptively change the weighting factor used to fuse the scores of the two subsystems. The effectiveness of this approach has been tested on a multibiometric verification system that employs lipreading images as visual features. This verification system uses SVM as the classifier for both subsystems. Principal Component Analysis (PCA) is used for visual feature extraction, while Linear Predictive Coding (LPC) is used for audio feature extraction. In this study, we found that the SVM indicator system is able to determine the quality of the speech signal with an accuracy of up to 99.66%. At 10 dB SNR, the equal error rates (EER) observed are 51.13%, 9.3%, and 0.27% for the audio-only system, the fixed-weighting system, and the adaptive-weighting system, respectively.
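The adaptive weighting idea can be illustrated with a minimal sketch. The abstract does not give the fusion rule or the weight values, so everything below is an assumption made for illustration: a weighted-sum fusion of the two subsystem scores, a binary reliable/unreliable flag standing in for the output of the SVM indicator system, and the hypothetical names `FusionWeights`, `select_audio_weight`, `fuse_scores` and `verify`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionWeights:
    """Hypothetical audio weights; the chapter's actual values are not given."""
    clean_audio_weight: float = 0.7  # assumed: trust audio more when it is clean
    noisy_audio_weight: float = 0.1  # assumed: rely mostly on visual when noisy


def select_audio_weight(audio_is_reliable: bool,
                        weights: FusionWeights = FusionWeights()) -> float:
    """Pick the audio weight from the reliability flag.

    The flag stands in for the decision of the SVM indicator system.
    """
    if audio_is_reliable:
        return weights.clean_audio_weight
    return weights.noisy_audio_weight


def fuse_scores(audio_score: float, visual_score: float,
                audio_weight: float) -> float:
    """Weighted-sum fusion (an assumed rule): w * audio + (1 - w) * visual."""
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    return audio_weight * audio_score + (1.0 - audio_weight) * visual_score


def verify(audio_score: float, visual_score: float,
           audio_is_reliable: bool, threshold: float = 0.0) -> bool:
    """Accept the claimant if the fused score reaches the threshold."""
    weight = select_audio_weight(audio_is_reliable)
    return fuse_scores(audio_score, visual_score, weight) >= threshold


if __name__ == "__main__":
    # A genuine claimant whose audio score was corrupted by noise.
    audio_score, visual_score = -1.0, 1.0
    print("treated as clean:", verify(audio_score, visual_score, True))
    print("treated as noisy:", verify(audio_score, visual_score, False))
```

In this sketch a fixed-weighting system corresponds to always using one weight, whereas the adaptive system switches weights per claim, so a noisy audio score has less influence on the fused decision.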

Author information

Authors and Affiliations

Department of Electrical, Electronic & System Engineering, Engineering Faculty, Universiti Kebangsaan Malaysia, 43600, Bangi, Selangor, Malaysia
Dzati Athiar Ramli, Salina Abdul Samad & Aini Hussain

Corresponding author

Correspondence to Dzati Athiar Ramli.

Editor information

Editors and Affiliations

International Association of Engineers, Hung To Road 37-39, Hong Kong, Hong Kong/PR China
Sio-Iong Ao
School of Engineering, Dept. Process & Systems Engineering, Cranfield University, Cranfield, Beds., MK43 0AL, United Kingdom
Len Gelman

About this chapter

Cite this chapter

Ramli, D.A., Samad, S.A., Hussain, A. (2010). An Adaptive Multibiometric System for Uncertain Audio Condition. In: Ao, SI., Gelman, L. (eds) Electronic Engineering and Computing Technology. Lecture Notes in Electrical Engineering, vol 60. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-8776-8_15

DOI: https://doi.org/10.1007/978-90-481-8776-8_15
Published: 24 February 2010
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-8775-1
Online ISBN: 978-90-481-8776-8
eBook Packages: Engineering, Engineering (R0)
