Enhancing GMM speaker identification by incorporating SVM speaker verification for intelligent web-based speech applications

Ding, Ing-Jr; Yen, Chih-Ta

doi:10.1007/s11042-013-1587-5

Published: 12 July 2013

Multimedia Tools and Applications, volume 74, pages 5131–5140 (2015)

Ing-Jr Ding & Chih-Ta Yen

Abstract

Speech applications, which let users operate a system by voice command, facilitate web access for disabled and visually impaired users. Human-computer interaction through speaking and listening to web applications offers a route to multimodal interaction tools in the accessible design of an intelligent web. Speaker identification and verification are essential functions of intelligent web programs with speech applications. This paper proposes an enhanced Gaussian mixture model (GMM) method, called EGMM-SVM, that incorporates information derived from a support vector machine (SVM) for web-based applications with speaker recognition. EGMM-SVM improves the accuracy of the likelihood scores estimated between each speech frame and the GMM. Within EGMM-SVM, the SVM plays a crucial role: it passes quality information about the test speaker's utterances to the GMM during the GMM likelihood calculations. Experimental results show that speaker recognition with the developed EGMM-SVM, which uses an accurate operation mechanism for Gaussian distribution derivations, yields a higher recognition rate than a conventional GMM that does not consider the quality of the test speech utterances.
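As a rough illustration of the scoring step the abstract refers to, the sketch below implements conventional GMM speaker identification: each enrolled speaker's diagonal-covariance GMM scores every test frame, the per-frame log-likelihoods are summed, and the highest-scoring speaker is selected. The abstract does not specify how the SVM quality information enters the calculation, so the optional `frame_quality` weights are purely a hypothetical stand-in and not the authors' EGMM-SVM mechanism. The function names, the toy speaker models, and the synthetic frames are likewise assumptions made for this example.

```python
import numpy as np


def gmm_frame_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D) feature vectors; weights: (M,) mixture weights;
    means, variances: (M, D) per-component parameters.
    """
    diff = frames[:, None, :] - means[None, :, :]                       # (T, M, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)   # (M,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)   # (T, M)
    a = np.log(weights) + log_comp
    # Log-sum-exp over mixture components, stabilised by subtracting the max.
    a_max = a.max(axis=1, keepdims=True)
    return (a_max + np.log(np.exp(a - a_max).sum(axis=1, keepdims=True)))[:, 0]


def identify(frames, speaker_models, frame_quality=None):
    """Return the best-scoring speaker and all utterance-level scores.

    frame_quality is a HYPOTHETICAL per-frame weight (for example, derived
    from some external verifier); with None, this is plain GMM scoring.
    """
    if frame_quality is None:
        frame_quality = np.ones(len(frames))
    scores = {
        name: float(np.sum(frame_quality * gmm_frame_loglik(frames, *params)))
        for name, params in speaker_models.items()
    }
    return max(scores, key=scores.get), scores


# Toy models (weights, means, variances); values are made up for illustration.
models = {
    "alice": (np.array([0.5, 0.5]), np.array([[0.0, 0.0], [1.0, 1.0]]), np.ones((2, 2))),
    "bob": (np.array([0.5, 0.5]), np.array([[5.0, 5.0], [6.0, 6.0]]), np.ones((2, 2))),
}
rng = np.random.default_rng(0)
test_frames = rng.normal(loc=5.5, scale=0.5, size=(50, 2))  # synthetic "bob"-like frames
best, scores = identify(test_frames, models)
print(best)
```

In practice the frames would be acoustic features such as cepstral coefficients, and the speaker models would be trained with the EM algorithm rather than written by hand.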

Acknowledgments

This research is partially supported by the National Science Council (NSC) in Taiwan under grant NSC 101-2221-E-150-084.

Author information

Authors and Affiliations

Department of Electrical Engineering, National Formosa University, No.64, Wunhua Rd., Huwei Township, Yunlin County 632, Taiwan, Republic of China
Ing-Jr Ding & Chih-Ta Yen

Corresponding author

Correspondence to Chih-Ta Yen.

About this article

Cite this article

Ding, IJ., Yen, CT. Enhancing GMM speaker identification by incorporating SVM speaker verification for intelligent web-based speech applications. Multimed Tools Appl 74, 5131–5140 (2015). https://doi.org/10.1007/s11042-013-1587-5

Issue Date: July 2015
DOI: https://doi.org/10.1007/s11042-013-1587-5
