GMM Based Language Identification System Using Robust Features

Manchala, Sadanandam; Prasad, V. Kamakshi

doi:10.1007/978-3-319-01931-4_21

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8113)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

In this work, we propose new features for a GMM-based spoken language identification system. The features are extracted with a two-stage approach. MFCCs and formants are first extracted from a large corpus covering all languages under consideration. In the first stage, the MFCCs and formants are concatenated to form a feature vector, K clusters are formed from these feature vectors, and one Gaussian is fitted to each cluster. In the second stage, each feature vector is evaluated against each of the K Gaussians, and the K returned probabilities become the elements of the proposed K-element feature vector. The same derivation is used in both the training and testing phases. In the training phase, K-element feature vectors are generated from the language-specific speech corpus and language-specific GMMs are trained. In the testing phase, K-element feature vectors are extracted from the unknown speech utterance in the same way and evaluated against the language-specific GMMs. In addition, language-specific a priori knowledge is used to further improve recognition performance. The experiments are carried out on the OGI database, and the language identification (LID) performance is nearly 100%.
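The two-stage feature derivation described in the abstract can be illustrated with a minimal pure-Python sketch. Several things here are assumptions and not taken from the paper: the abstract does not name the clustering algorithm, so plain k-means is used; the Gaussians are given diagonal covariances; the "probabilities" are taken to be Gaussian density values; and random toy vectors stand in for the concatenated MFCC and formant features. All function names are hypothetical. Training the language-specific GMMs on the resulting K-element vectors is not shown.

```python
import math
import random

def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means (an assumed choice); returns a cluster index per vector."""
    rng = random.Random(seed)
    centers = rng.sample(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign each vector to its nearest center (squared Euclidean distance).
        labels = [min(range(k),
                      key=lambda c: sum((x - m) ** 2 for x, m in zip(v, centers[c])))
                  for v in vectors]
        # Recompute each non-empty cluster's center as the mean of its members.
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def fit_gaussians(vectors, labels, k, var_floor=1e-3):
    """Stage 1: fit one diagonal-covariance Gaussian (mean, variance) per cluster."""
    gaussians = []
    for c in range(k):
        members = [v for v, l in zip(vectors, labels) if l == c]
        if not members:
            raise ValueError(f"cluster {c} is empty; try a different k or seed")
        mean = [sum(col) / len(members) for col in zip(*members)]
        var = [max(sum((x - m) ** 2 for x in col) / len(members), var_floor)
               for col, m in zip(zip(*members), mean)]
        gaussians.append((mean, var))
    return gaussians

def gaussian_pdf(v, mean, var):
    """Density of a diagonal-covariance Gaussian evaluated at v."""
    log_p = sum(-0.5 * (math.log(2 * math.pi * s) + (x - m) ** 2 / s)
                for x, m, s in zip(v, mean, var))
    return math.exp(log_p)

def second_stage_features(vectors, gaussians):
    """Stage 2: map each vector to the K values returned by the K Gaussians."""
    return [[gaussian_pdf(v, mean, var) for mean, var in gaussians]
            for v in vectors]

# Toy data: two well-separated 3-D blobs standing in for MFCC+formant vectors.
rng = random.Random(1)
toy = ([[rng.gauss(0.0, 0.5) for _ in range(3)] for _ in range(50)] +
       [[rng.gauss(5.0, 0.5) for _ in range(3)] for _ in range(50)])

K = 2
labels = kmeans(toy, K)
gaussians = fit_gaussians(toy, labels, K)
features = second_stage_features(toy, gaussians)
print(len(features), len(features[0]))  # 100 vectors, each with K = 2 elements
```

In this sketch each input vector, whatever its original dimensionality, is replaced by a K-element vector, so K sets the size of the features on which the language-specific GMMs would then be trained and tested.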

Author information

Authors and Affiliations

Kakatiya University, Warangal, Andhra Pradesh, India
Sadanandam Manchala
Jawaharlal Nehru Technological University Hyderabad, Andhra Pradesh, India
V. Kamakshi Prasad

Editor information

Editors and Affiliations

Faculty of Applied Sciences, Department of Cybernetics, University of West Bohemia, Univerzitní 8, 306 14, Plzeň, Czech Republic
Miloš Železný
University of West Bohemia, 306 14, Pilsen, Czech Republic
Ivan Habernal
Speech and Multimodal Interfaces Laboratory, St. Petersburg Institute of Informatics and Automation for the Russian Academy of Sciences, 14-th line, 39, 199178, St. Petersburg, Russia
Andrey Ronzhin

Cite this paper

Manchala, S., Prasad, V.K. (2013). GMM Based Language Identification System Using Robust Features. In: Železný, M., Habernal, I., Ronzhin, A. (eds) Speech and Computer. SPECOM 2013. Lecture Notes in Computer Science, vol 8113. Springer, Cham. https://doi.org/10.1007/978-3-319-01931-4_21

DOI: https://doi.org/10.1007/978-3-319-01931-4_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01930-7
Online ISBN: 978-3-319-01931-4
eBook Packages: Computer Science, Computer Science (R0)
