An Intelligent System Based on Discrete Cosine Transform for Speech Recognition

Silva, Washington; Serra, Ginalber

doi:10.1007/978-3-642-34654-5_33


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7637)

Included in the following conference series:

Ibero-American Conference on Artificial Intelligence


Abstract

This paper proposes a genetic-fuzzy system for speech recognition. After the speech signal is pre-processed into mel-cepstral coefficients, the Discrete Cosine Transform (DCT) is used to generate a two-dimensional time matrix for each pattern to be recognized. A genetic algorithm then optimizes a Mamdani fuzzy inference system to obtain the best model for final recognition. The speech recognition system used in this paper is named the Hybrid Method Genetic-Fuzzy Inference System for Speech Recognition (HMFE).
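The abstract does not give the formula for the two-dimensional time matrix, so the following is only a minimal sketch of one plausible reading: a DCT-II is applied along the time axis of each mel-cepstral coefficient trajectory, and the first few coefficients are kept, so that utterances of different lengths map to a matrix of fixed size. The function name `dct_time_matrix`, the parameter `n_time_coeffs`, and the 1/T scaling are assumptions made for illustration and are not taken from the paper.

```python
import math


def dct_time_matrix(mfcc_frames, n_time_coeffs=2):
    """Compress a variable-length MFCC sequence into a fixed-size matrix.

    mfcc_frames: list of T frames, each a list of K mel-cepstral coefficients.
    Returns a K x n_time_coeffs matrix (list of lists). Entry [k][n] is the
    n-th DCT-II coefficient of the k-th coefficient trajectory over time,
    scaled by 1/T (an assumed normalization, so that n = 0 is the time mean).
    """
    T = len(mfcc_frames)
    if T == 0:
        raise ValueError("need at least one frame")
    K = len(mfcc_frames[0])

    matrix = []
    for k in range(K):
        row = []
        for n in range(n_time_coeffs):
            # DCT-II basis function evaluated at the centre of each frame.
            acc = sum(
                mfcc_frames[t][k] * math.cos((2 * t + 1) * n * math.pi / (2 * T))
                for t in range(T)
            )
            row.append(acc / T)
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    # Two toy "utterances" of different lengths, two coefficients per frame.
    short = [[1.0, 2.0]] * 5
    long_ = [[1.0, 2.0]] * 9
    print(dct_time_matrix(short))
    print(dct_time_matrix(long_))
```

Under these assumptions, both toy inputs produce a 2 x 2 matrix regardless of the number of frames, which is the property that makes such a matrix convenient as a fixed-size input to a fuzzy inference system.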



Author information

Authors and Affiliations

Department of Electroelectronics, Laboratory of Computational Intelligence Applied to Technology, Federal Institute of Education, Science and Technology, Av. Getúlio Vargas, 04, Monte Castelo, CEP: 65030-005, São Luis, Maranhão, Brazil
Washington Silva & Ginalber Serra


Editor information

Editors and Affiliations

Facultad de Informática, Universidad Complutense de Madrid, c/ Profesor José García Santesmases, 28040, Madrid, Spain
Juan Pavón & Rubén Fuentes-Fernández
Universidad Nacional de Colombia, Carrera 30 No 45-03, Edificio 477, Bogotá, DC, Colombia
Néstor D. Duque-Méndez


About this paper

Cite this paper

Silva, W., Serra, G. (2012). An Intelligent System Based on Discrete Cosine Transform for Speech Recognition. In: Pavón, J., Duque-Méndez, N.D., Fuentes-Fernández, R. (eds) Advances in Artificial Intelligence – IBERAMIA 2012. IBERAMIA 2012. Lecture Notes in Computer Science, vol 7637. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34654-5_33


DOI: https://doi.org/10.1007/978-3-642-34654-5_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34653-8
Online ISBN: 978-3-642-34654-5
eBook Packages: Computer Science, Computer Science (R0)
