Abstract
This paper presents a speech-signal parameterization technique based on auditory filter modeling with the Gammachirp auditory filterbank (GcFB). The filterbank is designed to provide a spectrum that reflects the spectral properties of the cochlear filter, which is responsible for frequency analysis in the human auditory system. The center frequencies of the GcFB are distributed on the ERB-rate scale, and the bandwidth of each Gammachirp filter is measured in Equivalent Rectangular Bandwidths (ERB) of human auditory filters. The approach gives interesting results compared with standard techniques such as LPC (Linear Prediction Coefficients) and PLP (Perceptual Linear Prediction) for the recognition of isolated words from the TIMIT database. The recognition system is implemented on the HTK platform (Hidden Markov Model Toolkit) and is based on Hidden Markov Models with continuous Gaussian-mixture observation densities (HMM-GM).
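The abstract gives no formulas, so the following is only a minimal sketch of the two ingredients it names: center frequencies spaced on the ERB-rate scale, and a Gammachirp impulse response whose envelope decay is set by the ERB. It assumes the widely used Glasberg and Moore (1990) expressions for the ERB and the ERB-rate, and the analytic gammachirp form of Irino and Patterson (1997). The function names, the frequency range, the channel count, and the values n = 4, b = 1.019, c = -2 are illustrative assumptions, not the authors' settings.

```python
import math


def erb_bandwidth(f_hz):
    """ERB in Hz of the auditory filter centred at f_hz (assumed: Glasberg & Moore, 1990)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def hz_to_erb_rate(f_hz):
    """Map a frequency in Hz to the ERB-rate scale (number of ERBs below f_hz)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37


def erb_spaced_center_frequencies(f_low, f_high, n_channels):
    """Return n_channels (>= 2) center frequencies equally spaced on the ERB-rate scale."""
    e_low, e_high = hz_to_erb_rate(f_low), hz_to_erb_rate(f_high)
    step = (e_high - e_low) / (n_channels - 1)
    return [erb_rate_to_hz(e_low + i * step) for i in range(n_channels)]


def gammachirp_impulse_response(fc, fs, duration=0.025, n=4, b=1.019, c=-2.0):
    """Sampled gammachirp, peak-normalised.

    g(t) = t^(n-1) * exp(-2*pi*b*ERB(fc)*t) * cos(2*pi*fc*t + c*ln(t)),  t > 0
    n, b and c are illustrative defaults, not values taken from the paper.
    """
    erb = erb_bandwidth(fc)
    samples = []
    # Start at k = 1 because ln(t) is undefined at t = 0.
    for k in range(1, int(duration * fs) + 1):
        t = k / fs
        envelope = t ** (n - 1) * math.exp(-2.0 * math.pi * b * erb * t)
        samples.append(envelope * math.cos(2.0 * math.pi * fc * t + c * math.log(t)))
    peak = max(abs(s) for s in samples)
    return [s / peak for s in samples]


if __name__ == "__main__":
    # Hypothetical configuration: 32 channels between 100 Hz and 8 kHz at fs = 16 kHz.
    centers = erb_spaced_center_frequencies(100.0, 8000.0, 32)
    bank = [gammachirp_impulse_response(fc, 16000) for fc in centers]
    print(len(bank), "filters; first and last center frequency (Hz):",
          round(centers[0], 1), round(centers[-1], 1))
```

Setting c = 0 removes the chirp term and reduces each filter to a gammatone, which is one way to check the asymmetry that the chirp introduces. Turning the filter outputs into recognition features (framing, compression, decorrelation) is not covered by this sketch.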
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zouhir, Y., Ouni, K. (2013). Speech Signals Parameterization Based on Auditory Filter Modeling. In: Drugman, T., Dutoit, T. (eds) Advances in Nonlinear Speech Processing. NOLISP 2013. Lecture Notes in Computer Science, vol 7911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38847-7_8
DOI: https://doi.org/10.1007/978-3-642-38847-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38846-0
Online ISBN: 978-3-642-38847-7
eBook Packages: Computer Science, Computer Science (R0)