Speech Signals Parameterization Based on Auditory Filter Modeling

Zouhir, Youssef; Ouni, Kaïs

doi:10.1007/978-3-642-38847-7_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7911)

Included in the following conference series:

International Conference on Nonlinear Speech Processing

Abstract

This paper presents a speech-signal parameterization technique based on auditory filter modeling with the Gammachirp auditory filterbank (GcFB). The filterbank is designed to produce a spectrum that reflects the spectral properties of the cochlear filter, which performs frequency analysis in the human auditory system. The center frequencies of the GcFB are spaced on the ERB-rate scale, and the bandwidth of each Gammachirp filter is measured in terms of the Equivalent Rectangular Bandwidth (ERB) of human auditory filters. Compared with standard techniques such as LPC (Linear Prediction Coefficients) and PLP (Perceptual Linear Prediction), our parameterization approach gives interesting results for the recognition of isolated words from the TIMIT database. The recognition system is implemented on the HTK platform (Hidden Markov Model Toolkit) and is based on Hidden Markov Models with continuous Gaussian mixture observation densities (HMM-GM).
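To make the filterbank design in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the widely used Glasberg and Moore (1990) ERB formulas and the standard gammachirp impulse response g(t) = t^(n-1) exp(-2πb·ERB(fc)·t) cos(2πfc·t + c·ln t + φ). The function names and the values n = 4, b = 1.019, c = -2.0, the 25 ms duration and the channel count are illustrative assumptions; the abstract does not state the parameters actually used.

```python
import math

def erb_bandwidth(f_hz):
    """Equivalent Rectangular Bandwidth in Hz (Glasberg & Moore, 1990)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def hz_to_erb_rate(f_hz):
    """Map a frequency in Hz to the ERB-rate scale."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_rate_to_hz(e):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def center_frequencies(f_low, f_high, n_channels):
    """Center frequencies equally spaced on the ERB-rate scale."""
    e_low, e_high = hz_to_erb_rate(f_low), hz_to_erb_rate(f_high)
    step = (e_high - e_low) / (n_channels - 1)
    return [erb_rate_to_hz(e_low + i * step) for i in range(n_channels)]

def gammachirp_ir(fc, fs, duration=0.025, n=4, b=1.019, c=-2.0, phase=0.0):
    """Peak-normalized, sampled gammachirp impulse response.

    n, b, c and duration are illustrative defaults (assumptions),
    not values taken from the paper.
    """
    erb = erb_bandwidth(fc)
    samples = []
    # Start at k = 1 because ln(t) is undefined at t = 0.
    for k in range(1, round(duration * fs) + 1):
        t = k / fs
        envelope = t ** (n - 1) * math.exp(-2.0 * math.pi * b * erb * t)
        carrier = math.cos(2.0 * math.pi * fc * t + c * math.log(t) + phase)
        samples.append(envelope * carrier)
    peak = max(abs(s) for s in samples)
    return [s / peak for s in samples]

# Hypothetical usage: an 8-channel bank between 100 Hz and 4 kHz at 16 kHz.
bank = [gammachirp_ir(fc, 16000) for fc in center_frequencies(100.0, 4000.0, 8)]
```

Under these formulas a filter centered at 1 kHz has an ERB of about 132.6 Hz. Setting c = 0 removes the chirp term and reduces the response to a gammatone filter, which is one way to check the sketch against a more familiar model.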

Author information

Authors and Affiliations

Unité de Recherche Systèmes Mécatroniques et Signaux, École Supérieure de Technologie et d’Informatique, Université de Carthage, Tunisie
Youssef Zouhir & Kaïs Ouni

Editor information

Editors and Affiliations

TCTS Lab, University of Mons, 31, Boulevard Dolez, 7000, Mons, Belgium
Thomas Drugman
TCTS Lab, University of Mons, 31, Boulevard Dolez, 7000, Mons, Belgium
Thierry Dutoit

Cite this paper

Zouhir, Y., Ouni, K. (2013). Speech Signals Parameterization Based on Auditory Filter Modeling. In: Drugman, T., Dutoit, T. (eds) Advances in Nonlinear Speech Processing. NOLISP 2013. Lecture Notes in Computer Science, vol 7911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38847-7_8

DOI: https://doi.org/10.1007/978-3-642-38847-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38846-0
Online ISBN: 978-3-642-38847-7
eBook Packages: Computer Science, Computer Science (R0)