Speech Compression Based on Frequency Warped Cepstrum and Wavelet Analysis

Ayala, Francisco J.; Herrera, Abel

doi:10.1007/978-3-642-21587-2_32

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6718)

Included in the following conference series:

Mexican Conference on Pattern Recognition

Abstract

This article describes the process of extracting a set of cepstral coefficients from a warped frequency space (mel and bark) and analyzes the perceived differences in the reconstructed signal. We try to determine whether there is any audible difference between these two scales, the most widely used for speech analysis by synthesis. The same procedure for parameter extraction and signal reconstruction is used in both cases; only the warping scale is replaced. The proposed system is based on a basic cepstral analysis-synthesis model on the mel scale, whose excitation signal generation process has been changed. The inverse MLSA (mel log spectrum approximation) filter is obtained in order to generate the analysis signal. This signal is fed into a wavelet decomposition block, and the resulting coefficients are sent to the decoding system, where the excitation signal is reconstructed. Finally, the mel scale is replaced by the bark scale.
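The two building blocks named in the abstract, frequency warping and wavelet decomposition, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the first-order all-pass (bilinear) map is the usual way mel and bark warping are approximated in cepstral analysis, the alpha values 0.42 (mel) and 0.55 (bark) are commonly quoted approximations for 16 kHz sampling and do not come from the abstract, and the Haar wavelet is chosen purely for brevity because the abstract does not name a wavelet family. All function names are hypothetical.

```python
import math

# Assumed warping parameters for 16 kHz speech (illustrative, not from the paper).
ALPHA_MEL = 0.42
ALPHA_BARK = 0.55


def allpass_warp(omega, alpha):
    """Map a normalized frequency omega (radians, 0..pi) through the phase
    response of a first-order all-pass filter with parameter alpha."""
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


def haar_analysis(signal):
    """One level of Haar wavelet decomposition of an even-length sequence.
    Returns (approximation, detail) coefficient lists."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_synthesis(approx, detail):
    """Invert haar_analysis: rebuild the signal from both coefficient lists."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out


if __name__ == "__main__":
    # Compare how the two assumed scales stretch the low-frequency region.
    for k in range(0, 5):
        omega = k * math.pi / 4.0
        print(round(omega, 3),
              round(allpass_warp(omega, ALPHA_MEL), 3),
              round(allpass_warp(omega, ALPHA_BARK), 3))

    # Decompose and rebuild a toy stand-in for the excitation signal.
    toy = [0.0, 1.0, 0.5, -0.5, -1.0, 0.25, 0.0, 0.75]
    a, d = haar_analysis(toy)
    print(haar_synthesis(a, d))
```

In a coder of the kind the abstract describes, compression would come from quantizing or discarding small wavelet coefficients before transmission; the sketch stops at perfect reconstruction so that the decomposition itself can be checked.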

Author information

Authors and Affiliations

Digital Speech Processing Laboratory, Universidad Nacional Autónoma de México, Facultad de Ingeniería, Ciudad Universitaria, 04510, Mexico City, Mexico
Francisco J. Ayala & Abel Herrera

Editor information

Editors and Affiliations

Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, 72840 Sta. Maria Tonantzintla, Puebla, Mexico
José Francisco Martínez-Trinidad
Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, 72840 Sta. Maria Tonantzintla, Puebla, Mexico
Jesús Ariel Carrasco-Ochoa
Cancun Technological Institute (ITC), Av. Kabah, Km. 3, 77515, Cancun, Quintana Roo, Mexico
Cherif Ben-Youssef Brants
Department of Computer Science, University of York, UK
Edwin Robert Hancock

About this paper

Cite this paper

Ayala, F.J., Herrera, A. (2011). Speech Compression Based on Frequency Warped Cepstrum and Wavelet Analysis. In: Martínez-Trinidad, J.F., Carrasco-Ochoa, J.A., Ben-Youssef Brants, C., Hancock, E.R. (eds) Pattern Recognition. MCPR 2011. Lecture Notes in Computer Science, vol 6718. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21587-2_32

DOI: https://doi.org/10.1007/978-3-642-21587-2_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21586-5
Online ISBN: 978-3-642-21587-2
eBook Packages: Computer Science, Computer Science (R0)
