Abstract
We developed a real-time wideband speech codec based on a wavelet packet transform. Encoding proceeds in two steps. In the first step, the wavelet coefficients were quantized with a mid-tread uniform quantizer guided by a psychoacoustic model. In the second step, the quantized coefficients were encoded with an arithmetic coder whose probability model was adapted frame by frame by a competitive neural network. The neural network was trained on the TIMIT corpus, and its weights were updated in real time during compression to better model the speech characteristics of the current speaker. The coding/decoding algorithm was first written in C and then optimised for the TMS320C6000 DSP platform.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pasero, E., Montuori, A. (2002). Real-Time Perceptual Coding of Wideband Speech by Competitive Neural Networks. In: Marinaro, M., Tagliaferri, R. (eds) Neural Nets. WIRN 2002. Lecture Notes in Computer Science, vol 2486. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45808-5_18
DOI: https://doi.org/10.1007/3-540-45808-5_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44265-3
Online ISBN: 978-3-540-45808-1
eBook Packages: Springer Book Archive