Real-Time Perceptual Coding of Wideband Speech by Competitive Neural Networks

Pasero, Eros; Montuori, Alfonso

doi:10.1007/3-540-45808-5_18


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2486)

Included in the following conference series:

Italian Workshop on Neural Nets

Abstract

We developed a real-time wideband speech codec based on a wavelet packet transform. Coding proceeded in two steps. In the first step, the wavelet packet coefficients were quantized with a mid-tread uniform quantizer under the control of a psychoacoustic model. In the second step, the quantized coefficients were encoded with an arithmetic coder whose probability model was adapted frame by frame by a competitive neural network. The network was trained on the TIMIT corpus, and its weights were updated in real time during compression to better model the speech characteristics of the current speaker. The coding/decoding algorithm was first written in C and then optimised for the TMS320C6000 DSP platform.
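
The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fixed step size stands in for whatever the psychoacoustic model would choose, the histogram input and Euclidean winner-take-all update are assumed forms of the competitive network, and all function names are hypothetical. The arithmetic coder itself is omitted; the winning prototype is only what such a coder might use as its symbol probabilities for the frame.

```python
import numpy as np

def midtread_quantize(coeffs, step):
    """Mid-tread uniform quantizer: zero is one of the reconstruction levels."""
    # floor(x + 0.5) rounds half up and avoids NumPy's round-half-to-even.
    return np.floor(np.asarray(coeffs, dtype=float) / step + 0.5).astype(int)

def midtread_dequantize(indices, step):
    """Map quantizer indices back to reconstruction values."""
    return np.asarray(indices) * step

def frame_histogram(indices, max_abs):
    """Normalised histogram of indices clipped to [-max_abs, max_abs]."""
    shifted = np.clip(indices, -max_abs, max_abs) + max_abs
    counts = np.bincount(shifted, minlength=2 * max_abs + 1).astype(float)
    return counts / counts.sum()

def competitive_update(prototypes, hist, lr=0.1):
    """Winner-take-all step: move the closest prototype toward the histogram.

    Assumed form of the competitive layer; each row of `prototypes` is a
    candidate probability model. With 0 <= lr <= 1 the updated row is a convex
    combination of two distributions, so it still sums to one.
    """
    dists = np.linalg.norm(prototypes - hist, axis=1)
    winner = int(np.argmin(dists))
    prototypes[winner] += lr * (hist - prototypes[winner])
    return winner

if __name__ == "__main__":
    step = 0.5                                   # hypothetical step size
    frame = np.array([0.1, -0.2, 0.7, 1.3, -0.9, 0.0])
    idx = midtread_quantize(frame, step)
    hist = frame_histogram(idx, max_abs=3)
    rng = np.random.default_rng(0)
    protos = rng.dirichlet(np.ones(7), size=4)   # 4 random starting models
    winner = competitive_update(protos, hist)
    print("indices:", idx, "winning model:", winner)
```

Updating only the winning prototype is what would let the model drift toward the current speaker's statistics, while the other prototypes retain what was learned offline.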

Author information

Authors and Affiliations

Dipartimento di Elettronica, Politecnico di Torino, C.so Duca degli Abruzzi, 24, 10100, Torino, Italy
Eros Pasero & Alfonso Montuori

Editor information

Editors and Affiliations

Dipartimento di Fisica, “E.R. Caianiello”, Via S. Allende, Baronissi (Salerno), Italy
Maria Marinaro
International Institute for Advanced Scientific Studies “E.R. Caianiello”, IIASS, Via G. Pellegrino, 19, 84019, Vietri Sul Mare (Salerno), Italy
Maria Marinaro
Dipartimento di Matematica ed Informatica, Via S. Allende, Baronissi (Salerno), Italy
Roberto Tagliaferri

About this paper

Cite this paper

Pasero, E., Montuori, A. (2002). Real-Time Perceptual Coding of Wideband Speech by Competitive Neural Networks. In: Marinaro, M., Tagliaferri, R. (eds) Neural Nets. WIRN 2002. Lecture Notes in Computer Science, vol 2486. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45808-5_18

DOI: https://doi.org/10.1007/3-540-45808-5_18
Published: 26 September 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44265-3
Online ISBN: 978-3-540-45808-1
eBook Packages: Springer Book Archive
