Real-Time Perceptual Coding of Wideband Speech by Competitive Neural Networks

  • Eros Pasero
  • Alfonso Montuori
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2486)

Abstract

We developed a real-time wideband speech codec based on wavelet packets. The transform-domain coefficients were first quantized with a mid-tread uniform quantizer and then encoded with an arithmetic coder. In the first step, the quantization of the wavelet coefficients was driven by a psychoacoustic model. In the second step, a competitive neural network adapted the probability model of the quantized coefficients frame by frame. The neural network was trained on the TIMIT corpus, and its weights were updated in real time during compression to better model the speech characteristics of the current speaker. The coding/decoding algorithm was first written in C and then optimised for the TMS320C6000 DSP platform.
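The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function and class names, the fixed step size (which the paper derives from a psychoacoustic model), the use of a per-frame symbol histogram as network input, and the learning rate are all assumptions made for illustration. The sketch shows a mid-tread uniform quantizer and a winner-take-all competitive layer whose prototypes act as candidate probability models for an arithmetic coder.

```python
import numpy as np


def midtread_quantize(coeffs, step):
    """Mid-tread uniform quantizer: zero is one of the reconstruction levels.

    np.round rounds exact halves to the nearest even integer; that choice
    is an implementation detail of this sketch.
    """
    return np.round(np.asarray(coeffs, dtype=float) / step).astype(int)


def midtread_dequantize(indices, step):
    """Map quantizer indices back to reconstruction levels."""
    return np.asarray(indices, dtype=float) * step


def frame_histogram(indices, max_index):
    """Empirical symbol distribution of one frame (hypothetical feature).

    Indices are clipped to [-max_index, max_index] so the alphabet is finite.
    """
    clipped = np.clip(np.asarray(indices), -max_index, max_index)
    counts = np.bincount(clipped + max_index, minlength=2 * max_index + 1)
    return counts / counts.sum()


class CompetitiveLayer:
    """Winner-take-all layer; each row of `weights` is one prototype.

    Here every prototype is read as a probability model over quantizer
    symbols. This interpretation is an assumption of the sketch.
    """

    def __init__(self, prototypes, learning_rate=0.05):
        self.weights = np.array(prototypes, dtype=float)
        self.learning_rate = learning_rate

    def winner(self, x):
        # The unit whose prototype is closest to the input wins.
        distances = np.linalg.norm(self.weights - np.asarray(x), axis=1)
        return int(np.argmin(distances))

    def update(self, x):
        # Only the winning prototype moves toward the input. A convex
        # combination of two distributions is still a distribution.
        k = self.winner(x)
        self.weights[k] += self.learning_rate * (np.asarray(x) - self.weights[k])
        return k


# Usage: quantize one frame, then pick and adapt a probability model.
indices = midtread_quantize([0.04, 0.26, -0.74, 1.0], step=0.5)
layer = CompetitiveLayer([[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]], learning_rate=0.5)
model_id = layer.update(frame_histogram(indices, max_index=1))
```

In a complete codec the decoder would have to track the same model choice, for example by receiving the winner index as side information and applying the same update after decoding each frame; that mechanism is likewise an assumption here, not a detail taken from the paper.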

Keywords

  • Wideband speech
  • Competitive Neural Network

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Eros Pasero
    • 1
  • Alfonso Montuori
    • 1
  1. Dipartimento di Elettronica, Politecnico di Torino, Torino, Italy
