Real-Time Perceptual Coding of Wideband Speech by Competitive Neural Networks

  • Conference paper
Neural Nets (WIRN 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2486)

Abstract

We developed a real-time wideband speech codec based on a wavelet packet transform. Coding proceeded in two steps: the transform-domain coefficients were first quantized by a mid-tread uniform quantizer and then encoded with an arithmetic coder. In the first step, the quantization of the wavelet coefficients was driven by a psychoacoustic model. In the second step, a competitive neural network adapted the probability model of the quantized coefficients frame by frame. The neural network was trained on the TIMIT corpus, and its weights were updated in real time during compression to better model the speech characteristics of the current speaker. The coding/decoding algorithm was first written in C and then optimised for the TMS320C6000 DSP platform.
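The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives neither the rule by which the psychoacoustic model sets the step size nor the network topology, so the `step` and `rate` parameters, the function names, and the use of per-frame symbol histograms as network inputs are all assumptions made for illustration.

```python
import math


def midtread_quantize(x, step):
    """Return the integer index of the multiple of `step` nearest to x.

    A mid-tread quantizer has zero as a reconstruction level, so small
    coefficients map to index 0. `step` is assumed to be supplied by a
    psychoacoustic model (not shown here).
    """
    return int(math.floor(x / step + 0.5))


def midtread_dequantize(index, step):
    """Reconstruct the coefficient value from its quantizer index."""
    return index * step


def competitive_update(prototypes, observed, rate):
    """Winner-take-all update (hypothetical form of the adaptation step).

    `prototypes` is a list of probability vectors, `observed` is the
    normalised symbol histogram of the current frame. The closest
    prototype moves toward `observed` by a fraction `rate` in [0, 1];
    a convex combination of two probability vectors still sums to 1.
    Returns the index of the winning prototype.
    """
    def dist2(p):
        return sum((a - b) ** 2 for a, b in zip(p, observed))

    winner = min(range(len(prototypes)), key=lambda i: dist2(prototypes[i]))
    prototypes[winner] = [
        a + rate * (b - a) for a, b in zip(prototypes[winner], observed)
    ]
    return winner
```

In a codec of this kind, the winning prototype would presumably serve as the symbol probability table handed to the arithmetic coder for that frame. The decoder can follow the same update only if it is driven by information available on both sides, a detail the abstract does not specify.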

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pasero, E., Montuori, A. (2002). Real-Time Perceptual Coding of Wideband Speech by Competitive Neural Networks. In: Marinaro, M., Tagliaferri, R. (eds) Neural Nets. WIRN 2002. Lecture Notes in Computer Science, vol 2486. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45808-5_18

  • DOI: https://doi.org/10.1007/3-540-45808-5_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44265-3

  • Online ISBN: 978-3-540-45808-1

  • eBook Packages: Springer Book Archive
