Telecommunication systems make intensive use of speech coders. In wireless systems, where bandwidth is limited, speech coders provide one of the enabling technologies to reach more users and furnish better services. In wireline systems, where bandwidth can be less of an issue, speech is also digitized and compressed to a certain extent depending on the system.

Copyright information

© 2008 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Lefebvre, R., Gournay, P. (2008). Speech Coders. In: Havelock, D., Kuwano, S., Vorländer, M. (eds) Handbook of Signal Processing in Acoustics. Springer, New York, NY. https://doi.org/10.1007/978-0-387-30441-0_31
