Telecommunication systems make intensive use of speech coders. In wireless systems, where bandwidth is limited, speech coders provide one of the enabling technologies to reach more users and furnish better services. In wireline systems, where bandwidth can be less of an issue, speech is also digitized and compressed to a certain extent depending on the system.
© 2008 Springer Science+Business Media, LLC
Cite this chapter
Lefebvre, R., Gournay, P. (2008). Speech Coders. In: Havelock, D., Kuwano, S., Vorländer, M. (eds) Handbook of Signal Processing in Acoustics. Springer, New York, NY. https://doi.org/10.1007/978-0-387-30441-0_31
DOI: https://doi.org/10.1007/978-0-387-30441-0_31
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-77698-9
Online ISBN: 978-0-387-30441-0
eBook Packages: Physics and Astronomy (R0)