Speech Coders

Lefebvre, Roch; Gournay, Philippe

doi:10.1007/978-0-387-30441-0_31


Telecommunication systems make intensive use of speech coders. In wireless systems, where bandwidth is limited, speech coders provide one of the enabling technologies to reach more users and furnish better services. In wireline systems, where bandwidth can be less of an issue, speech is also digitized and compressed to a certain extent depending on the system.



Author information

Authors and Affiliations

Université de Sherbrooke, Sherbrooke, QC, Canada
Roch Lefebvre & Philippe Gournay


Editor information

Editors and Affiliations

National Research Council Institute for Microstructural Sciences, Acoustics and Signal Processing Group, 1200 Montreal Road, Ottawa, ON K1A 0R6, Canada
David Havelock
Department of Environmental Psychology, Osaka University Graduate School of Human Sciences, 1-2 Yamadaoka, Suita, Osaka, Japan
Sonoko Kuwano
Institute of Technical Acoustics, RWTH Aachen University, Aachen, Germany
Michael Vorländer


About this chapter

Cite this chapter

Lefebvre, R., Gournay, P. (2008). Speech Coders. In: Havelock, D., Kuwano, S., Vorländer, M. (eds) Handbook of Signal Processing in Acoustics. Springer, New York, NY. https://doi.org/10.1007/978-0-387-30441-0_31


DOI: https://doi.org/10.1007/978-0-387-30441-0_31
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-77698-9
Online ISBN: 978-0-387-30441-0
eBook Packages: Physics and Astronomy (R0)
