Abstract
This chapter reviews the compression of audio information, with special attention to speech compression. We begin by recalling some of the issues covered in Chap. 6 on digital audio in multimedia, and combine them with techniques that exploit the temporal redundancy present in audio signals. We extend the Pulse Code Modulation (PCM) scheme to Differential PCM (DPCM), briefly introduced in Chap. 6 but fleshed out here, and then to its adaptive variant, Adaptive DPCM (ADPCM). A number of speech coding standards have evolved, and we set these out here, including some of their fundamental strategies. We then go on to study coders (encoding/decoding algorithms) specifically aimed at speech compression: vocoders and the more general speech coders LPC, CELP, MBE, and MELP. The properties of vocoders are examined, including the notions of phase insensitivity, channels, and formants. Next, LPC (Linear Predictive Coding) vocoders are discussed, followed by CELP (Code Excited Linear Prediction), a more complex family of coders. Hybrid excitation vocoders are another large class of speech coders, and we round off the discussion with MBE (Multi-Band Excitation) and MELP (Mixed Excitation Linear Prediction) vocoders.
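As a rough illustration of how differential coding exploits temporal redundancy, the following minimal sketch encodes each sample as a quantized difference from the previously reconstructed sample. It is not the chapter's own formulation: the function names, the uniform quantizer with a fixed `step`, and the choice of the previous reconstruction as the predictor are all assumptions made for the example. An adaptive scheme such as ADPCM would vary the step size over time instead of fixing it.

```python
def dpcm_encode(samples, step=4):
    """Encode integer samples as quantized differences (hypothetical helper).

    The predictor is simply the previously reconstructed sample, so the
    encoder tracks exactly what the decoder will see and quantization
    error does not accumulate.
    """
    codes = []
    prediction = 0  # assumed initial predictor state, shared with the decoder
    for s in samples:
        diff = s - prediction
        code = round(diff / step)      # uniform quantizer with fixed step
        codes.append(code)
        prediction += code * step      # reconstruct as the decoder would
    return codes


def dpcm_decode(codes, step=4):
    """Rebuild samples by accumulating the dequantized differences."""
    samples = []
    prediction = 0
    for c in codes:
        prediction += c * step
        samples.append(prediction)
    return samples


if __name__ == "__main__":
    signal = [100, 102, 105, 103, 98, 97]
    codes = dpcm_encode(signal, step=4)
    print(codes)
    print(dpcm_decode(codes, step=4))
```

After the first sample, the codes are small numbers whenever neighbouring samples are close in value, which is what allows them to be stored in fewer bits than the raw samples. Because the encoder predicts from its own reconstruction, the per-sample error in this sketch stays within half a quantizer step.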
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Li, ZN., Drew, M.S., Liu, J. (2014). Basic Audio Compression Techniques. In: Fundamentals of Multimedia. Texts in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-05290-8_13
DOI: https://doi.org/10.1007/978-3-319-05290-8_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05289-2
Online ISBN: 978-3-319-05290-8
eBook Packages: Computer Science (R0)