Abstract
Recent advances in speech coding algorithms and techniques based on linear prediction now permit high-quality voice reproduction at remarkably low bit rates. This paper reviews some of the main ideas underlying the algorithms of major interest today. The concept of removing redundancy by linear prediction is reviewed, first in the context of predictive quantization, or DPCM. Then linear predictive coding, adaptive predictive coding, and vector quantization are discussed. The concept of excitation coding via analysis-by-synthesis linear predictive coding is explained, and some important enhancements, such as vector sum excitations and adaptive postfiltering, are described. Low-delay coding by backward computation of LPC parameters is explained. The concept of phonetic segmentation of speech for closed-loop coding systems is also presented. Linear prediction is the key technique that underlies almost all of the important algorithms for speech coding of interest today. Finally, we discuss some recent work on nonlinear prediction of speech and its potential for the future of speech coding.
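To make the idea of removing redundancy by linear prediction concrete, the following is a minimal sketch and not code from the chapter. It assumes the autocorrelation method with the Levinson-Durbin recursion, one common way to obtain predictor coefficients. It uses a synthetic second-order autoregressive signal as a stand-in for speech. All function names and the signal parameters (1.3, -0.6) are illustrative assumptions.

```python
import random


def synth_ar2(n, seed):
    """Synthetic test signal (assumption): x[n] = 1.3 x[n-1] - 0.6 x[n-2] + noise."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n):
        x.append(1.3 * x[-1] - 0.6 * x[-2] + rng.gauss(0.0, 1.0))
    return x[2:]


def autocorrelation(x, max_lag):
    """Return unnormalized autocorrelation sums r[0..max_lag] of x."""
    n = len(x)
    return [sum(x[i] * x[i - k] for i in range(k, n)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for a predictor x_hat[n] = sum_k a[k] x[n-k].

    Returns (a, err): a[0] is unused (0.0), a[1..order] are the predictor
    coefficients, and err is the remaining prediction error energy.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        # Reflection coefficient for stage m.
        k_m = (r[m] - sum(a[k] * r[m - k] for k in range(1, m))) / err
        new_a = a[:]
        new_a[m] = k_m
        for k in range(1, m):
            new_a[k] = a[k] - k_m * a[m - k]
        a = new_a
        err *= 1.0 - k_m * k_m
    return a, err


def prediction_residual(x, a):
    """Subtract the linear prediction from each sample (open loop, no quantizer)."""
    order = len(a) - 1
    return [
        x[n] - sum(a[k] * x[n - k] for k in range(1, order + 1) if n - k >= 0)
        for n in range(len(x))
    ]


def demo(seed=0, n=4000, order=2):
    """Estimate the predictor and return (coefficients, prediction gain)."""
    x = synth_ar2(n, seed)
    a, _ = levinson_durbin(autocorrelation(x, order), order)
    e = prediction_residual(x, a)
    gain = sum(v * v for v in x) / sum(v * v for v in e)
    return a, gain
```

Calling `demo()` returns the estimated coefficients and the ratio of signal energy to residual energy. A ratio above 1 means the residual carries less energy than the signal itself, which is the redundancy a predictive coder exploits. A DPCM coder would additionally place a quantizer inside the prediction loop, which this open-loop sketch omits.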
Copyright information
© 1991 Springer-Verlag Wien
About this chapter
Cite this chapter
Gersho, A. (1991). Linear Prediction Techniques in Speech Coding. In: Davisson, L.D., Longo, G. (eds) Adaptive Signal Processing. International Centre for Mechanical Sciences, vol 324. Springer, Vienna. https://doi.org/10.1007/978-3-7091-2840-4_2
DOI: https://doi.org/10.1007/978-3-7091-2840-4_2
Publisher Name: Springer, Vienna
Print ISBN: 978-3-211-82333-0
Online ISBN: 978-3-7091-2840-4
eBook Packages: Springer Book Archive