How is speech processed in a cell phone conversation?
Although most people see the cell phone as an extension of conventional wired phone service, or POTS (plain old telephone service), cell phone technology is extremely complex and a marvel of engineering. Very few people realize that these small devices perform hundreds of millions of operations per second to sustain a phone conversation. A closer look at the module that converts the electronic version of the speech signal into a sequence of bits shows that, for every 20 ms of input speech, a set of speech model parameters is computed and transmitted to the receiver, which converts these parameters back into speech. In this chapter, we will see how linear predictive (LP) analysis-synthesis lies at the very heart of mobile phone transmission of speech. We start with an introduction to linear predictive speech modeling and follow with a MATLAB-based proof of concept.
Keywords: Vocal Tract, Spectral Envelope, Synthetic Speech, Pitch Period, Inverse Filter
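The frame-by-frame idea described in the abstract can be illustrated with a minimal sketch. It is not the chapter's MATLAB proof of concept: it is a plain-Python illustration of the autocorrelation method with the Levinson-Durbin recursion, and it makes several assumptions that do not come from the text, namely an 8 kHz sampling rate (so that 20 ms is 160 samples), a predictor order of 10, a Hamming analysis window, and a synthetic test frame. All function names are hypothetical. The analysis step yields the prediction coefficients for one frame, the inverse filter yields the residual (excitation), and the synthesis filter rebuilds the frame from that residual.

```python
import math
import random

SAMPLE_RATE = 8000   # Hz; assumed narrowband telephone rate
FRAME_LEN = 160      # samples in one 20 ms frame at 8 kHz
LP_ORDER = 10        # assumed predictor order


def autocorrelation(frame, max_lag):
    """Autocorrelation of one frame for lags 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations from autocorrelation values r.

    Returns (a, err), where A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order
    is the prediction-error (inverse) filter and err is the residual energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:          # silent or degenerate frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err          # reflection coefficient of stage i
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def analyze_frame(frame, order=LP_ORDER):
    """LP analysis of one frame: Hamming window, autocorrelation, recursion."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return levinson_durbin(autocorrelation(windowed, order), order)


def inverse_filter(frame, a):
    """Residual e[n] = sum_k a[k] * x[n-k], assuming zero filter history."""
    order = len(a) - 1
    return [sum(a[k] * frame[n - k] for k in range(min(order, n) + 1))
            for n in range(len(frame))]


def synthesis_filter(excitation, a):
    """All-pole synthesis x[n] = e[n] - sum_{k>=1} a[k] * x[n-k]."""
    order = len(a) - 1
    out = []
    for n, e in enumerate(excitation):
        past = sum(a[k] * out[n - k] for k in range(1, min(order, n) + 1))
        out.append(e - past)
    return out


def make_test_frame(seed=0):
    """Synthetic stand-in for 20 ms of speech: two sinusoids plus weak noise."""
    rng = random.Random(seed)
    return [math.sin(2.0 * math.pi * 200.0 * n / SAMPLE_RATE)
            + 0.5 * math.sin(2.0 * math.pi * 900.0 * n / SAMPLE_RATE)
            + 0.05 * rng.uniform(-1.0, 1.0)
            for n in range(FRAME_LEN)]
```

Here the residual is passed unchanged to the synthesis filter, so the frame is rebuilt almost exactly. A real speech coder would not transmit the residual itself. It would send the coefficients together with a compact description of the excitation, and that substitution is where the bit-rate saving comes from.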