Abstract
This study presents chirp group delay processing techniques for spectral analysis of speech signals. Group delay processing is known to be potentially very useful for spectral analysis of speech, but it is also well known to be difficult because large spikes mask the formant structure. In this chapter, we first discuss the source of the spikes in group delay functions, namely zeros located close to the unit circle. We then propose processing chirp group delay functions, i.e. group delay functions computed on a circle other than the unit circle in the z-plane. Chirp group delay functions can be guaranteed to be spike-free if the zero locations can be controlled. To control them, we compute the zero-phased version of the signal, whose zeros appear very close to (or on) the unit circle. The resulting representation is named the chirp group delay of the zero-phased version of a signal (CGDZP). We demonstrate the use of CGDZP in two applications: formant tracking and feature extraction for automatic speech recognition (ASR). We show that high-quality formant tracking can be performed simply by picking peaks on CGDZP, and that CGDZP is potentially useful for improving ASR performance.
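The two steps named in the abstract (zero-phasing a frame, then evaluating group delay on a circle of radius rho instead of the unit circle) can be illustrated with a minimal NumPy sketch. It is not the chapter's implementation: the function names `chirp_group_delay` and `cgdzp`, the default radius `rho=1.12`, the FFT size, and the choice to treat the zero-phased sequence as starting at n = 0 are all assumptions made for illustration. The sketch relies on two standard identities: evaluating the z-transform on |z| = rho equals the DFT of x[n]·rho^(-n), and the group delay of a sequence equals Re{DFT(n·x[n]) / DFT(x[n])}.

```python
import numpy as np


def chirp_group_delay(x, rho=1.12, nfft=1024):
    """Group delay of x evaluated on the circle |z| = rho.

    rho = 1 gives the ordinary (unit-circle) group delay.
    The default rho is an illustrative assumption.
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x), dtype=float)
    # X(rho * e^{jw}) = sum_n x[n] * rho^{-n} * e^{-jwn},
    # i.e. an ordinary DFT of the exponentially weighted sequence.
    xw = x * rho ** (-n)
    X = np.fft.fft(xw, nfft)
    Y = np.fft.fft(n * xw, nfft)
    # -d(arg X)/dw = Re(Y / X); this avoids explicit phase unwrapping.
    return np.real(Y / X)


def cgdzp(frame, rho=1.12, nfft=1024):
    """Chirp group delay of the zero-phased version of a frame (sketch)."""
    # Zero-phased version: keep the magnitude spectrum, discard the phase.
    magnitude = np.abs(np.fft.fft(frame, nfft))
    x_zero_phase = np.real(np.fft.ifft(magnitude))
    # Assumption: the result is indexed from n = 0 when weighted by rho^{-n}.
    return chirp_group_delay(x_zero_phase, rho, nfft)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(256) * np.hanning(256)  # stand-in for a speech frame
    print(cgdzp(frame).shape)
```

A quick sanity check of the first function is a delayed unit impulse: its group delay equals the delay at every frequency, on any circle. A two-tap sequence with a zero exactly on the unit circle, such as [1, -1], has an undefined unit-circle group delay at that frequency but stays finite on a circle with rho > 1, which is the effect the abstract describes.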
Copyright information
© 2007 Springer Berlin Heidelberg
About this chapter
Cite this chapter
Bozkurt, B., Dutoit, T., Couvreur, L. (2007). Spectral Analysis of Speech Signals Using Chirp Group Delay. In: Stylianou, Y., Faundez-Zanuy, M., Esposito, A. (eds) Progress in Nonlinear Speech Processing. Lecture Notes in Computer Science, vol 4391. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71505-4_3
DOI: https://doi.org/10.1007/978-3-540-71505-4_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71503-0
Online ISBN: 978-3-540-71505-4
eBook Packages: Computer Science, Computer Science (R0)