Advanced Methods for Glottal Wave Extraction

  • Conference paper
Nonlinear Analyses and Algorithms for Speech Processing (NOLISP 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3817)

Abstract

Glottal inverse filtering is a technique used to derive the glottal waveform during voiced speech. Closed-phase inverse filtering (CPIF) is a common approach for achieving this goal. During the closed phase there is no input to the vocal tract, and hence the impulse response of the vocal tract can be determined through linear prediction. However, a number of problems are known to exist with the CPIF approach. This review paper briefly details the CPIF technique and highlights certain associated theoretical and methodological problems. An overview is then given of advanced methods for inverse filtering: model-based, adaptive iterative, higher-order statistics, and cepstral approaches are examined. The advantages and disadvantages of these methods are highlighted. Outstanding issues and suggestions for further work are outlined.
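
The closed-phase idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes NumPy, a toy two-pole "vocal tract" excited by a single impulse, and closed-phase boundaries that are already known (in practice these have to be estimated, which is one of the known difficulties). The function names `covariance_lpc`, `inverse_filter`, and `synth_resonator` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def covariance_lpc(frame, order):
    """Covariance-method linear prediction over a short frame.

    Returns a = [1, a1, ..., ap] such that the prediction error is
    e[n] = sum_k a[k] * s[n - k].
    """
    frame = np.asarray(frame, dtype=float)
    n = np.arange(order, len(frame))
    # Each row holds the `order` past samples s[n-1], ..., s[n-order].
    past = np.column_stack([frame[n - k] for k in range(1, order + 1)])
    coeffs, *_ = np.linalg.lstsq(past, frame[n], rcond=None)
    return np.concatenate(([1.0], -coeffs))

def inverse_filter(signal, a):
    """Apply the FIR inverse filter A(z) to the signal."""
    return np.convolve(signal, a)[:len(signal)]

def synth_resonator(excitation, a):
    """All-pole synthesis 1/A(z), used here only to build a toy signal."""
    out = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * out[n - k]
        out[n] = acc
    return out

# Toy vocal tract: one resonance (assumed values, not from the paper).
r, theta = 0.95, 0.3 * np.pi
a_true = np.array([1.0, -2 * r * np.cos(theta), r * r])

# A single impulse at n = 0: every later sample has zero input,
# which plays the role of the closed phase.
excitation = np.zeros(80)
excitation[0] = 1.0
speech = synth_resonator(excitation, a_true)

# Estimate the filter from samples inside the (assumed known) closed phase,
# then inverse filter the whole signal to recover the excitation.
a_est = covariance_lpc(speech[5:40], order=2)
residual = inverse_filter(speech, a_est)
```

Because the analysis frame contains no excitation, the least-squares fit recovers the resonator coefficients and the residual reduces to the original impulse. With real speech the closed phase is short, may be incomplete, and its boundaries are uncertain, which is what motivates the alternative methods the paper reviews.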




Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Walker, J., Murphy, P. (2006). Advanced Methods for Glottal Wave Extraction. In: Faundez-Zanuy, M., Janer, L., Esposito, A., Satue-Villar, A., Roure, J., Espinosa-Duro, V. (eds) Nonlinear Analyses and Algorithms for Speech Processing. NOLISP 2005. Lecture Notes in Computer Science, vol 3817. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11613107_12

  • DOI: https://doi.org/10.1007/11613107_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31257-4

  • Online ISBN: 978-3-540-32586-4

  • eBook Packages: Computer Science (R0)
