Oscillating Statistical Moments for Speech Polarity Detection

  • Conference paper
Advances in Nonlinear Speech Processing (NOLISP 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7015)

Abstract

Inverting the polarity of a speech signal can severely degrade the performance of various speech processing techniques. An automatic method for determining speech polarity, which depends on the recording setup, is therefore required as a preliminary step to ensure that such techniques behave correctly. This paper proposes a new approach to polarity detection that relies on oscillating statistical moments. These moments oscillate at the local fundamental frequency and exhibit a phase shift that depends on the speech polarity. This dependency stems from the introduction of non-linearity or higher-order statistics in the moment calculation. On 10 speech corpora, the resulting method is shown to provide a substantial improvement over state-of-the-art techniques.
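The polarity dependence described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm, which is not reproduced on this page. It only shows the underlying property that an odd-order sliding moment changes sign when the signal is inverted, while an even-order moment does not. The test signal, the rectangular window, the 2.5-period window length, and the name `sliding_moment` are all assumptions made for illustration.

```python
import numpy as np

def sliding_moment(x, order, win_len):
    """Mean of x**order over a sliding rectangular window (hypothetical helper)."""
    kernel = np.ones(win_len) / win_len
    # mode="same" keeps the output the same length as the input signal.
    return np.convolve(x ** order, kernel, mode="same")

# Assumed test signal: an asymmetric periodic waveform at f0 = 100 Hz.
fs = 16000
f0 = 100.0
t = np.arange(int(0.1 * fs)) / fs
x = np.sin(2 * np.pi * f0 * t) + 0.5 * np.cos(2 * np.pi * 2 * f0 * t)

# Assumed window length: 2.5 pitch periods, so the moments are not
# averaged over a whole number of periods and keep a periodic component.
win_len = int(2.5 * fs / f0)

m3 = sliding_moment(x, 3, win_len)       # odd order
m4 = sliding_moment(x, 4, win_len)       # even order
m3_inv = sliding_moment(-x, 3, win_len)  # same moments on the inverted signal
m4_inv = sliding_moment(-x, 4, win_len)

# Inverting the signal negates the odd-order moment (a half-period phase
# shift of its oscillation) and leaves the even-order moment unchanged.
odd_flips = np.allclose(m3_inv, -m3)
even_same = np.allclose(m4_inv, m4)
print(odd_flips, even_same)
```

Because only the odd-order moment reacts to inversion, comparing it against an even-order moment gives a polarity cue that does not require knowing the original recording. How the paper turns this into a decision rule is not stated in the abstract.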

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Drugman, T., Dutoit, T. (2011). Oscillating Statistical Moments for Speech Polarity Detection. In: Travieso-González, C.M., Alonso-Hernández, J.B. (eds) Advances in Nonlinear Speech Processing. NOLISP 2011. Lecture Notes in Computer Science, vol 7015. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25020-0_7

  • DOI: https://doi.org/10.1007/978-3-642-25020-0_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25019-4

  • Online ISBN: 978-3-642-25020-0

  • eBook Packages: Computer Science (R0)
