On the Importance of Pre-emphasis and Window Shape in Phase-Based Speech Recognition

  • Conference paper
Advances in Nonlinear Speech Processing (NOLISP 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7911)

Abstract

This paper investigates the potential of the phase spectrum in automatic speech recognition (ASR). We show that the speech phase spectrum could provide features with high discriminability and robustness. Motivated by this, and to exploit more of that potential, we propose two simple amendments to two common blocks of feature extraction, namely pre-emphasis and windowing, without changing the workflow of the algorithms. Recognition tests on Aurora 2 indicate average performance improvements of up to 11.2% and 14.7% in the presence of both additive and convolutional noise for the phase-based MODGDF and CGDF features, respectively. These results demonstrate the high potential of the phase spectrum for robust ASR.
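
To make the two blocks named in the abstract concrete, the following is a minimal sketch of where pre-emphasis and windowing sit in a phase-based front end. It is not the authors' method: the proposed amendments are not described in the abstract, so the sketch uses conventional choices as stated assumptions (a first-order pre-emphasis filter with a coefficient of 0.97 and a Hamming window). It ends with a plain group delay function rather than MODGDF or CGDF, which involve further processing not shown here. All function names are hypothetical.

```python
import cmath
import math


def pre_emphasize(signal, alpha=0.97):
    """First-order pre-emphasis: y[n] = x[n] - alpha * x[n-1], with y[0] = x[0].

    alpha = 0.97 is a common default, assumed here; it is not taken from the paper.
    """
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def hamming(length):
    """Symmetric Hamming window (assumed as the conventional baseline shape)."""
    if length == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
        for n in range(length)
    ]


def dft(frame):
    """Naive O(N^2) DFT, adequate for one short illustrative frame."""
    size = len(frame)
    return [
        sum(frame[n] * cmath.exp(-2j * math.pi * k * n / size) for n in range(size))
        for k in range(size)
    ]


def group_delay(frame, eps=1e-12):
    """Group delay in samples: (X_R*Y_R + X_I*Y_I) / |X|^2, where Y = DFT(n * x[n]).

    eps guards against division by zero at spectral nulls.
    """
    x_spec = dft(frame)
    y_spec = dft([n * value for n, value in enumerate(frame)])
    return [
        (x.real * y.real + x.imag * y.imag) / max(abs(x) ** 2, eps)
        for x, y in zip(x_spec, y_spec)
    ]


def phase_features(frame, alpha=0.97):
    """Pre-emphasis, then windowing, then group delay, for a single frame."""
    emphasized = pre_emphasize(frame, alpha)
    window = hamming(len(emphasized))
    windowed = [s * w for s, w in zip(emphasized, window)]
    return group_delay(windowed)
```

A quick sanity check for the group delay step is a unit impulse delayed by three samples, for which `group_delay` returns a value of 3 in every frequency bin. Because both pre-emphasis and the window reshape the frame before the transform, they change the group delay values directly, which is why these two blocks are natural places to intervene in a phase-based pipeline.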

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Loweimi, E., Ahadi, S.M., Drugman, T., Loveymi, S. (2013). On the Importance of Pre-emphasis and Window Shape in Phase-Based Speech Recognition. In: Drugman, T., Dutoit, T. (eds) Advances in Nonlinear Speech Processing. NOLISP 2013. Lecture Notes in Computer Science, vol 7911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38847-7_21

  • DOI: https://doi.org/10.1007/978-3-642-38847-7_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38846-0

  • Online ISBN: 978-3-642-38847-7

  • eBook Packages: Computer Science (R0)
