TRAP-Based Techniques for Recognition of Noisy Speech

  • Conference paper
Text, Speech and Dialogue (TSD 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4629)

Abstract

This paper presents a systematic study of the performance of TempoRAl Patterns (TRAP) based features, and of their proposed modifications and combinations, for speech recognition in noisy environments. The experimental results are obtained on the AURORA 2 database with clean training data. We observed that the performance of the different TRAP modifications depends strongly on the noise level. Previously proposed TRAP system modifications help in clean conditions but degrade system performance in the presence of noise. The combination techniques, on the other hand, can bring large improvements under weak noise and degrade performance only slightly under strong noise. The vector concatenation combination technique improves system performance up to strong noise levels.
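As a rough illustration of two ideas named in the abstract, the sketch below shows a temporal pattern taken from a single frequency band and a frame-by-frame vector concatenation of two feature streams. It is not the authors' implementation: the function names, the edge-padding choice, and the `context` parameter are assumptions made for this example, and the abstract does not state the window length or the feature dimensions actually used.

```python
from typing import List

Frame = List[float]


def band_trajectory(band_energies: List[float], center: int, context: int) -> List[float]:
    """Return the energies of one band in a window of +/- `context` frames
    around `center`. Edges are padded by repeating the first or last frame
    (an assumption of this sketch, not something taken from the paper)."""
    last = len(band_energies) - 1
    return [
        band_energies[min(max(t, 0), last)]
        for t in range(center - context, center + context + 1)
    ]


def concatenate_streams(stream_a: List[Frame], stream_b: List[Frame]) -> List[Frame]:
    """Combine two feature streams by joining their vectors frame by frame."""
    if len(stream_a) != len(stream_b):
        raise ValueError("streams must have the same number of frames")
    return [a + b for a, b in zip(stream_a, stream_b)]


if __name__ == "__main__":
    # Hypothetical toy data: four frames of energy in one band.
    energies = [0.0, 1.0, 2.0, 3.0]
    print(band_trajectory(energies, center=1, context=1))

    # Two hypothetical streams with two frames each.
    stream_a = [[1.0, 2.0], [3.0, 4.0]]
    stream_b = [[5.0], [6.0]]
    print(concatenate_streams(stream_a, stream_b))
```

Concatenation leaves both streams intact and lets a single downstream classifier weigh them, which is one plausible reading of why such a combination could stay useful as noise increases; the paper itself should be consulted for the actual setup.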

This work was partly supported by the European project Caretaker (FP6-027231), by the Grant Agency of the Czech Republic under project No. 102/05/0278, and by the Czech Ministry of Education under project No. MSM0021630528. The hardware used in this work was partially provided by CESNET under projects No. 119/2004, No. 162/2005 and No. 201/2006.



Editor information

Václav Matoušek, Pavel Mautner

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Grézl, F., Černocký, J. (2007). TRAP-Based Techniques for Recognition of Noisy Speech. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2007. Lecture Notes in Computer Science, vol 4629. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74628-7_36

  • DOI: https://doi.org/10.1007/978-3-540-74628-7_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74627-0

  • Online ISBN: 978-3-540-74628-7

  • eBook Packages: Computer Science, Computer Science (R0)
