TRAP-Based Techniques for Recognition of Noisy Speech

Grézl, František; Černocký, Jan

doi:10.1007/978-3-540-74628-7_36

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4629)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

This paper presents a systematic study of the performance of TempoRAl Patterns (TRAP) based features, together with their proposed modifications and combinations, for speech recognition in noisy environments. The experimental results are obtained on the AURORA 2 database with clean training data. We observed that the performance of the different TRAP modifications depends strongly on the noise level. Previously proposed TRAP system modifications help in clean conditions but degrade system performance in the presence of noise. The combination techniques, on the other hand, can bring large improvements under weak noise and degrade performance only slightly under strong noise. The vector concatenation combination technique improves system performance up to strong noise levels.

This work was partly supported by European projects Caretaker (FP6-027231), by the Grant Agency of the Czech Republic under project No. 102/05/0278 and by the Czech Ministry of Education under project No. MSM0021630528. The hardware used in this work was partially provided by CESNET under projects No. 119/2004, No. 162/2005 and No. 201/2006.

Author information

Authors and Affiliations

Speech@FIT, Brno University of Technology, Czech Republic
František Grézl & Jan Černocký

Editor information

Václav Matoušek, Pavel Mautner

About this paper

Cite this paper

Grézl, F., Černocký, J. (2007). TRAP-Based Techniques for Recognition of Noisy Speech. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2007. Lecture Notes in Computer Science, vol 4629. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74628-7_36

DOI: https://doi.org/10.1007/978-3-540-74628-7_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74627-0
Online ISBN: 978-3-540-74628-7
eBook Packages: Computer Science, Computer Science (R0)
