Error Concealment

Haeb-Umbach, Reinhold; Ion, Valentin

doi:10.1007/978-1-84800-143-5_9

Chapter

Part of the book series: Advances in Pattern Recognition (ACVPR)

In distributed and network speech recognition, the actual recognition task is carried out not on the user’s terminal but on a remote server in the network. While there are good reasons for doing so, a clear disadvantage of this client-server architecture is that the communication medium may introduce errors, which then impair speech recognition accuracy. Even sophisticated channel coding cannot completely prevent residual bit errors under temporarily adverse channel conditions, and in packet-oriented transmission, packets may arrive too late for the given real-time constraints and have to be declared lost. The goal of error concealment is to reduce the detrimental effect that such errors have on the recipient of the transmitted speech signal by exploiting residual redundancy in the bit stream at the source coder output. In classical speech transmission, the recipient is a human, and erroneous data are reconstructed so as to reduce the subjectively annoying effect of corrupted bits or lost packets. Here, however, a statistical classifier is at the receiving end, and it can benefit from knowledge about the quality of the reconstruction. In this chapter we show how the classical Bayesian decision rule needs to be modified to account for uncertain features, and illustrate how the required feature posterior density can be estimated in the case of distributed speech recognition. Some other error concealment techniques can be related to this approach. Experimental results are given for a small and a medium vocabulary recognition task, each over both a channel exhibiting bit errors and a packet erasure channel.
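
The idea of a decision rule that accounts for uncertain features can be illustrated with a minimal sketch. It is not taken from the chapter: it assumes, purely for illustration, that the receiver describes each feature by a Gaussian posterior (mean and variance), that each class is a diagonal-covariance Gaussian, and that the feature prior is ignored. Under those assumptions, integrating over the unknown clean feature widens the class variance by the posterior variance. The function names and the toy states are hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def uncertain_log_likelihood(post_mean, post_var, state_mean, state_var):
    """Score a state when the feature is known only as a Gaussian posterior.

    Assumption (not from the source): the integral of
    N(x; post_mean, post_var) * N(x; state_mean, state_var) over x equals
    N(post_mean; state_mean, state_var + post_var), and the feature prior
    is neglected.
    """
    widened = [sv + pv for sv, pv in zip(state_var, post_var)]
    return log_gauss_diag(post_mean, state_mean, widened)


def classify(post_mean, post_var, states):
    """Return the name of the state with the highest uncertain likelihood."""
    return max(
        states,
        key=lambda name: uncertain_log_likelihood(post_mean, post_var, *states[name]),
    )


# Hypothetical one-dimensional toy model: name -> (mean, variance).
states = {"a": ([0.0], [1.0]), "b": ([3.0], [1.0])}

# Reliable feature (zero posterior variance): the ordinary decision rule.
reliable_gap = (
    uncertain_log_likelihood([2.0], [0.0], *states["b"])
    - uncertain_log_likelihood([2.0], [0.0], *states["a"])
)

# Unreliable feature (large posterior variance): the states become
# nearly indistinguishable, so this frame barely influences the decision.
uncertain_gap = (
    uncertain_log_likelihood([2.0], [99.0], *states["b"])
    - uncertain_log_likelihood([2.0], [99.0], *states["a"])
)
```

With zero posterior variance the score reduces to the ordinary Gaussian likelihood; as the variance grows, the log-likelihood gap between states shrinks, so a badly corrupted or lost frame contributes little to the classification.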

Author information

Authors and Affiliations

Department of Communications Engineering, University of Paderborn, 33095, Paderborn, Germany
Reinhold Haeb-Umbach & Valentin Ion

About this chapter

Cite this chapter

Haeb-Umbach, R., Ion, V. (2008). Error Concealment. In: Automatic Speech Recognition on Mobile Devices and over Communication Networks. Advances in Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-84800-143-5_9

DOI: https://doi.org/10.1007/978-1-84800-143-5_9
Publisher Name: Springer, London
Print ISBN: 978-1-84800-142-8
Online ISBN: 978-1-84800-143-5
eBook Packages: Computer Science, Computer Science (R0)
