Error Concealment

Part of the book series: Advances in Pattern Recognition (ACVPR)

In distributed and network speech recognition, the actual recognition task is not carried out on the user’s terminal but on a remote server in the network. While there are good reasons for doing so, a disadvantage of this client-server architecture is that the communication medium may introduce errors, which then impair speech recognition accuracy. Even sophisticated channel coding cannot completely prevent residual bit errors under temporarily adverse channel conditions, and in packet-oriented transmission, packets of data may arrive too late for the given real-time constraints and have to be declared lost. The goal of error concealment is to reduce the detrimental effect that such errors have on the recipient of the transmitted speech signal by exploiting residual redundancy in the bit stream at the source coder output. In classical speech transmission, a human is the recipient, and erroneous data are reconstructed so as to reduce the subjectively annoying effect of corrupted bits or lost packets. Here, however, a statistical classifier is at the receiving end, and it can benefit from knowledge about the quality of the reconstruction. In this chapter we show how the classical Bayesian decision rule needs to be modified to account for uncertain features, and illustrate how the required feature posterior density can be estimated in the case of distributed speech recognition. Some other techniques for error concealment can be related to this approach. Experimental results are given for both a small and a medium vocabulary recognition task, and for both a channel exhibiting bit errors and a packet erasure channel.
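
The abstract does not give the modified decision rule itself, so the following is only a minimal sketch of one common way such a rule is realised, not the chapter's own formulation. It assumes (these are assumptions, not statements from the text) that each state's emission density is a diagonal Gaussian, that the feature posterior delivered by the concealment stage is also Gaussian, and that the feature prior term can be ignored. Under those assumptions, integrating over the uncertain feature amounts to evaluating the state Gaussian at the posterior mean with the posterior variance added to the state variance. All function and variable names are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a one-dimensional Gaussian N(x; mean, var)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def uncertain_log_likelihood(feat_mean, feat_var, state_mean, state_var):
    """Log-likelihood of one diagonal-Gaussian state for an uncertain feature.

    Assumption: the feature posterior in dimension d is N(feat_mean[d], feat_var[d]).
    Integrating it against the state Gaussian (feature prior ignored) gives a
    Gaussian evaluated at feat_mean[d] with variance state_var[d] + feat_var[d].
    """
    return sum(
        log_gauss(m, mu, s2 + v)
        for m, v, mu, s2 in zip(feat_mean, feat_var, state_mean, state_var)
    )


def classify(feat_mean, feat_var, states):
    """Pick the state name with the highest uncertainty-aware log-likelihood.

    `states` maps a name to a (mean_vector, variance_vector) pair.
    """
    return max(
        states,
        key=lambda name: uncertain_log_likelihood(feat_mean, feat_var, *states[name]),
    )


# Hypothetical two-state example with one feature dimension.
states = {"a": ([0.0], [0.1]), "b": ([2.0], [4.0])}

# A reliably received feature (zero posterior variance) versus a heavily
# concealed one (large posterior variance) with the same mean.
reliable = classify([0.5], [0.0], states)
concealed = classify([0.5], [10.0], states)
```

With zero posterior variance the score reduces to the ordinary Gaussian likelihood. As the posterior variance grows, the scores of all states move closer together, so an unreliable frame contributes less to the decision.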

Copyright information

© 2008 Springer-Verlag London Limited

About this chapter

Cite this chapter

Haeb-Umbach, R., Ion, V. (2008). Error Concealment. In: Automatic Speech Recognition on Mobile Devices and over Communication Networks. Advances in Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-84800-143-5_9

  • DOI: https://doi.org/10.1007/978-1-84800-143-5_9

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84800-142-8

  • Online ISBN: 978-1-84800-143-5

  • eBook Packages: Computer Science, Computer Science (R0)