Quality-Aware Loss-Robust Scalable Speech Streaming Based on Speech Quality Estimation

Kang, Jin Ah; Choi, Seung Ho; Kim, Hong Kook

doi:10.1007/978-3-642-27201-1_16


Part of the book series: Communications in Computer and Information Science (CCIS, volume 266)

Included in the following conference series:

International Conference on Future Generation Communication and Networking


Abstract

This paper proposes a quality-aware loss-robust scalable speech streaming (QLSSS) method to improve the perceived speech quality (PSQ) of a scalable wideband speech streaming (SWSS) system over IP networks. To this end, the proposed method estimates the PSQ and the packet loss rate (PLR) from the received speech data. It then decides the amount of redundant speech data (RSD) that a speech decoder can use to reconstruct lost speech signals at high PLRs. Based on this decision, the method optimizes the scalable speech coding mode for the current speech data (CSD) and RSD bitstreams so that speech quality does not degrade under the estimated packet loss condition while the transmission bandwidth is maintained. The effectiveness of the proposed method is demonstrated using ITU-T Recommendations G.729.1 and P.563 as the scalable wideband speech codec and the PSQ estimator, respectively. Experiments show that an SWSS system employing the proposed QLSSS method significantly improves speech quality under packet loss conditions.
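
The decision step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the thresholds, the 32 kbit/s budget, and the rule for splitting that budget between CSD and RSD are all assumptions made for illustration. Only the set of G.729.1 bitrates (8 and 12 kbit/s, then 14 to 32 kbit/s in 2 kbit/s steps) comes from the codec specification. The PSQ value is assumed to be a MOS-like score supplied by an external estimator such as P.563.

```python
# G.729.1 layered bitrates in kbit/s: 8, 12, then 14..32 in 2 kbit/s steps.
G7291_RATES = [8, 12] + list(range(14, 33, 2))


def estimate_plr(received_seq, expected_count):
    """Return the fraction of expected packets that never arrived.

    received_seq: sequence numbers seen at the receiver (duplicates ignored).
    expected_count: number of packets the sender emitted in the same window.
    """
    if expected_count <= 0:
        return 0.0
    return 1.0 - len(set(received_seq)) / expected_count


def choose_modes(plr, psq_mos, budget_kbps=32,
                 plr_threshold=0.05, mos_threshold=3.0):
    """Return (csd_rate, rsd_rate) in kbit/s with csd_rate + rsd_rate <= budget.

    The thresholds and the split rule are hypothetical, not values from
    the paper. budget_kbps must be at least 8, the lowest G.729.1 rate.
    """
    best_without_rsd = max(r for r in G7291_RATES if r <= budget_kbps)
    if plr < plr_threshold and psq_mos >= mos_threshold:
        # Channel looks good: spend the whole budget on current speech data.
        return best_without_rsd, 0
    # Channel looks lossy: reserve part of the budget for a redundant copy,
    # and use a larger redundant layer once the loss rate doubles.
    rsd = 8 if plr < 2 * plr_threshold else 12
    csd_candidates = [r for r in G7291_RATES if r + rsd <= budget_kbps]
    if not csd_candidates:
        # Budget too small to carry any redundancy.
        return best_without_rsd, 0
    return max(csd_candidates), rsd
```

In this sketch the total rate never exceeds the budget, which mirrors the abstract's goal of protecting quality under loss while maintaining the transmission bandwidth: as the estimated loss grows, bits move from the CSD layers to the RSD.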


References

Wu, C.-F., Lee, C.-L., Chang, W.-W.: Perceptual-based playout mechanisms for multi-stream voice over IP networks. In: Proceedings of Annual Conference of the International Speech Communication Association (Interspeech), Antwerp, Belgium, pp. 1673–1676 (2007)
Zhang, Q., Wang, G., Xiong, Z., Zhou, J., Zhu, W.: Error robust scalable audio streaming over wireless IP networks. IEEE Transactions on Multimedia 6(6), 897–909 (2004)
Park, N.I., Kim, H.K., Jung, M.A., Lee, S.R., Choi, S.H.: A packet loss concealment algorithm robust to burst packet loss using multiple codebooks and comfort noise for CELP-type speech coders. CCIS, vol. 120, pp. 138–147 (2010)
Bolot, J.-C., Fosse-Parisis, S., Towsley, D.: Adaptive FEC-based error control for Internet telephony. In: Proceedings of IEEE International Conference on Computer Communications (INFOCOM), New York, NY, pp. 1453–1460 (1999)
Jiang, W., Schulzrinne, H.: Comparison and optimization of packet loss repair methods on VoIP perceived quality under bursty loss. In: Proceedings of 12th International Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV), Miami, FL, pp. 73–81 (2002)
Yung, C., Fu, H., Tsui, C., Cheng, R.S., George, D.: Unequal error protection for wireless transmission of MPEG audio. In: Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS), Orlando, FL, vol. 6, pp. 342–345 (1999)
Hagenauer, J., Stockhammer, T.: Channel coding and transmission aspects for wireless multimedia. Proceedings of the IEEE 87(10), 1764–1777 (1999)
Ito, A., Konno, K., Makino, S.: Packet loss concealment for MDCT-based audio codec using correlation-based side information. International Journal of Innovative Computing, Information and Control 6(3B), 1347–1361 (2010)
ITU-T Recommendation G.729.1: An 8-32 kbit/s Scalable Wideband Coder Bitstream Interoperable with G.729 (2006)
ITU-T Recommendation P.563: Single-Ended Method for Objective Audio Quality Assessment in Narrow-Band Telephony Applications (2004)
Bessette, B., Salami, R., Lefebvre, R., Jelinek, M., Rotola-Pukkila, J., Vainio, J., Mikkola, H., Jarvinen, K.: The adaptive multirate wideband speech codec (AMR-WB). IEEE Transactions on Speech and Audio Processing 10(8), 620–636 (2002)
IETF RFC 4749: RTP Payload Format for the G.729.1 Audio Codec (2006)
IETF RFC 1889: RTP: A Transport Protocol for Real-Time Applications (1996)
NTT-AT: Multi-Lingual Speech Database for Telephonometry (1994)
ITU-T Recommendation G.191: Software Tools for Speech and Audio Coding Standardization (1996)
ITU-T Recommendation P.862: Perceptual Evaluation of Speech Quality (PESQ), an Objective Method for End-to-End Speech Quality Assessment of Narrowband Telephone Networks and Speech Codecs (2001)


Author information

Authors and Affiliations

School of Information and Communications, Gwangju Institute of Science and Technology (GIST), Gwangju, 500-712, Korea
Jin Ah Kang & Hong Kook Kim
Department of Electronic and Information Engineering, Seoul National University of Science and Technology, Seoul, 139-743, Korea
Seung Ho Choi


Editor information

Editors and Affiliations

Multimedia Engineering Department, Hannam University, 133 Ojeong-dong, Daeduk-gu, Daejeon, Korea
Tai-hoon Kim
The Ohio State University, 470 Hitchcock Hall, 2070 Neil Avenue, 43210-1275, Columbus, OH, USA
Hojjat Adeli
National Chiao Tung University, Hsinchu, Taiwan, R.O.C.
Wai-chi Fang
University of Western Macedonia, Kozani, Greece
Thanos Vasilakos
Jet Propulsion Laboratory/Caltech, NASA, 4800 Oak Grove Drive, 91109, Pasadena, CA, USA
Adrian Stoica
National Technical University of Athens, Heroon Politechniou 9, Zographou T.K., 157 73, Athens, Greece
Charalampos Z. Patrikakis
The Computing Laboratory, University of Kent, CT2 7NF, Canterbury, UK
Gansen Zhao
Universidad Complutense de Madrid, 28040, Madrid, Spain
Javier García Villalba
The University of Alabama, Tuscaloosa, AL, USA
Yang Xiao


About this paper

Cite this paper

Kang, J.A., Choi, S.H., Kim, H.K. (2011). Quality-Aware Loss-Robust Scalable Speech Streaming Based on Speech Quality Estimation. In: Kim, Th., et al. Communication and Networking. FGCN 2011. Communications in Computer and Information Science, vol 266. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27201-1_16


DOI: https://doi.org/10.1007/978-3-642-27201-1_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27200-4
Online ISBN: 978-3-642-27201-1
eBook Packages: Computer Science, Computer Science (R0)
