Abstract
This paper proposes a quality-aware loss-robust scalable speech streaming (QLSSS) method to improve the perceived speech quality (PSQ) of a scalable wideband speech streaming (SWSS) system over IP networks. The proposed method first estimates the PSQ and the packet loss rate (PLR) from the received speech data. It then decides how much redundant speech data (RSD) to transmit so that the speech decoder can reconstruct lost speech signals when the PLR is high. Based on this decision, the method selects the scalable speech coding modes for the current speech data (CSD) and RSD bitstreams so that speech quality is not degraded under the estimated packet loss condition while the transmission bandwidth is maintained. The effectiveness of the proposed method is demonstrated using ITU-T Recommendation G.729.1 as the scalable wideband speech codec and ITU-T Recommendation P.563 as the PSQ estimator. Experiments show that an SWSS system employing the proposed QLSSS method significantly improves speech quality under packet loss conditions.
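The decision step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no thresholds or selection rule, so the function names, the PLR and MOS thresholds, the tiered RSD rates, and the 32 kbit/s budget below are all assumptions made for illustration. Only the set of G.729.1 bit rates (8 and 12 kbit/s, then 2 kbit/s steps up to 32 kbit/s) comes from the codec specification.

```python
# G.729.1 bit rates in kbit/s: 8, 12, then 2 kbit/s steps up to 32.
G7291_RATES = [8, 12] + list(range(14, 33, 2))


def estimate_plr(received_flags):
    """Packet loss rate over a window; each flag is True if the packet arrived."""
    if not received_flags:
        return 0.0
    return received_flags.count(False) / len(received_flags)


def select_rsd_rate(plr, mos, plr_low=0.05, plr_high=0.15, mos_floor=3.5):
    """Pick an RSD bit rate (kbit/s). All thresholds are hypothetical."""
    if plr < plr_low and mos >= mos_floor:
        return 0   # channel looks clean: send no redundancy
    if plr < plr_high:
        return 8   # moderate loss: core-layer redundancy
    return 12      # heavy loss: richer redundancy


def select_modes(plr, mos, budget_kbps=32):
    """Return (csd_rate, rsd_rate) whose sum stays within the fixed budget."""
    if budget_kbps < G7291_RATES[0]:
        raise ValueError("budget is below the lowest G.729.1 bit rate")
    rsd = select_rsd_rate(plr, mos)
    if budget_kbps - rsd < G7291_RATES[0]:
        rsd = 0    # not enough room for redundancy plus a core-layer CSD
    # CSD takes the highest scalable mode that fits in the remaining budget.
    csd = max(r for r in G7291_RATES if r <= budget_kbps - rsd)
    return csd, rsd


if __name__ == "__main__":
    window = [True, True, False, True, True, True, True, True, True, False]
    plr = estimate_plr(window)           # 2 of 10 packets lost
    print(select_modes(plr, mos=2.8))    # estimated MOS value is made up
```

In this sketch the total bit rate never exceeds the budget: redundancy is paid for by dropping enhancement layers of the CSD stream, which is one way to read "maintain the transmission bandwidth". In a real system the MOS input would come from a P.563-style single-ended estimator rather than a constant.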
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kang, J.A., Choi, S.H., Kim, H.K. (2011). Quality-Aware Loss-Robust Scalable Speech Streaming Based on Speech Quality Estimation. In: Kim, Th., et al. Communication and Networking. FGCN 2011. Communications in Computer and Information Science, vol 266. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27201-1_16
DOI: https://doi.org/10.1007/978-3-642-27201-1_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27200-4
Online ISBN: 978-3-642-27201-1
eBook Packages: Computer Science, Computer Science (R0)