
Speech Coding, pp. 151–160

Bandwidth Extension

  • Sascha Disch
  • Tom Bäckström
Chapter
Part of the Signals and Communication Technology book series (SCT)

Abstract

Perceptual audio coding at low bit rates often relies on semi-parametric or parametric techniques to transmit and restore audio content efficiently. At the receiver, the restored signal may differ considerably from the original in its waveform while remaining perceptually very close to it. Audio bandwidth extension exploits the limited resolution of human auditory perception at high frequencies: it recreates a spectral high band from the transmitted spectral low band and post-processing parameters. The result elicits the sensation of plausible high-frequency content that fuses perceptually with the low band into a coherent broadband audio impression. This chapter details the underlying ideas, design criteria, perceptual trade-offs and signal processing techniques found in contemporary low-bit-rate audio codecs that use audio bandwidth extension.
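
As a rough illustration of the idea in the abstract, the sketch below rebuilds a high band by copying tiles of low-band spectral coefficients upward and rescaling each tile to a transmitted band energy. It is a toy example under stated assumptions, not the method of any particular codec or of this chapter: the function name `extend_bandwidth`, the parameter names, and the choice of per-band energies as the only side information are all hypothetical.

```python
import math


def extend_bandwidth(low_band, target_energies, band_size):
    """Toy copy-up bandwidth extension (hypothetical helper, not a codec API).

    low_band        -- decoded low-band spectral coefficients (list of floats)
    target_energies -- assumed side information: one energy per high-band tile
    band_size       -- number of coefficients per high-band tile
    """
    if band_size <= 0 or band_size > len(low_band):
        raise ValueError("band_size must be between 1 and len(low_band)")

    # Number of valid start positions for a tile inside the low band.
    positions = len(low_band) - band_size + 1

    high_band = []
    for i, target in enumerate(target_energies):
        # Pick a source tile from the low band, wrapping around if needed.
        start = (i * band_size) % positions
        tile = low_band[start:start + band_size]

        # Scale the tile so its energy matches the transmitted energy.
        energy = sum(c * c for c in tile)
        gain = math.sqrt(target / energy) if energy > 0.0 else 0.0
        high_band.extend(gain * c for c in tile)

    # The low band is kept untouched; the synthetic high band is appended.
    return list(low_band) + high_band


if __name__ == "__main__":
    spectrum = extend_bandwidth([1.0, 2.0, 3.0, 4.0], [5.0, 1.0], band_size=2)
    print(spectrum)
```

The fine structure of the high band comes entirely from the low band, so only the coarse envelope (here, two energy values) has to be transmitted. The waveform of the result generally differs from the original, which is the trade-off the abstract describes.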

Keywords

Side information, Spectral envelope, Temporal envelope, Audio code, Core coder
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. International Audio Laboratories Erlangen (AudioLabs), Friedrich-Alexander University Erlangen-Nürnberg (FAU), Erlangen, Germany
