Speech Coding pp 131-150 | Cite as

Frequency Domain Coding

  • Tom BäckströmEmail author
Part of the Signals and Communication Technology book series (SCT)


Signals which are sufficiently stationary permit highly efficient coding in the frequency domain. Such signals include speech signals such as sustained vowels and prolonged fricatives, as well as generic audio signals such as music and mixed material. The main components of frequency domain coding methods include windowing, a time-frequency transform, perceptual modelling and entropy coding of the spectral components. This chapter gives an overview of such transform domain coding methods.


Speech Signal Discrete Fourier Transform Window Function Critical Sampling Perfect Reconstruction 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


  1. 1.
    3GPP. TS 26.445, EVS Codec Detailed Algorithmic Description; 3GPP Technical Specification (Release 12), 2014Google Scholar
  2. 2.
    Allen, J.: Short-term spectral analysis, and modification by discrete fourier transform. IEEE Trans. Acoust. Speech Signal Process. 25, 235–238 (1977)CrossRefzbMATHGoogle Scholar
  3. 3.
    Bosi, M., Goldberg, R.E.: Introduction to Digital Audio Coding and Standards. Kluwer Academic Publishers, Dordrecht (2003)CrossRefGoogle Scholar
  4. 4.
    Bäckström, T.: Comparison of windowing in speech and audio coding. In: Proceedings of WASPAA, New Paltz, USA (2013)Google Scholar
  5. 5.
    Bäckström, T.: Vandermonde factorization of Toeplitz matrices and applications in filtering and warping. IEEE Trans. Signal Process. 61(24), 6257–6263 (2013)MathSciNetCrossRefGoogle Scholar
  6. 6.
    Bäckström, T., Helmrich, C.R.: Decorrelated innovative codebooks for ACELP using factorization of autocorrelation matrix. In: Proceedings of Interspeech, pp. 2794–2798 (2014)Google Scholar
  7. 7.
    Bäckström, T., Helmrich, C.R.: Arithmetic coding of speech and audio spectra using TCX based on linear predictive spectral envelopes. In: Proceedings of ICASSP, pp. 5127–5131 (2015)Google Scholar
  8. 8.
    Edler, B.: Codierung von audiosignalen mit Überlappender transformation und adaptiven fensterfunktionen. Frequenz 43(9), 252–256 (1989)CrossRefGoogle Scholar
  9. 9.
    Eksler, V., Jelínek, M., Salami, R.: Efficient handling of mode switching and speech transitions in the EVS codec. In: Proceedings of ICASSP, Brisbane, Australia, IEEE (2015)Google Scholar
  10. 10.
    Fischer, T.: A pyramid vector quantizer. IEEE Trans. Inf. Theory, IT-32(4), 568–583 (1986)Google Scholar
  11. 11.
    Fuchs, G., Subbaraman, V., Multrus, M.: Efficient context adaptive entropy coding for real-time applications. In: Proceedings of ICASSP, IEEE, pp. 493–496 (2011)Google Scholar
  12. 12.
    Gersho, A., Gray, R.M.: Vector Quantization and Signal Compression. Springer, Berlin (1992)Google Scholar
  13. 13.
    Harris, F.J.: On the use of windows for harmonic analysis with the discrete fourier transform. Proc. IEEE 66(1), 51–83 (1978)CrossRefGoogle Scholar
  14. 14.
    Huffman, D.A.: A method for the construction of minimum redundancy codes. Proc. IRE 40(9), 1098–1101 (1952)CrossRefzbMATHGoogle Scholar
  15. 15.
    ISO/IEC 23003–3:2012. MPEG-D (MPEG audio technologies), Part 3: Unified speech and audio coding (2012)Google Scholar
  16. 16.
    Malvar, H.S.: Lapped transforms for efficient transform/subband coding. IEEE Trans. Acoust. Speech Signal Process. 38(6), 969–978 (1990)CrossRefGoogle Scholar
  17. 17.
    Malvar, H.S.: Signal Processing with Lapped Transforms. Artech House, Inc. (1992)Google Scholar
  18. 18.
    Mitra, S.K.: Digital Signal Processing: A Computer-Based Approach. McGraw-Hill, New York (1998)Google Scholar
  19. 19.
    Mäkinen, J., Bessette, B., Bruhn, S., Ojala, P., Salami, R., Taleb, A.: AMR-WB+: a new audio coding standard for 3rd generation mobile audio services. Proc. ICASSP 2, 1109–1112 (2005)Google Scholar
  20. 20.
    Rissanen, J., Langdon, G.G.: Arithmetic coding. IBM J. Res. Dev. 23(2), 149–162 (1979)MathSciNetCrossRefzbMATHGoogle Scholar
  21. 21.
    Sanchez, V.E., Adoul, J.-P.: Low-delay wideband speech coding using a new frequency domain approach. In: Proceedings of ICASSP, IEEE, vol. 2, pp. 415–418 (1993)Google Scholar
  22. 22.
    Svedberg, J., Grancharov, V., Sverrisson, S., Norvell, E., Toftgård, T., Pobloth, H., Bruhn, S.: MDCT audio coding with pulse vector quantizers. In: Proceedings of ICASSP, pp. 5937–5941 (2015)Google Scholar
  23. 23.
    Valin, J.-M., Maxwell, G., Terriberry, T.B., Vos, K.: High-quality, low-delay music coding in the OPUS codec. In: Audio Engineering Society Convention 135. Audio Engineering Society (2013)Google Scholar
  24. 24.
    Witten, I.H., Neal, R.M., Cleary, J.G.: Arithmetic coding for data compression. Commun. ACM 30(6), 520–540 (1987)CrossRefGoogle Scholar

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. 1.International Audio Laboratories Erlangen (AudioLabs)Friedrich-Alexander University Erlangen-Nürnberg (FAU)ErlangenGermany

Personalised recommendations