HHT-based audio coding

  • Original Paper
  • Signal, Image and Video Processing

Abstract

This paper introduces a new audio coding scheme that combines the Hilbert transform with the empirical mode decomposition (EMD). Because it is built on the EMD, the coding is fully data-driven. The audio signal is first decomposed adaptively, by EMD, into intrinsic oscillatory components called intrinsic mode functions (IMFs). The key idea is to code both the instantaneous amplitude (IA) and the instantaneous frequency (IF) of the extracted IMFs, computed using the Hilbert transform. Since the IA (resp. IF) samples are strongly correlated, each is encoded via a linear prediction technique. The decoder recovers the original signal by superposing the demodulated IMFs. The proposed approach is applied to audio signals, and the results are compared with those of the advanced audio coding (AAC) and MP3 codecs and of wavelet-based compression. Coding performance is evaluated using the bit rate, the objective difference grade (ODG) and the noise-to-mask ratio (NMR). On the audio signals analyzed, the scheme overall performs better than wavelet compression and the AAC and MP3 codecs. The results also show that the new scheme achieves good coding performance without significant perceptual distortion, with an ODG in the range \([-1,0]\) and large negative NMR values.
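The analysis and synthesis steps named in the abstract can be illustrated on a single oscillatory component. The sketch below is not the authors' codec: it skips the EMD stage and works on one synthetic AM-FM component standing in for an IMF, it uses a small dependency-free DFT in place of a library Hilbert transform, and it replaces the paper's linear predictor (whose order and coefficients are not given here) with a fixed first-order predictor and no quantization. All function names are hypothetical.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) DFT; adequate for the short frames of this sketch."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * ((k * t) % n) / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def analytic_signal(x):
    """Analytic signal: keep DC, double positive bins, zero negative bins."""
    n = len(x)
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0          # Nyquist bin is kept once
        upper = n // 2
    else:
        upper = (n + 1) // 2
    for k in range(1, upper):
        weights[k] = 2.0
    spectrum = dft(x)
    return dft([s * w for s, w in zip(spectrum, weights)], inverse=True)


def ia_and_phase(x):
    """Instantaneous amplitude and unwrapped phase of one component."""
    z = analytic_signal(x)
    ia = [abs(v) for v in z]
    phase = [cmath.phase(z[0])]
    for prev, cur in zip(z, z[1:]):
        # Phase increment taken from the angle between consecutive samples.
        phase.append(phase[-1] + cmath.phase(cur * prev.conjugate()))
    return ia, phase


def instantaneous_frequency(phase, fs):
    """IF in Hz as the scaled first difference of the unwrapped phase."""
    return [(b - a) * fs / (2 * math.pi) for a, b in zip(phase, phase[1:])]


def demodulate(ia, phase):
    """Decoder side: rebuild the component as IA * cos(phase)."""
    return [a * math.cos(p) for a, p in zip(ia, phase)]


def predict_encode(values):
    """Assumed first-order predictor: transmit v[0], then v[n] - v[n-1]."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def predict_decode(residual):
    """Invert predict_encode by accumulating the residuals."""
    out, acc = [], 0.0
    for r in residual:
        acc += r
        out.append(acc)
    return out


if __name__ == "__main__":
    n, fs = 256, 256.0
    # Synthetic AM-FM component standing in for one IMF (an assumption).
    x = [(1.0 + 0.3 * math.cos(2 * math.pi * 2 * t / n))
         * math.cos(2 * math.pi * 20 * t / n) for t in range(n)]
    ia, phase = ia_and_phase(x)
    residual = predict_encode(ia)
    rebuilt = demodulate(predict_decode(residual), phase)
    print("max reconstruction error:",
          max(abs(a - b) for a, b in zip(x, rebuilt)))
    print("IA energy:", sum(a * a for a in ia))
    print("residual energy:", sum(r * r for r in residual[1:]))
```

In a full coder the same analysis would be applied to every IMF, the prediction residuals would be quantized and entropy coded, and the decoder would sum the demodulated components. The residual energy printed by the demo is much smaller than the IA energy, which is the property that makes prediction attractive for slowly varying IA and IF tracks.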



Acknowledgments

The authors are grateful to the anonymous reviewer for supplying useful comments and pointing out the proof-of-concept nature of the paper.

Author information

Corresponding author

Correspondence to Abdel-Ouahab Boudraa.

Additional information

Preliminary results of this work were presented at the IEEE ISCCSP conference, Limassol, Cyprus, 2010.

About this article

Cite this article

Khaldi, K., Boudraa, AO., Torresani, B. et al. HHT-based audio coding. SIViP 9, 107–115 (2015). https://doi.org/10.1007/s11760-013-0433-6

  • DOI: https://doi.org/10.1007/s11760-013-0433-6
