Skip to main content
Log in

A blind digital audio watermarking scheme based on EMD and UISA techniques

  • Published:
Multimedia Tools and Applications Aims and scope Submit manuscript

Abstract

In this paper, a new perceptual spread spectrum audio watermarking scheme is discussed. The watermark embedding process is performed in the Empirical Mode Decomposition (EMD) domain, and the hybrid watermark extraction process is based on the combination of EMD and ISA (Independent Subspace Analysis) techniques, followed by the generic detection system, i.e. inverse perceptual filter, predictor filter and correlation based detector. Since the EMD decomposes the audio signal into several oscillating components–the intrinsic mode functions (IMF)–the watermark information can be inserted in more than one IMF, using spread spectrum modulation, allowing hence the increase of the insertion capacity. The imperceptibility of the inserted data is ensured by the use of a psychoacoustical model. The blind extraction of the watermark signal, from the received watermarked audio, consists in the separation of the watermark from the IMFs of the received audio signal. The separation is achieved by a new proposed under-determined ISA method, here referred to as UISA. The proposed hybrid watermarking system was applied to the SQAM (Sound Quality Assessment Material) audio database (Available at http://sound.media.mit.edu/mpeg4/audio/sqam/) and proved to have efficient detection performances in terms of Bit Error Rate (BER) compared to a generic perceptual spread spectrum watermarking system. The perceptual quality of the watermarked audio was objectively assessed using the PEMO-Q (Tool for objective perceptual assessment of audio quality) algorithm. Also, using our technique, we can extract the different watermarks without using any information of original signal or the inserted watermark. Experimental results exhibit that the transparency and high robustness of the watermarked audio can be achieved simultaneously with a substantial increase of the amount of information transmitted. A reliability of 1.8 10 − 4 (against 1.5 10 − 2 for the generic system), for a bit rate of 400 bits/s, can be achieved when the channel is not disturbed.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11

Similar content being viewed by others

References

  1. Adib A, Aboutajdine D (2005) Reference-based blind source separation using a deflation approach. Signal Process 85(10):1943–1949

    Article  MATH  Google Scholar 

  2. Baras C, Moreau N (2006) Controlling the inaudibility and maximizing the robustness in an audio annotation watermarking system. IEEE Trans on Audio, Speech and Language Process 14(5):1772–1782

    Article  Google Scholar 

  3. Bas P, Lienard J, Chassery JM, Beautemps D, Bailly G (2004) Artus: animation réaliste par tatouage audiovisuel à l’usage des sourds. J3eA 3 (Hors série 1, 16)

  4. Bhat V, Sengupta I, Das A (2010) An audio watermarking scheme using singular value decomposition and dither-modulation quantization. Multimed Tools Appl 52(2–3):369–383

    Google Scholar 

  5. Boney L, Tewfik AH, Hamdy KN (1996) Digital watermarks for audio signals. In: IEEE International conference on multimedia computing and systems. Hiroshima, pp 473–480

  6. Casey M, Westner A (2000) Separation of mixed audio sources by independent subspace analysis. In: ICMC 2000. Berlin, Germany, pp 154–161

  7. Flandrin P (2010) An empirical model for electronic submissions to conferences. Adv Complex Systems 13(3):1–11

    MathSciNet  Google Scholar 

  8. Frank I, Todeschini R (1994) The data analysis handbook. Elsevier, Sci Pub Co

    Google Scholar 

  9. Guerrero L (2004) Etude de techniques de tatouage audio pour la transmission de données. Master’s thesis, Ecole Doctorale Electronique, Electrotechnique, Automatique, Télécommunications, Signal

  10. Hamdouni NE, Adib A (2010) Single mixture audio sources separation using ISA technique in EMD domain. In: ISIVC’10

  11. Hamdouni NE, Adib A, Turki M, Aboutajdine D (2008) Combined emd/ica for digital audiowatermarking: a potentially craft to increase the embedded binary flow. In: ISIVC, Spain

  12. Hamdouni NE, Adib A, Larbi SD, Turki M (2010) Hybrid embedding strategy for a blind audio watermarking system using EMD and ISA technique. In: ISCCSP’10. Limassol, Cyprus

  13. Huang NE, Shen Z, Long SR, Wu MC, Shih HH, Zheng Q, Yen NC, Tung CC, Liu HH (1998) The empirical mode decomposition and Hilbert spectrum for nonlinear and non-stationary time series analysis. R Soc London A 454:903–995

    Article  MathSciNet  MATH  Google Scholar 

  14. Huber R, Kollmeier, B (2006) PEMO-Q—a new method for objective audio quality assessment using a model of auditory perception. IEEE Trans on Audio, Speech and Language Process 14(6):1902–1911

    Article  Google Scholar 

  15. Larbi SD (2005) Structures d’égalisation en tatouage audionumérique. Thèse de doctorat, Ecole Nationale d’Ingénieurs de Tunis (ENIT) et Télécom Paris (ENST)

  16. Larbi S, Jaïdane M, Moreau N (2004) A new Wiener filtering based detection scheme for time domain perceptual audio watermarking. In: IEEE Int Conf Acoust, Speech, Signal Process, vol 5. Montreal, QC, Canada, pp 949–952

  17. Lei BY, Soon IY, Li Z (2011) Blind and robust audio watermarking scheme based on svd−dct. Signal Process 91(8):1973–1984

    Article  MATH  Google Scholar 

  18. Liénard J (2001) Transmission d’un message numérique caché dans un signal audio. In: Gretsi Toulouse France, pp 473–480

  19. Lin WH, Horng SJ, Kao TW, Fan P, Lee CL, Pan Y (2008) An efficient watermarking method based on significant difference of wavelet coefficient quantization. IEEE Trans Multimedia 10(5):746–757

    Article  Google Scholar 

  20. Lin WH, Horng SJ, Kao TW, Chen RJ, Chen YH, Lee CL, Terano T (2009) Image copyright protection with forward error correction. Expert Syst Appl 36(9):11,888–11,894

    Google Scholar 

  21. Lin WH, Wang YR, Horng SJ (2009) A wavelet-tree-based watermarking method using distance vector of binary cluster. Expert Syst Appl 36(6):9869–9878

    Article  Google Scholar 

  22. Lin WH, Wang YR, Horng SJ, Pan Y (2009) A blind watermarking method using maximum wavelet coefficient quantization. Expert Syst Appl 36(9):11,509–11,516

    Google Scholar 

  23. Liu YW (2007) Sound source segregation assisted by audio watermarking. In: IEEE Int Conf Multimedia and Expo, pp 200–203

  24. Malik H (2009) Blind watermark estimation attack for spread spectrum watermarking. Informatica 33(1):49–68

    MathSciNet  MATH  Google Scholar 

  25. Parvaix M, Girin L, Brossier J (2010) A watermarking-based method for informed source separation of audio signals with a single sensor. IEEE Trans on Audio, Speech and Language Process 18(6):1464–1475

    Article  Google Scholar 

  26. Rilling G, Flandrin P, Goncalvès P (2003) Empirical mode decomposition and its algorithms. In: IEEE-EURASIP, Workshop on nonlinear signal and image processing, NSIP ’03, Grado (I)

  27. Taoufiki M, Adib A, Aboutajdine D (2010) A new behavior of higher order blind source separation methods for convolutive mixture. Digit Signal Process 20:269–275

    Article  Google Scholar 

  28. Uhle C, Dittmar C, Sporer T (2003) Extraction of drum tracks from polyphonic music using independent subspace analysis. In: Int Symp Independent Compon Anal Blind Signal Separation (ICA2003), pp 843–848

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Nawal El Hamdouni.

Appendix: Empirical mode decomposition

Appendix: Empirical mode decomposition

The EMD technique was firstly developed by Huang et al. [13] to decompose any nonstationary and nonlinear signal into oscillating components obeying some basic properties. The key benefit of using the EMD technique is that an automatic decomposition and fully data adaptive. This method makes it possible to decompose the audio signal x(n), to a set of \(\textsf{IMF}\) components, as follows:

$$ x(n)=\sum\limits_{\iota=1}^{N}\,\textsf{IMF}_{\iota}(n)+r_{N}(n), $$
(12)

where N represents the number of the \(\textsf{IMF}_i(n)\), and r N (n) is the final residue. The EMD technique can decompose the host signal from the highest frequency components to the residue r N (n), which represents the lowest frequency component. An audio signal example (Harpsichord sound) and some selected \(\textsf{IMF}\)s components are shown in Fig. 12.

Fig. 12
figure 12

EMD of an audio signal (Harpsichord sound) showing selected \(\textsf{IMF}\) components

The \(\textsf{IMF}\)s are the bases for representing the time series data. Being data adaptive, the basis usually offers a physically meaningful representation of the underlying processes. There is no need of considering the signal as a stack of harmonics and, therefore, EMD is ideal for analyzing nonstationary and nonlinear data [13]. The EMD process can also be considered as dyadic filter-bank especially for white noise and the \(\textsf{IMF}_i(n)\) components are all normally distributed [13].

The EMD technique has also many interesting features [13]: it is characterized by its completeness, that is x(n) can be exactly reconstructed by linear superposition of all the components. It generates roughly orthogonal components, because the \(\textsf{IMF}\)s have different local frequencies at the same time. Each \(\textsf{IMF}_i(n)\) has a well defined Hilbert transform from which the instantaneous frequencies can be calculated.

Rights and permissions

Reprints and permissions

About this article

Cite this article

El Hamdouni, N., Adib, A., Larbi, S.D. et al. A blind digital audio watermarking scheme based on EMD and UISA techniques. Multimed Tools Appl 64, 809–829 (2013). https://doi.org/10.1007/s11042-012-0988-1

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11042-012-0988-1

Keywords

Navigation