Spoken dialog systems have become popular and are now used in home environments, for example as smart speakers. A problem arises when two or more smart speakers share the same environment: one dialog system may mistake another dialog system's voice for a user's voice. In this paper, we propose a method for muting synthesized speech so that a speech recognizer does not recognize speech uttered by a machine. An audio watermark indicates that the speech was uttered by a machine, and the speech recognizer attenuates the observed speech if it contains the watermark. The watermark is embedded in a high-frequency band so that humans cannot perceive it and so that it can be extracted robustly. Experimental results show that the proposed method robustly determines whether the watermark is present when the SNR is 0 dB or higher.
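The embed, detect, and attenuate flow described above can be illustrated with a minimal sketch. It is not the embedding scheme of the paper, which the abstract does not specify: a single low-level carrier tone stands in for the watermark, and the sampling rate, carrier frequency, gain, detection threshold, and all function names are assumptions chosen only for illustration. The input is assumed to be at least a few tens of milliseconds long.

```python
import numpy as np

FS = 48_000        # sampling rate in Hz (assumed)
MARK_HZ = 20_000   # carrier frequency standing in for the watermark (assumed)
MARK_GAIN = 0.01   # carrier amplitude relative to full scale (assumed)
EPS = 1e-20        # avoids division by zero on silent input


def embed_watermark(speech: np.ndarray) -> np.ndarray:
    """Add a low-level high-frequency carrier to synthesized speech."""
    t = np.arange(len(speech)) / FS
    return speech + MARK_GAIN * np.sin(2 * np.pi * MARK_HZ * t)


def has_watermark(signal: np.ndarray, threshold_db: float = 15.0) -> bool:
    """Compare the spectral peak near the carrier with the nearby noise floor."""
    windowed = signal * np.hanning(len(signal))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1 / FS)
    in_band = np.abs(freqs - MARK_HZ) < 100
    near_band = (np.abs(freqs - MARK_HZ) < 1000) & ~in_band
    peak = power[in_band].max()
    floor = power[near_band].mean()
    return bool(10 * np.log10((peak + EPS) / (floor + EPS)) > threshold_db)


def mute_if_machine(signal: np.ndarray, attenuation: float = 0.0) -> np.ndarray:
    """Attenuate the observed signal when the watermark is detected."""
    return signal * attenuation if has_watermark(signal) else signal


# Usage: a synthetic low-frequency tone plus noise stands in for a human voice.
rng = np.random.default_rng(0)
t = np.arange(FS) / FS
human = 0.3 * np.sin(2 * np.pi * 220 * t) + 0.02 * rng.standard_normal(FS)
machine = embed_watermark(human)
recognizer_input_human = mute_if_machine(human)      # passed through unchanged
recognizer_input_machine = mute_if_machine(machine)  # muted before recognition
```

In this sketch the detector runs in front of the recognizer, so watermarked audio is muted before any recognition is attempted. A single tone is easy to detect but carries no payload, and a practical system would need a detector that also survives reverberation and loudspeaker and microphone roll-off.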


Keywords: Spoken dialog systems, Watermarking, Muting



Part of this work was supported by JSPS Kakenhi JP17H00823.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Tohoku University, Sendai, Japan
