Towards an Interrogation Speech Manipulation Detection Method Using Speech Fingerprinting

  • Shinnya Takahashi
  • Kazuhiro Kondo
Conference paper
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 82)


We propose a manipulation detection method for interrogation speech. Because the intended target is speech recorded during police interrogations, we use a robust fingerprinting method optimized for speech. The fingerprint uses line spectral pairs (LSP) to represent the spectral envelope of the speech and is coarsely quantized, so that small degradations of the signal leave it unchanged while malicious modifications of the speech content alter it detectably. The fingerprint is embedded in the speech signal using a conventional spread-spectrum watermark. To detect manipulation, the embedded fingerprint is recovered from the watermark and compared with the fingerprint computed from the speech itself. If the two match within a predetermined tolerance, the speech is authenticated as unaltered; otherwise, manipulation should be suspected. Initial experiments to verify the feasibility of the proposed method confirmed that, at the utterance level, all utterances manipulated by substitution were identified successfully.
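The comparison step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each frame is already represented by a few LSP-like frequencies in radians, and it omits both the LSP analysis and the spread-spectrum watermark. The function names, the number of quantization levels, and the tolerance value are all hypothetical choices made for illustration.

```python
import math

def quantize_frame(lsp, levels=4):
    """Coarsely quantize LSP-like frequencies (radians in [0, pi)) to integer codes.

    `levels` is an assumed parameter; the source does not state the step size.
    """
    step = math.pi / levels
    return tuple(min(int(w / step), levels - 1) for w in lsp)

def fingerprint(frames, levels=4):
    """Build a fingerprint as one tuple of quantized codes per frame."""
    return [quantize_frame(f, levels) for f in frames]

def mismatch_ratio(fp_a, fp_b):
    """Fraction of quantized coefficients that differ between two fingerprints."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must cover the same number of frames")
    total = sum(len(a) for a in fp_a)
    diff = sum(x != y for a, b in zip(fp_a, fp_b) for x, y in zip(a, b))
    return diff / total if total else 0.0

def is_authentic(embedded_fp, extracted_fp, tolerance=0.1):
    """Accept the speech as unaltered if the mismatch stays within the tolerance.

    The tolerance of 0.1 is an assumption, not a value from the source.
    """
    return mismatch_ratio(embedded_fp, extracted_fp) <= tolerance

# Made-up frame data: two frames with four LSP-like frequencies each.
original = [(0.30, 0.90, 1.60, 2.40), (0.35, 1.00, 1.70, 2.50)]
# Small perturbations, standing in for benign signal degradation.
degraded = [(0.32, 0.88, 1.62, 2.42), (0.33, 1.02, 1.68, 2.48)]
# Second frame replaced, standing in for a substitution manipulation.
substituted = [(0.30, 0.90, 1.60, 2.40), (0.90, 1.70, 2.40, 2.90)]

embedded = fingerprint(original)  # what the watermark would carry
print(is_authentic(embedded, fingerprint(degraded)))
print(is_authentic(embedded, fingerprint(substituted)))
```

With coarse quantization, the small perturbations in `degraded` fall into the same bins as the original values, whereas the replaced frame in `substituted` changes several codes and pushes the mismatch above the tolerance. In practice the tolerance would have to be tuned against the bit errors introduced by watermark extraction.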


Keywords: Interrogation speech, Manipulation detection, Audio watermark, Speech fingerprinting, Line spectral pairs



This work was supported in part by the Cooperative Research Project Program of the Research Institute of Electrical Communication, Tohoku University (H26/A14).



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Graduate School of Science and Engineering, Yamagata University, Yonezawa, Japan
