An Interrogation Speech Manipulation Detection Method Using Speech Fingerprinting and Watermarking
We propose a manipulation detection method for interrogation speech. Because the intended target is speech recorded during police interrogations, we use a robust fingerprinting method optimized for speech. The fingerprint uses line spectral pairs (LSP) to capture the spectral envelope of the speech and is coarsely quantized, so that it is unaffected by small degradations of the signal but changes measurably when the speech content is maliciously modified. This fingerprint is embedded in the speech signal using a conventional spread-spectrum watermark. To detect manipulation, the embedded fingerprint is recovered from the watermark and compared to the fingerprint extracted from the speech itself. If the two fingerprints match within a predetermined tolerance, the speech is authenticated as unaltered; otherwise, manipulation should be suspected.
We conducted manipulation detection on a frame-by-frame basis and confirmed that, even with noisy and reverberant speech, manipulation is correctly detected in almost all of the substituted frames.
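The pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the per-frame features are assumed to be LSP frequencies already normalized to [0, 1), and the number of quantization levels, the mismatch tolerance, the chip length, and the embedding strength `alpha` are all hypothetical values chosen for illustration. The function names are likewise invented for this example. The sketch shows coarse quantization of the features into a fingerprint, a basic additive spread-spectrum embed and blind correlation detector, and the frame-wise comparison against a tolerance.

```python
import numpy as np


def quantize_fingerprint(features, n_levels=4):
    """Coarsely quantize per-frame features in [0, 1) to integer levels.

    `features` has shape (n_frames, n_coeffs). Assumption: values are
    LSP frequencies normalized to [0, 1); n_levels is illustrative.
    """
    features = np.asarray(features, dtype=float)
    levels = np.floor(features * n_levels)
    return np.clip(levels, 0, n_levels - 1).astype(int)


def embed_bits(host, bits, pn, alpha=0.1):
    """Additive spread-spectrum embedding: one PN-length segment per bit."""
    out = np.asarray(host, dtype=float).copy()
    n = len(pn)
    for i, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        out[i * n:(i + 1) * n] += alpha * sign * pn
    return out


def detect_bits(signal, n_bits, pn):
    """Blind detection: the sign of the correlation with the PN sequence."""
    n = len(pn)
    return [int(np.dot(signal[i * n:(i + 1) * n], pn) > 0)
            for i in range(n_bits)]


def flag_manipulated_frames(embedded_fp, extracted_fp, tolerance=1):
    """True for frames whose fingerprints differ in more than
    `tolerance` coefficients (hypothetical tolerance rule)."""
    embedded_fp = np.asarray(embedded_fp)
    extracted_fp = np.asarray(extracted_fp)
    mismatches = np.sum(embedded_fp != extracted_fp, axis=1)
    return mismatches > tolerance


# Toy demonstration with made-up feature values.
original = np.array([[0.10, 0.30, 0.55, 0.80],
                     [0.12, 0.35, 0.60, 0.85],
                     [0.05, 0.40, 0.65, 0.90]])
received = original + 0.01                  # mild degradation everywhere
received[1] = [0.30, 0.55, 0.80, 0.95]      # frame 1 substituted

flags = flag_manipulated_frames(quantize_fingerprint(original),
                                quantize_fingerprint(received))
print(flags.tolist())  # [False, True, False]
```

In this toy case the small offset leaves every quantized value unchanged, while the substituted frame differs in three of four coefficients and is flagged. A real system would also need LSP analysis of each frame and a mapping from fingerprint levels to watermark bits, both of which are omitted here.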
Keywords: Interrogation speech, Manipulation detection, Audio watermark, Speech fingerprinting, Line spectral pairs
This work was supported in part by the Cooperative Research Project Program of the Research Institute of Electrical Communication, Tohoku University (H29/A18).