Abstract
Automatic speaker verification systems are vulnerable to several kinds of spoofing attacks. Some of them can be quite simple: for example, playing back an eavesdropped recording requires neither specialized equipment nor expert knowledge, yet it may pose a serious threat to a biometric identification module built into an e-banking application. In this paper we follow a recent approach and convert recordings to images, assuming that an original voice can be distinguished from its played-back version through the analysis of local texture patterns. We propose improvements to the state-of-the-art solution, but also show its severe limitations. This in turn leads to a fundamental question: is it possible to find one set of features that is characteristic of all playback recordings? We look for the answer by performing a series of optimization experiments, but in general the problem remains open.
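As a rough illustration of the general idea of treating a recording as an image and describing it by local texture patterns, the sketch below builds a log-magnitude spectrogram and summarizes it with a histogram of basic 3x3 local binary patterns. It is only a sketch under stated assumptions: the abstract does not specify the texture descriptor, frame length, or hop size, so the basic LBP operator, the parameter values, and the function names `spectrogram_image` and `lbp_histogram` are all hypothetical choices made here for illustration, not the method evaluated in the paper.

```python
import numpy as np


def spectrogram_image(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram scaled to 0..255.

    Rows are frequency bins, columns are time frames.
    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop: i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    log_mag = np.log1p(magnitude)
    span = log_mag.max() - log_mag.min()
    scaled = (log_mag - log_mag.min()) / (span if span > 0 else 1.0)
    return (255 * scaled).astype(np.uint8)


def lbp_histogram(image):
    """Normalized 256-bin histogram of basic 3x3 local binary patterns.

    Assumption: the plain LBP operator stands in for whatever local
    texture descriptor is actually used in the paper.
    """
    img = image.astype(np.int16)
    height, width = img.shape
    center = img[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.uint8)
    # Eight neighbours, visited clockwise starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy: height - 1 + dy, 1 + dx: width - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()


# Synthetic stand-in for a recording: a noisy tone (not real speech).
rng = np.random.default_rng(0)
t = np.arange(4000) / 8000.0
demo_signal = np.sin(2 * np.pi * 440.0 * t) + 0.1 * rng.standard_normal(t.size)
demo_image = spectrogram_image(demo_signal)
demo_features = lbp_histogram(demo_image)
```

A feature vector such as `demo_features` could then be passed to any standard classifier trained on genuine and played-back recordings; which classifier and which subset of patterns work best is exactly the kind of question the paper leaves open.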
Acknowledgement
The author would like to thank Tomasz Szwelnik and Jacek Kawalec from Voicelab for fruitful discussions, for sharing their expertise, and for granting access to the VL-Bio database of playback attacks, which was created at the company’s laboratories.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Smiatacz, M. (2018). Playback Attack Detection: The Search for the Ultimate Set of Antispoof Features. In: Kurzynski, M., Wozniak, M., Burduk, R. (eds) Proceedings of the 10th International Conference on Computer Recognition Systems CORES 2017. CORES 2017. Advances in Intelligent Systems and Computing, vol 578. Springer, Cham. https://doi.org/10.1007/978-3-319-59162-9_13
DOI: https://doi.org/10.1007/978-3-319-59162-9_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59161-2
Online ISBN: 978-3-319-59162-9
eBook Packages: Engineering, Engineering (R0)