Playback Attack Detection: The Search for the Ultimate Set of Antispoof Features

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 578)


Automatic speaker verification systems are vulnerable to several kinds of spoofing attacks. Some of them are quite simple: for example, the playback of an eavesdropped recording requires neither specialized equipment nor specialized knowledge, yet it may still pose a serious threat to a biometric identification module built into an e-banking application. In this paper we follow a recent approach and convert recordings to images, assuming that the original voice can be distinguished from its played-back version through the analysis of local texture patterns. We propose improvements to the state-of-the-art solution, but also show its severe limitations. This in turn leads to a fundamental question: is it possible to find a single set of features that is characteristic of all playback recordings? We look for the answer by performing a series of optimization experiments, but in general the problem remains open.
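
To make the idea of converting a recording to an image and describing it by local texture patterns more concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes a log-magnitude spectrogram as the image and the basic 3x3 local binary pattern (LBP) operator as the texture descriptor. The function names `spectrogram_image` and `lbp_histogram`, the frame length, the hop size and the random test signal are all hypothetical choices made for illustration only.

```python
import numpy as np


def spectrogram_image(signal, frame_len=256, hop=128):
    """Return a log-magnitude spectrogram scaled to an 8-bit grayscale image.

    frame_len and hop are illustrative assumptions, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Rows are frequency bins, columns are time frames.
    log_mag = np.log1p(np.abs(np.fft.rfft(frames, axis=1))).T
    span = log_mag.max() - log_mag.min()
    scaled = (log_mag - log_mag.min()) / (span if span > 0 else 1.0)
    return np.round(255 * scaled).astype(np.uint8)


def lbp_histogram(image):
    """Return the normalized 256-bin histogram of basic 3x3 LBP codes."""
    img = image.astype(np.int32)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    codes = np.zeros_like(center)
    # Eight neighbours, visited clockwise starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set the bit when the neighbour is at least as bright as the centre.
        codes |= (neighbor >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()


# Usage on a synthetic signal (a stand-in for a real recording).
rng = np.random.default_rng(0)
recording = rng.standard_normal(4000)
image = spectrogram_image(recording)
features = lbp_histogram(image)
```

The resulting 256-element vector could be passed to any standard classifier trained on genuine and played-back recordings. Which patterns, if any, remain discriminative across different playback conditions is exactly the open question raised above.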


Keywords: Playback detection, Antispoof algorithms, Biometrics



The author would like to thank Tomasz Szwelnik and Jacek Kawalec from Voicelab for fruitful discussions, for sharing their expertise, and for granting access to the VL-Bio database of playback attacks, which was created at the company’s laboratories.



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Gdańsk, Poland
