KS(conf): A Light-Weight Test if a ConvNet Operates Outside of Its Specifications

  • Rémy Sun
  • Christoph H. Lampert
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11269)


Computer vision systems for automatic image categorization have become accurate and reliable enough that they can run continuously for days or even years as components of real-world commercial applications. A major open problem in this context, however, is quality control. Good classification performance can only be expected if systems run under the specific conditions, in particular the data distributions, that they were trained for. Surprisingly, none of the currently used deep network architectures has built-in functionality to detect when a network operates on data from a distribution it was not trained for, so that a warning could be issued to the human users.

In this work, we describe KS(conf), a procedure for detecting such outside-of-specifications (out-of-specs) operation, based on statistical testing of the network outputs. We show by extensive experiments using the ImageNet, AwA2 and DAVIS datasets on a variety of ConvNet architectures that KS(conf) reliably detects out-of-specs situations. It furthermore has a number of properties that make it a promising candidate for practical deployment: it is easy to implement, adds almost no overhead to the system, works with all networks, including pretrained ones, and requires no a priori knowledge of how the data distribution could change.
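The abstract does not spell out the test itself. Going by the method's name, one plausible reading is a Kolmogorov-Smirnov test applied to the network's confidence scores. The following is a minimal sketch under that assumption, not the authors' implementation: it compares the top-class confidences of a batch against those recorded on within-specs validation data using a two-sample KS statistic and the standard asymptotic critical value. The function names, the choice of the two-sample variant, the significance level, and the synthetic Beta-distributed scores are all hypothetical.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(reference, batch):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    ref = sorted(reference)
    bat = sorted(batch)
    n, m = len(ref), len(bat)
    d = 0.0
    # The largest gap is attained at one of the observed values.
    for x in ref + bat:
        cdf_ref = bisect_right(ref, x) / n
        cdf_bat = bisect_right(bat, x) / m
        d = max(d, abs(cdf_ref - cdf_bat))
    return d


def ks_threshold(n, m, alpha):
    """Asymptotic two-sample critical value at significance level alpha."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))


def out_of_specs(reference, batch, alpha=0.01):
    """Return True if the batch confidences differ significantly from the reference."""
    d = ks_statistic(reference, batch)
    return d > ks_threshold(len(reference), len(batch), alpha)


# Synthetic stand-ins for top-class confidence scores (hypothetical data,
# not taken from the paper's experiments).
rng = random.Random(0)
val_conf = [rng.betavariate(8, 2) for _ in range(5000)]       # within-specs validation scores
shifted_batch = [rng.betavariate(3, 3) for _ in range(1000)]  # markedly lower confidences

print(out_of_specs(val_conf, val_conf))       # identical samples: statistic is zero
print(out_of_specs(val_conf, shifted_batch))  # strongly shifted confidence distribution
```

Because the sketch only reads the scores a classifier already produces, it needs no access to the network's weights, which is consistent with the abstract's claim that the approach works with pretrained networks and adds little overhead.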



This work was funded in part by the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no. 308036.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. École Normale Supérieure de Rennes (ENS Rennes), Bruz, France
  2. Institute of Science and Technology Austria (IST Austria), Klosterneuburg, Austria
