Incremental Training for Face Recognition

  • Martin Winter
  • Werner Bailer
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11295)


Many applications require the identification of persons in video. However, the set of persons of interest is not always known in advance, e.g., in applications for media production and archiving. Additional training samples may be added during the analysis, or groups of faces of one person may need to be identified retrospectively. To avoid re-running face recognition in these cases, we propose an approach that supports fast incremental training. It builds on a state-of-the-art face detection and recognition pipeline using CNNs, with an online random forest as the classifier. We also describe an algorithm that uses the incremental training approach to automatically train classifiers for unknown persons, including safeguards to avoid noise in the training data. We show that the approach reaches state-of-the-art performance on two datasets when using all training samples, but performs better when only a few training samples, or even a single one, are available.
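The idea of incremental training can be illustrated with a minimal sketch. Note that this is not the authors' method: the paper uses an online random forest, whereas the sketch below swaps in a much simpler incremental nearest-centroid classifier over face embeddings, purely to show how a new sample or a new person can be added without retraining from scratch. The class name `IncrementalCentroidClassifier`, the `reject_distance` parameter, and the toy 2-D embeddings are all assumptions made for illustration.

```python
import math
from collections import defaultdict


class IncrementalCentroidClassifier:
    """Illustrative stand-in for an incrementally trainable classifier.

    This is NOT an online random forest; it keeps one running mean
    (centroid) per person and updates it sample by sample.
    """

    def __init__(self, reject_distance=0.8):
        # Hypothetical threshold: faces farther than this from every
        # known centroid are reported as unknown (None).
        self.reject_distance = reject_distance
        self.sums = {}                   # label -> running sum of embeddings
        self.counts = defaultdict(int)   # label -> number of samples seen

    def partial_fit(self, embedding, label):
        """Add one training sample; earlier samples are not revisited."""
        if label not in self.sums:
            self.sums[label] = [0.0] * len(embedding)
        self.sums[label] = [s + x for s, x in zip(self.sums[label], embedding)]
        self.counts[label] += 1

    def predict(self, embedding):
        """Return the closest known label, or None if no centroid is close enough."""
        best_label, best_dist = None, math.inf
        for label, total in self.sums.items():
            n = self.counts[label]
            centroid = [s / n for s in total]
            dist = math.dist(embedding, centroid)
            if dist < best_dist:
                best_label, best_dist = label, dist
        if best_dist > self.reject_distance:
            return None
        return best_label


# Toy usage with made-up 2-D "embeddings" (real face embeddings are
# typically much higher-dimensional).
clf = IncrementalCentroidClassifier(reject_distance=0.5)
clf.partial_fit([0.0, 1.0], "alice")       # a single training sample
known = clf.predict([0.1, 0.9])            # close to alice's centroid
unknown = clf.predict([1.0, 0.0])          # far from every known person
clf.partial_fit([1.0, 0.0], "bob")         # add a new person later
added = clf.predict([0.9, 0.1])            # now matches the new person
```

In this sketch, adding a person or a sample only touches that person's running sum, which is the property that makes retrospective labelling cheap; an online random forest achieves a comparable effect by updating its trees per sample.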



The research leading to these results has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreements no 732461, ReCAP (“Real-time Content Analysis and Processing”), and 761802, MARCONI (“Multimedia and Augmented Radio Creation: Online, iNteractive, Individual”).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. JOANNEUM RESEARCH Forschungsgesellschaft mbH, DIGITAL – Institute for Information and Communication Technologies, Graz, Austria
