Abstract
Pose variations and occlusions are two major challenges for unconstrained face detection. Many approaches have been proposed to handle pose variations and occlusions in face detection; however, few of them address the two challenges explicitly and simultaneously in a single model. In this paper, we propose a novel face detection method called Aggregating Visible Components (AVC), which addresses pose variations and occlusions simultaneously in a single framework with low complexity. The main contributions of this paper are: (1) by aggregating visible components, which have inherent advantages under occlusion, the proposed method achieves state-of-the-art performance using only hand-crafted features; (2) mapped from the mean shape through component-invariant mapping, the proposed component detector is more robust to pose variations; (3) a local-to-global aggregation strategy that involves region competition helps alleviate false alarms while enhancing localization accuracy.
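As a rough illustration of the local-to-global idea mentioned in the abstract, the sketch below merges face boxes proposed by individual component detectors. It is not the AVC algorithm: it substitutes a plain greedy IoU grouping with score-weighted box averaging for the paper's region competition, and the names `iou`, `aggregate`, `iou_thresh` and `min_score` are hypothetical, introduced only for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Proposal = Tuple[Box, float]             # face box proposed by one component, with its score


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def aggregate(proposals: List[Proposal],
              iou_thresh: float = 0.5,
              min_score: float = 1.0) -> List[Proposal]:
    """Greedily group overlapping proposals and merge each group.

    Assumption (not from the paper): a group is kept only if the summed
    score of its members reaches min_score, so an isolated weak response
    from a single component is treated as a false alarm.
    """
    remaining = sorted(proposals, key=lambda p: p[1], reverse=True)
    faces: List[Proposal] = []
    while remaining:
        seed = remaining.pop(0)  # strongest proposal not yet assigned
        group = [seed] + [p for p in remaining if iou(seed[0], p[0]) >= iou_thresh]
        remaining = [p for p in remaining if iou(seed[0], p[0]) < iou_thresh]
        total = sum(score for _, score in group)
        if total <= 0 or total < min_score:
            continue  # too little supporting evidence
        # Score-weighted average of the member boxes refines the location.
        merged = tuple(sum(box[i] * score for box, score in group) / total
                       for i in range(4))
        faces.append((merged, total))
    return faces


if __name__ == "__main__":
    proposals = [
        ((10.0, 10.0, 50.0, 50.0), 0.8),      # e.g. proposed from an eye component
        ((12.0, 10.0, 52.0, 50.0), 0.7),      # e.g. proposed from a mouth component
        ((200.0, 200.0, 240.0, 240.0), 0.4),  # isolated weak response
    ]
    for box, score in aggregate(proposals):
        print(box, score)
```

In this toy input the two overlapping proposals support each other and are merged into one face, while the isolated low-scoring proposal is discarded. That mirrors, in a simplified way, how agreement among visible components can suppress false alarms and refine localization even when some components are occluded.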
Notes
1. Blur or low resolution is a challenging problem, mainly in surveillance. Although many blurred face images exist in current benchmark databases (e.g. FDDB [1]), they are intentionally left out of focus in the background, while the main focus is on the central figures in news photography.
References
Jain, V., Learned-Miller, E.G.: FDDB: a benchmark for face detection in unconstrained settings. UMass Amherst Technical report (2010)
Wu, B., Ai, H., Huang, C., Lao, S.: Fast rotation invariant multi-view face detection based on real adaBoost. In: IEEE Conference on Automatic Face and Gesture Recognition (2004)
Li, S., Zhang, Z.: Floatboost learning and statistical face detection. IEEE Trans. Pattern Anal. Mach. Intell. 26, 1112–1123 (2004)
Huang, C., Ai, H., Li, Y., Lao, S.: High-performance rotation invariant multiview face detection. IEEE Trans. Pattern Anal. Mach. Intell. 29, 671–686 (2007)
Hotta, K.: A robust face detector under partial occlusion. In: International Conference on Image Processing (2004)
Lin, Y., Liu, T., Fuh, C.: Fast object detection with occlusions. In: Proceedings of the European Conference on Computer Vision, pp. 402–413 (2004)
Lin, Y., Liu, T.: Robust face detection with multi-class boosting (2005)
Chen, J., Shan, S., Yang, S., Chen, X., Gao, W.: Modification of the adaboost-based detector for partially occluded faces. In: 18th International Conference on Pattern Recognition (2006)
Goldmann, L., Monich, U., Sikora, T.: Components and their topology for robust face detection in the presence of partial occlusions. IEEE Trans. Inf. Forensics Secur. 2, 559–569 (2007)
LeCun, Y., Bengio, Y.: Convolutional networks for images, speech, and time series. In: The Handbook of Brain Theory and Neural Networks, pp. 33–61 (1995)
Li, H., Lin, Z., Shen, X., Brandt, J., Hua, G.: A convolutional neural network cascade for face detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5325–5334 (2015)
Farfade, S.S., Saberian, M.J., Li, L.J.: Multi-view face detection using deep convolutional neural networks. In: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, pp. 643–650. ACM (2015)
Yang, S., Luo, P., Loy, C.C., Tang, X.: From facial parts responses to face detection: a deep learning approach. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3676–3684 (2015)
Zhang, K., Zhang, Z., Li, Z., Qiao, Y.: Joint face detection and alignment using multi-task cascaded convolutional networks. arXiv preprint arXiv:1604.02878 (2016)
Ranjan, R., Patel, V.M., Chellappa, R.: Hyperface: a deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition. arXiv preprint arXiv:1603.01249 (2016)
Viola, P., Jones, M.: Robust real-time object detection. Int. J. Comput. Vis. 4, 34–47 (2001)
Yang, B., Yan, J., Lei, Z., Li, S.Z.: Aggregate channel features for multi-view face detection. In: 2014 IEEE International Joint Conference on Biometrics (IJCB), pp. 1–8. IEEE (2014)
Zhu, X., Ramanan, D.: Face detection, pose estimation, and landmark localization in the wild. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2879–2886. IEEE (2012)
Yan, J., Lei, Z., Wen, L., Li, S.: The fastest deformable part model for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2497–2504 (2014)
Mathias, M., Benenson, R., Pedersoli, M., Gool, L.: Face detection without bells and whistles. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8692, pp. 720–735. Springer, Heidelberg (2014). doi:10.1007/978-3-319-10593-2_47
Ranjan, R., Patel, V.M., Chellappa, R.: A deep pyramid deformable part model for face detection. In: 2015 IEEE 7th International Conference on Biometrics Theory, Applications and Systems (BTAS), pp. 1–8. IEEE (2015)
Chen, D., Ren, S., Wei, Y., Cao, X., Sun, J.: Joint cascade face detection and alignment. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8694, pp. 109–122. Springer, Heidelberg (2014). doi:10.1007/978-3-319-10599-4_8
Liao, S., Jain, A., Li, S.: A fast and accurate unconstrained face detector. IEEE Trans. Pattern Anal. Mach. Intell. 38, 211–223 (2016)
Goldmann, L., Mönich, U.J., Sikora, T.: Components and their topology for robust face detection in the presence of partial occlusions. IEEE Trans. Inf. Forensics Secur. 2, 559–569 (2007)
Azizpour, H., Laptev, I.: Object detection using strongly-supervised deformable part models. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7572, pp. 836–849. Springer, Heidelberg (2012). doi:10.1007/978-3-642-33718-5_60
Bourdev, L., Maji, S., Brox, T., Malik, J.: Detecting people using mutually consistent poselet activations. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6316, pp. 168–181. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15567-3_13
Zhang, N., Paluri, M., Ranzato, M.A.: Panda: Pose aligned networks for deep attribute modeling. In: Computer Vision and Pattern Recognition, pp. 1637–1644. IEEE, Springer, Berlin, Heidelberg (2014)
Zhang, N., Donahue, J., Girshick, R., Darrell, T.: Part-based R-CNNs for fine-grained category detection. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 834–849. Springer, Heidelberg (2014). doi:10.1007/978-3-319-10590-1_54
Zhang, H., Xu, T., Elhoseiny, M., Huang, X., Zhang, S., Elgammal, A., Metaxas, D.: SPDA-CNN: Unifying semantic part detection and abstraction for fine-grained recognition. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Köstinger, M., Wohlhart, P., Roth, P.M., Bischof, H.: Annotated facial landmarks in the wild: a large-scale, real-world database for facial landmark localization. In: 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pp. 2144–2151. IEEE (2011)
Bourdev, L., Brandt, J.: Robust object detection via soft cascade. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, vol. 2, pp. 236–243. IEEE (2005)
Liao, S., Jain, A.K., Li, S.Z.: Unconstrained face detection. Technical report, MSU-CSE-12-15, Department of Computer Science, Michigan State University (2012)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, vol. 1, pp. 886–893. IEEE (2005)
Acknowledgement
This work was supported by the National Key Research and Development Plan (Grant No. 2016YFC0801002), the Chinese National Natural Science Foundation Projects #61672521, #61473291, #61572501, #61502491, #61572536, the NVIDIA GPU donation program, and AuthenMetric R&D Funds.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Duan, J., Liao, S., Guo, X., Li, S.Z. (2017). Face Detection by Aggregating Visible Components. In: Chen, CS., Lu, J., Ma, KK. (eds) Computer Vision – ACCV 2016 Workshops. ACCV 2016. Lecture Notes in Computer Science, vol 10117. Springer, Cham. https://doi.org/10.1007/978-3-319-54427-4_24
DOI: https://doi.org/10.1007/978-3-319-54427-4_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54426-7
Online ISBN: 978-3-319-54427-4
eBook Packages: Computer Science, Computer Science (R0)