World Wide Web, Volume 22, Issue 4, pp 1481–1498

Spatial alignment network for facial landmark localization

  • Huifang Li
  • Yidong Li (corresponding author)
  • Junliang Xing
  • Hairong Dong

Abstract

Facial Landmark Localization (FLL) on unconstrained images remains challenging because such images exhibit complex variation in facial spatial structure and appearance. To address this problem, we propose a Spatial Alignment Network (SAN), which consists of two modules: a transformation sub-network and an estimation sub-network. In the first module, we propose two methods for achieving spatial transformation: a handcrafted method, which ensures model stability, and a learning-based method, which is efficient and flexible. In the second module, we add an attention layer to the deep CNN to emphasize discriminative features and obtain more accurate results. In extensive experiments, our model achieves good performance on several challenging public datasets.
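The abstract does not specify how the handcrafted spatial transformation is computed. As an illustration only, the sketch below assumes it can be approximated by a least-squares similarity transform (scale, rotation, translation) that maps detected landmarks onto a canonical shape. The function names `fit_similarity` and `apply_similarity`, the 5-point canonical shape, and the simulated pose are all hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_similarity(src, dst):
    """Least-squares similarity transform mapping src onto dst.

    src, dst: (N, 2) arrays of corresponding landmarks.
    Returns (scale, R, t) such that dst ~= scale * src @ R.T + t.
    This is a generic Procrustes/Umeyama-style fit, assumed here
    for illustration; it is not the authors' published method.
    """
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - src_mean, dst - dst_mean

    # Cross-covariance between the centred point sets.
    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so R is a rotation, not a reflection.
    d = np.ones(2)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        d[-1] = -1.0
    R = U @ np.diag(d) @ Vt

    var_src = (src_c ** 2).sum() / len(src)
    scale = (S * d).sum() / var_src
    t = dst_mean - scale * R @ src_mean
    return scale, R, t


def apply_similarity(points, scale, R, t):
    """Apply the fitted transform to an (N, 2) array of points."""
    return scale * points @ R.T + t


# Hypothetical 5-point canonical shape (eyes, nose tip, mouth corners).
canonical = np.array([[30.0, 40.0], [70.0, 40.0], [50.0, 60.0],
                      [35.0, 80.0], [65.0, 80.0]])

# Simulated detection: the canonical shape rotated by 30 degrees,
# scaled by 1.7, and shifted.
theta = np.deg2rad(30.0)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
observed = 1.7 * canonical @ rot.T + np.array([12.0, -4.0])

scale, R, t = fit_similarity(observed, canonical)
aligned = apply_similarity(observed, scale, R, t)
print(np.allclose(aligned, canonical))
```

A learning-based variant would presumably have a network predict the transform parameters instead of solving for them in closed form; that part is not sketched here because the abstract gives no details.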

Keywords

Facial landmark localization · Spatial transformation · Canonical shape · Attention · Convolution neural network

Notes

Acknowledgements

This work is supported by the National Science Foundation of China under Grants #61672088 and #61790575, and by the Fundamental Research Funds for the Central Universities under Grant #2018JBZ002. The corresponding author is Yidong Li.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Huifang Li (1)
  • Yidong Li (1, corresponding author)
  • Junliang Xing (2)
  • Hairong Dong (3)
  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
  2. Institute of Automation, Chinese Academy of Sciences, Beijing, China
  3. Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
