Multimedia Tools and Applications, Volume 78, Issue 3, pp 3221–3238

Robust facial landmark extraction scheme using multiple convolutional neural networks

  • Hyungjoon Kim
  • Jisoo Park
  • HyeonWoo Kim
  • Eenjun Hwang (corresponding author)
  • Seungmin Rho


Facial landmarks are a set of features that can be distinguished on the human face with the naked eye. Typical facial landmarks include the eyes, eyebrows, nose, and mouth. Landmarks play an important role in human-related image analysis. For example, they can be used to determine whether there is a human being in an image, to identify who the person is, or to recognize the orientation of a face when taking a photograph. General techniques for detecting facial landmarks can be classified into two groups. One is based on traditional image processing techniques, such as Haar cascade classifiers and edge detection. The other is based on machine learning techniques, in which landmarks are detected by training a neural network on facial features. However, such techniques have shown low accuracy, especially under difficult conditions such as low luminance and overlapping faces. To overcome these problems, we proposed in our previous work a facial landmark extraction scheme using deep learning and semantic segmentation, and demonstrated that, even with a small dataset, the scheme could achieve reasonable facial landmark extraction performance under such conditions. Nevertheless, on a more extensive dataset, we found several exceptional cases in which the scheme could not detect facial landmarks precisely. Hence, in this paper, we revise our facial landmark extraction scheme using a deep learning model called Faster R-CNN and show how the revised scheme improves overall performance by handling such exceptional cases appropriately. We also show how to expand the training dataset by applying image filters and image operations such as rotation for more robust landmark detection.
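
The dataset expansion mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`rotate_box`, `darken`, `augment_labels`), the angle set, and the brightness factor are all assumptions chosen for illustration. The sketch shows one point that any rotation-based augmentation for an object detector has to handle: when an image is rotated, the landmark bounding boxes used as training labels must be rotated with it.

```python
import math


def rotate_point(x, y, cx, cy, angle_deg):
    """Rotate (x, y) about (cx, cy) by angle_deg using the standard 2D rotation matrix."""
    a = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(a) - dy * math.sin(a),
            cy + dx * math.sin(a) + dy * math.cos(a))


def rotate_box(box, image_size, angle_deg):
    """Rotate a box (x_min, y_min, x_max, y_max) about the image center.

    Returns the axis-aligned box that encloses the four rotated corners,
    which is the usual label format for box-based detectors.
    """
    x0, y0, x1, y1 = box
    width, height = image_size
    cx, cy = width / 2.0, height / 2.0
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    rotated = [rotate_point(x, y, cx, cy, angle_deg) for x, y in corners]
    xs = [p[0] for p in rotated]
    ys = [p[1] for p in rotated]
    return (min(xs), min(ys), max(xs), max(ys))


def darken(pixels, factor):
    """Scale 8-bit grayscale pixel values by factor and clamp to [0, 255].

    A factor below 1 roughly simulates a low-luminance capture (hypothetical filter).
    """
    return [[max(0, min(255, int(round(v * factor)))) for v in row]
            for row in pixels]


def augment_labels(box, image_size, angles=(-15, 0, 15)):
    """Return (angle, rotated_box) pairs for a hypothetical set of rotation angles."""
    return [(angle, rotate_box(box, image_size, angle)) for angle in angles]
```

For example, `augment_labels((40, 40, 60, 60), (100, 100))` yields one label per angle for a hypothetical eye box in a 100×100 image. The enclosing box grows slightly for non-zero angles because it must cover the rotated corners. The rotated pixel data itself would come from an image library, which is omitted here.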


Keywords: Convolutional neural networks; Facial landmark; Semantic segmentation; Object detection; Faster R-CNN



This work was supported by the Korea Environment Industry & Technology Institute (KEITI) through the Public Technology Program based on Environmental Policy, funded by the Korea Ministry of Environment (MOE) (2017000210001).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018
corrected publication September/2018

Authors and Affiliations

  • Hyungjoon Kim
    • 1
  • Jisoo Park
    • 1
  • HyeonWoo Kim
    • 1
  • Eenjun Hwang
    • 1
  • Seungmin Rho
    • 2
  1. School of Electrical Engineering, Korea University, Seoul, Republic of Korea
  2. Department of Media Software, Sungkyul University, Anyang, Republic of Korea
