Driver Head Analysis Based on Deeply Supervised Transfer Metric Learning with Virtual Data

Liu, Keke; Liu, Yazhou; Sun, Quansen; Pranata, Sugiri; Shen, Shengmei

doi:10.1007/978-3-319-77383-4_28

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10736)

Included in the following conference series:

Pacific Rim Conference on Multimedia

Abstract

Driver head analysis is of paramount interest for advanced driver assistance systems (ADAS). Most recently proposed methods, especially those based on deep learning, rely on training with labeled samples. However, labeling is a subjective and tedious manual task. The problem is harder still in our application scenario, driver assistance systems, where training data is more difficult to capture. In this paper, we present a rendering pipeline that uses computer 3D animation software to synthesize an annotated virtual-world dataset of driver head poses and facial landmarks, taking into account the driver’s gender, clothing, hairstyle, hats and glasses. This large virtual-world labeled dataset and a small real-world labeled dataset are first trained together using a deeply supervised transfer metric learning method. We treat this as a cross-domain task in which the labeled virtual data is the source domain and the unlabeled real-world data is the target domain. By exploiting the feature self-learning characteristic of deep networks, we find a common feature subspace between the two domains and transfer discriminative knowledge from the labeled source domain to the target domain. Finally, we use a small real-world dataset to fine-tune the model iteratively. Our experiments show high accuracy on real-world driver head images.
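
The abstract does not give the loss function, so the following is only an illustrative sketch of the cross-domain idea it describes, not the authors' formulation. It assumes that the gap between the virtual (source) and real (target) domains is measured as the squared distance between their mean feature vectors, a linear form of maximum mean discrepancy, and that "deeply supervised" means this penalty is applied at every hidden layer. The function names, the `gap_weight` parameter and the scalar `task_loss` placeholder are all hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(features: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(features)
    dim = len(features[0])
    return [sum(f[d] for f in features) / n for d in range(dim)]


def domain_gap(source: List[Vector], target: List[Vector]) -> float:
    """Squared Euclidean distance between the two domain means.

    Assumption: a linear mean-discrepancy measure stands in for whatever
    distribution distance the paper actually uses.
    """
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def total_loss(
    task_loss: float,
    source_layers: List[List[Vector]],
    target_layers: List[List[Vector]],
    gap_weight: float = 0.1,
) -> float:
    """Task loss on labeled source data plus a domain-gap penalty per layer.

    source_layers[k] and target_layers[k] hold the features that layer k
    produces for the source and target batches. Summing the gap over all
    layers is this sketch's reading of deep supervision.
    """
    gap = sum(domain_gap(s, t) for s, t in zip(source_layers, target_layers))
    return task_loss + gap_weight * gap


if __name__ == "__main__":
    virtual = [[0.0, 0.0], [2.0, 2.0]]  # toy features from rendered images
    real = [[1.0, 1.0], [3.0, 3.0]]     # toy features from real images
    print(domain_gap(virtual, real))
    print(total_loss(0.5, [virtual], [real], gap_weight=0.5))
```

Minimizing such a penalty pulls the two feature distributions toward a shared subspace, which is what lets labels that exist only for the virtual data inform predictions on real images.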

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant nos. 61672286 and 61673220.

Author information

Authors and Affiliations

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
Keke Liu, Yazhou Liu & Quansen Sun
Panasonic R&D Center Singapore, Singapore, Singapore
Sugiri Pranata & Shengmei Shen

Corresponding author

Correspondence to Yazhou Liu.

Editor information

Editors and Affiliations

University of Electronic Science and Technology of China, Chengdu, China
Bing Zeng
University of Chinese Academy of Sciences, Beijing, China
Qingming Huang
University of Ottawa, Ottawa, Ontario, Canada
Abdulmotaleb El Saddik
University of Electronic Science and Technology of China, Chengdu, China
Hongliang Li
Chinese Academy of Sciences, Beijing, China
Shuqiang Jiang
Harbin Institute of Technology, Harbin, China
Xiaopeng Fan

About this paper

Cite this paper

Liu, K., Liu, Y., Sun, Q., Pranata, S., Shen, S. (2018). Driver Head Analysis Based on Deeply Supervised Transfer Metric Learning with Virtual Data. In: Zeng, B., Huang, Q., El Saddik, A., Li, H., Jiang, S., Fan, X. (eds) Advances in Multimedia Information Processing – PCM 2017. PCM 2017. Lecture Notes in Computer Science, vol 10736. Springer, Cham. https://doi.org/10.1007/978-3-319-77383-4_28

DOI: https://doi.org/10.1007/978-3-319-77383-4_28
Published: 10 May 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77382-7
Online ISBN: 978-3-319-77383-4
eBook Packages: Computer Science, Computer Science (R0)