Deep Architectures for Face Attributes

  • Conference paper
Computer Vision – ACCV 2016 Workshops (ACCV 2016)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10117)

Abstract

We train a deep convolutional neural network to perform identity classification using a new dataset of public figures annotated with age, gender, ethnicity and emotion labels, and then fine-tune it for attribute classification. An optimal sharing pattern of computational resources within this network is determined by experiment, requiring only 1 GFLOPs to produce all predictions. Rather than fine-tune by re-learning weights in one additional layer after the penultimate layer of the identity network, we try several different depths for each attribute. We find that prediction of age and emotion is improved by fine-tuning from earlier layers onward, presumably because deeper layers are progressively invariant to non-identity-related changes in the input.
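
The trade-off between sharing depth and compute can be illustrated with a small cost model. The sketch below is not the authors' procedure: the per-layer costs, the five-layer depth, the branch points, and the function name `total_mflops` are all hypothetical, chosen only to show how branching an attribute off the shared identity trunk at an earlier layer increases the total cost of producing all predictions. It assumes a single trunk, ignores the cost of the final classifier layers, and assumes that attributes which have branched off do not share layers with one another.

```python
# Illustrative per-layer costs in MFLOPs. These numbers are made up for the
# example and are not taken from the paper.
LAYER_MFLOPS = [200, 300, 250, 150, 100]


def total_mflops(layer_mflops, branch_points):
    """Cost of one forward pass that produces every attribute prediction.

    branch_points maps each attribute to the index of the first layer that is
    re-learned for it; layers below that index are reused from the shared
    identity trunk. An index equal to len(layer_mflops) means the attribute
    reuses every trunk layer.
    """
    total = 0
    for i, cost in enumerate(layer_mflops):
        # The trunk copy of layer i runs once if any attribute still reuses it.
        shared = 1 if any(k > i for k in branch_points.values()) else 0
        # Every attribute that branched at or before layer i runs its own copy.
        private = sum(1 for k in branch_points.values() if k <= i)
        total += cost * (shared + private)
    return total


# Hypothetical sharing pattern: age and emotion branch off earlier than
# gender and ethnicity, which reuse the whole trunk.
branch_points = {"gender": 5, "ethnicity": 5, "age": 3, "emotion": 2}

shared_cost = total_mflops(LAYER_MFLOPS, branch_points)
# Baseline: one fully separate network per attribute (no shared layers).
separate_cost = total_mflops(LAYER_MFLOPS, {name: 0 for name in branch_points})

print(shared_cost, separate_cost)  # 1750 4000
```

Under these assumed numbers, the shared pattern costs 1750 MFLOPs against 4000 MFLOPs for four independent networks. Moving a branch point earlier gives that attribute more re-learned layers at the price of more private computation.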

Notes

  1. http://webscope.sandbox.yahoo.com/catalog.php?datatype=i&did=67

  2. http://wikipedia.org

  3. http://flickr.com

  4. https://en.wikipedia.org/wiki/Ethnic_group

  5. http://www.ethnicelebs.com

  6. http://www.gettyimages.com/

Acknowledgments

We would like to thank Neil O’Hare for collaborating with us on the search engine query logs, and the entire Yahoo Vision and Machine Learning Team.

Author information

Corresponding author

Correspondence to Tobi Baumgartner.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Baumgartner, T., Culpepper, J. (2017). Deep Architectures for Face Attributes. In: Chen, CS., Lu, J., Ma, KK. (eds) Computer Vision – ACCV 2016 Workshops. ACCV 2016. Lecture Notes in Computer Science, vol 10117. Springer, Cham. https://doi.org/10.1007/978-3-319-54427-4_25

  • DOI: https://doi.org/10.1007/978-3-319-54427-4_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-54426-7

  • Online ISBN: 978-3-319-54427-4

  • eBook Packages: Computer Science, Computer Science (R0)
