Abstract
Estimating pain intensity from face images is an important problem in autonomous nursing systems. However, because data sources are limited and pain intensity labels are subjective, it is hard to apply modern deep neural networks to this problem without domain-specific auxiliary design. Inspired by priors from human vision, we propose a novel approach called saliency supervision, in which we directly regularize deep networks to focus on the facial areas that are discriminative for pain regression. By alternating training between the saliency supervision and the global loss, our method learns sparse and robust features, which prove helpful for pain intensity regression. We evaluated saliency supervision with a face-verification network backbone [15] on the widely used UNBC-McMaster Shoulder-Pain dataset [10] and achieved state-of-the-art performance without bells and whistles. Saliency supervision is intuitive in spirit yet effective in performance. We believe it is essential for dealing with ill-posed datasets and has potential in a wide range of vision tasks.
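The alternating schedule mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's network or loss, which the abstract does not specify: it assumes a linear regressor, for which the input-gradient saliency of feature j is simply |w[j]|, and a hypothetical binary `mask` marking the "discriminative" features. The names `alternating_train`, `mask`, and `lam` are illustrative assumptions only.

```python
import random


def predict(w, row):
    """Linear prediction: dot product of weights and one input row."""
    return sum(wi * xi for wi, xi in zip(w, row))


def alternating_train(xs, ys, mask, steps=400, lr=0.05, lam=1.0):
    """Toy alternating schedule (an assumption, not the paper's method).

    Even steps take a gradient step on the global regression loss (MSE).
    Odd steps take a gradient step on a saliency penalty that shrinks
    weights outside the hypothetical mask.
    """
    n, d = len(xs), len(mask)
    w = [0.0] * d
    for step in range(steps):
        if step % 2 == 0:
            # Global loss: mean squared error over all samples.
            grad = [0.0] * d
            for row, target in zip(xs, ys):
                residual = predict(w, row) - target
                for j in range(d):
                    grad[j] += 2.0 * residual * row[j] / n
        else:
            # Saliency penalty: lam * sum of w[j]^2 over features outside the mask.
            grad = [2.0 * lam * w[j] * (1.0 - mask[j]) for j in range(d)]
        w = [w[j] - lr * grad[j] for j in range(d)]
    return w


# Synthetic data: only the first three features carry signal.
random.seed(0)
mask = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
true_w = [1.0, -2.0, 0.5, 0.0, 0.0, 0.0]
xs = [[random.gauss(0.0, 1.0) for _ in mask] for _ in range(200)]
ys = [predict(true_w, row) + random.gauss(0.0, 0.1) for row in xs]
w = alternating_train(xs, ys, mask)
```

In this toy setting the learned weights stay concentrated on the masked features. A deep network would instead need a saliency map computed from its activations or gradients, and the choice of that map is left open here.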
References
Chang, W., Hsu, S., Chien, J.: FATAUVA-Net: an integrated deep learning framework for facial attribute recognition, action unit detection, and valence-arousal estimation. In: IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 1963–1971 (2017)
Ekman, P., Rosenberg, E.: What the face reveals: basic and applied studies of spontaneous expression using the facial action coding system (FACS), 68(1), 83–96 (1997)
Girshick, R.B.: Fast R-CNN (2015). CoRR abs/1504.08083
Hermans, A., Beyer, L., Leibe, B.: In defense of the triplet loss for person re-identification (2017). CoRR abs/1703.07737
Li, W., Abtahi, F., Zhu, Z., Yin, L.: EAC-Net: a region-based deep enhancing and cropping approach for facial action unit detection (2017)
Ma, X., Hovy, E.H.: End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF (2016)
Martinez, D.L., Rudovic, O., Picard, R.: Personalized automatic estimation of self-reported pain intensity from facial expressions. In: IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 2318–2327 (2017)
Miyato, T., Maeda, S.I., Koyama, M., Ishii, S.: Virtual adversarial training: a regularization method for supervised and semi-supervised learning. IEEE Transactions on Pattern Analysis and Machine Intelligence (2017)
Miyato, T., Maeda, S., Koyama, M., Nakae, K., Ishii, S.: Distributional smoothing with virtual adversarial training. Computer Science (2015)
Lucey, P., Cohn, J.F., Prkachin, M., Solomon, P.E., Matthews, I.: Painful data: The UNBC-McMaster shoulder pain expression archive database. In: Face and Gesture 2011, pp. 57–64 (2011)
Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks (2015)
Rubner, Y., Tomasi, C., Guibas, L.J.: The earth mover’s distance as a metric for image retrieval. Int. J. Comput. Vis. 40(2), 99–121 (2000)
Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: a unified embedding for face recognition and clustering (2015). CoRR abs/1503.03832
Wang, F., et al.: Transferring face verification nets to pain and expression regression (2017). CoRR abs/1702.06925
Wen, Y., Zhang, K., Li, Z., Qiao, Y.: A discriminative feature learning approach for deep face recognition. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9911, pp. 499–515. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46478-7_31
Yeh, R.A., Chen, C., Lim, T., Hasegawa-Johnson, M., Do, M.N.: Semantic image inpainting with perceptual and contextual losses (2016). CoRR abs/1607.07539
Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 818–833. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_53
Zeiler, M.D., Krishnan, D., Taylor, G.W., Fergus, R.: Deconvolutional networks. In: Computer Vision and Pattern Recognition, pp. 2528–2535 (2010)
Zhang, K., Zhang, Z., Li, Z., Qiao, Y.: Joint face detection and alignment using multi-task cascaded convolutional networks (2016). CoRR abs/1604.02878
Zhao, R., Gan, Q., Wang, S., Ji, Q.: Facial expression intensity estimation using ordinal information. In: Computer Vision and Pattern Recognition, pp. 3466–3474 (2016)
Zhou, J., Hong, X., Su, F., Zhao, G.: Recurrent convolutional neural network regression for continuous pain intensity estimation in video, pp. 1535–1543 (2016)
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, C., Zhu, Z., Zhao, Y. (2018). Saliency Supervision: An Intuitive and Effective Approach for Pain Intensity Regression. In: Cheng, L., Leung, A., Ozawa, S. (eds) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science(), vol 11307. Springer, Cham. https://doi.org/10.1007/978-3-030-04239-4_41
DOI: https://doi.org/10.1007/978-3-030-04239-4_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04238-7
Online ISBN: 978-3-030-04239-4
eBook Packages: Computer Science, Computer Science (R0)