Abstract
The smile is one of the most common human facial expressions encountered in daily life. Smile recognition can be used in many scenarios, such as emotion monitoring, human-to-robot games, and camera shutter control, and it has therefore received significant attention from researchers. It is an important but challenging problem, particularly in unconstrained scenarios, where variation in face size, illumination conditions, head pose, occlusion, and other factors increases the difficulty. To address this problem, we propose a novel multiple convolutional neural network (CNN) fusion approach in which a face-based CNN and a mouth-based CNN are used to perform smile recognition. We fuse the outputs of the two CNNs using a specified weight and choose the higher-probability result as the final result. Experimental results indicate that the method is effective on a real-world smile dataset (GENKI-4K). The smile recognition rate of the proposed method is improved by 1.6% and 3.3% relative to the face-based CNN and the mouth-based CNN, respectively, and the proposed method outperforms most previous methods.
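The fusion step described above can be illustrated with a minimal sketch. It assumes each CNN outputs a probability per class, that the fusion is a weighted average of those probabilities, and that the class with the highest fused probability is chosen. The function name `fuse_predictions`, the class ordering (index 0 = non-smile, index 1 = smile), and the default weight of 0.5 are hypothetical; the abstract does not state the weight value or the exact fusion formula.

```python
from typing import List, Sequence, Tuple


def fuse_predictions(
    face_probs: Sequence[float],
    mouth_probs: Sequence[float],
    face_weight: float = 0.5,  # placeholder value, not taken from the paper
) -> Tuple[int, List[float]]:
    """Weighted fusion of two classifiers' class probabilities.

    Assumed class order: index 0 = non-smile, index 1 = smile.
    Returns the index of the highest fused probability and the fused list.
    """
    if len(face_probs) != len(mouth_probs):
        raise ValueError("both networks must score the same set of classes")
    if not 0.0 <= face_weight <= 1.0:
        raise ValueError("face_weight must lie in [0, 1]")

    mouth_weight = 1.0 - face_weight
    # Weighted average of the two networks' probabilities, class by class.
    fused = [
        face_weight * f + mouth_weight * m
        for f, m in zip(face_probs, mouth_probs)
    ]
    # Choose the class with the higher fused probability.
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


if __name__ == "__main__":
    # Hypothetical outputs: the face network leans "smile",
    # the mouth network leans "non-smile".
    label, fused = fuse_predictions([0.4, 0.6], [0.7, 0.3], face_weight=0.8)
    print(label, fused)
```

In this sketch the weight controls how much the face-based network is trusted relative to the mouth-based network, so the same pair of outputs can yield a different decision as the weight changes.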
Acknowledgements
This research was supported by the National Science Foundation of China (Grant No. 51605464), the National Basic Research Program of China (973 Program) (2014CB049500), and Research on the Major Scientific Instrument of the National Natural Science Foundation of China (61727809).
Cite this article
Chen, J., Jin, Y., Akram, M.W. et al. Novel multi-convolutional neural network fusion approach for smile recognition. Multimed Tools Appl 78, 15887–15907 (2019). https://doi.org/10.1007/s11042-018-6945-x
DOI: https://doi.org/10.1007/s11042-018-6945-x