Abstract
Eye detection is an important step in a range of applications such as iris and face recognition. In practice, speed is as important as accuracy. In this paper, we propose a very fast eye detection method (1000 fps on a general-purpose PC) that operates on the label map of the raw image and requires no face detection. We first produce the label map of a raw image according to the coordinates of its bounding box. We then train a stacked denoising autoencoder (SDAE) specifically designed to learn the mapping from the raw image to the label map. Finally, an effective post-processing step yields the bounding boxes of the two eyes. Experimental results show that our method is about 2,500 times faster than the deformable part-based model (DPM) while maintaining comparable accuracy. Our method also substantially outperforms the popular LBP+Cascade model in both accuracy and speed.
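To make the first and last steps of the pipeline concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes a binary label map (1 inside an eye bounding box, 0 elsewhere), boxes given as (x0, y0, x1, y1) with exclusive upper bounds, and a simple projection-based post-processing that relies on the two eyes being horizontally separated. The function names `make_label_map` and `boxes_from_label_map`, the threshold of 0.5, and the image size are hypothetical choices made for illustration; the SDAE that would predict the label map from the raw image is omitted.

```python
import numpy as np


def make_label_map(height, width, boxes):
    """Build a binary label map: 1 inside any bounding box, 0 elsewhere.

    boxes: iterable of (x0, y0, x1, y1) with x1 and y1 exclusive (assumed format).
    """
    label = np.zeros((height, width), dtype=np.float32)
    for x0, y0, x1, y1 in boxes:
        label[y0:y1, x0:x1] = 1.0
    return label


def boxes_from_label_map(pred, threshold=0.5, max_boxes=2):
    """Recover up to max_boxes boxes from a (predicted) label map.

    Assumed post-processing, not taken from the paper: threshold the map,
    group active columns into contiguous runs, keep the widest runs, and
    take the tight row extent within each run.
    """
    mask = pred >= threshold
    cols = np.flatnonzero(mask.any(axis=0))
    if cols.size == 0:
        return []
    # Split the active column indices wherever there is a gap.
    runs = np.split(cols, np.flatnonzero(np.diff(cols) > 1) + 1)
    # Keep the widest runs, then restore left-to-right order.
    runs = sorted(runs, key=len, reverse=True)[:max_boxes]
    boxes = []
    for run in sorted(runs, key=lambda r: r[0]):
        x0, x1 = int(run[0]), int(run[-1]) + 1
        rows = np.flatnonzero(mask[:, x0:x1].any(axis=1))
        boxes.append((x0, int(rows[0]), x1, int(rows[-1]) + 1))
    return boxes


# Round trip on a synthetic 32x40 image with two hypothetical eye boxes.
truth = [(8, 12, 16, 18), (24, 12, 32, 18)]
label = make_label_map(32, 40, truth)
recovered = boxes_from_label_map(label)
```

In this sketch the ground-truth label map stands in for the network output, so the recovered boxes match the input boxes exactly. With a real prediction the map would be noisy, and the threshold and run-selection rule would need tuning.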
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tang, W., Huang, Y., Wang, L. (2015). 1000 Fps Highly Accurate Eye Detection with Stacked Denoising Autoencoder. In: Zha, H., Chen, X., Wang, L., Miao, Q. (eds) Computer Vision. CCCV 2015. Communications in Computer and Information Science, vol 547. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-48570-5_23
DOI: https://doi.org/10.1007/978-3-662-48570-5_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-48569-9
Online ISBN: 978-3-662-48570-5
eBook Packages: Computer Science, Computer Science (R0)