Abstract
Existing object detection methods based on deep learning usually need large amounts of training data and handle unseen object classes poorly. In this paper, we propose a new framework that applies one-shot learning to object detection. During training, the network learns from known object classes to compare the similarity of two image parts. For an image of a new category, selective search first generates proposals. A comparison based on traditional features then screens out inaccurate proposals. Next, our deep learning model extracts features and measures similarity through feature fusion (in this paper, concatenating the channels of two feature maps). These steps yield a temporary result. Based on this result and the proposals related to it, we refine the proposals through intersection. We then run a second round of detection on the new proposals to improve accuracy. Experiments on different datasets demonstrate that our method is effective and shows some transferability.
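Two steps named in the abstract, feature fusion by channel concatenation and proposal refinement through intersection, can be illustrated with a minimal sketch. This is not the authors' implementation. The nested-list feature-map layout, the `(x1, y1, x2, y2)` box convention, and the function names `fuse_channels`, `intersect`, and `refine_proposals` are all assumptions made for illustration. The abstract also does not specify how the intersection is applied, so the reading used here (intersecting the temporary result with each related proposal) is one plausible interpretation only.

```python
from typing import List, Optional, Tuple

# Hypothetical conventions, not taken from the paper:
Box = Tuple[int, int, int, int]        # (x1, y1, x2, y2) with x2 > x1, y2 > y1
FeatureMap = List[List[List[float]]]   # indexed as [channel][row][col]


def fuse_channels(query: FeatureMap, candidate: FeatureMap) -> FeatureMap:
    """Fuse two feature maps by concatenating them along the channel axis."""
    def spatial_shape(fm: FeatureMap) -> Tuple[int, int]:
        return (len(fm[0]), len(fm[0][0]))

    if spatial_shape(query) != spatial_shape(candidate):
        raise ValueError("feature maps must share height and width")
    # With a channel-first layout, concatenation is list concatenation.
    return query + candidate


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the overlapping rectangle of two boxes, or None if disjoint."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    if x2 <= x1 or y2 <= y1:
        return None
    return (x1, y1, x2, y2)


def refine_proposals(result: Box, related: List[Box]) -> List[Box]:
    """Assumed refinement: intersect the temporary result with each
    related proposal and keep the distinct non-empty overlaps."""
    refined: List[Box] = []
    for box in related:
        overlap = intersect(result, box)
        if overlap is not None and overlap not in refined:
            refined.append(overlap)
    return refined
```

In this sketch, fusing a 1-channel map with a 2-channel map of the same spatial size gives a 3-channel map, which a comparison network could then score. The refined boxes would serve as the new proposals for the second round of detection.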
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Na, S., Yan, R. (2019). A New Learning-Based One Shot Detection Framework for Natural Images. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Image Processing. ICANN 2019. Lecture Notes in Computer Science(), vol 11729. Springer, Cham. https://doi.org/10.1007/978-3-030-30508-6_8
DOI: https://doi.org/10.1007/978-3-030-30508-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30507-9
Online ISBN: 978-3-030-30508-6
eBook Packages: Computer Science, Computer Science (R0)