Well Begun Is Half Done: Generating High-Quality Seeds for Automatic Image Dataset Construction from Web

  • Yan Xia
  • Xudong Cao
  • Fang Wen
  • Jian Sun
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8692)


We present a fully automatic approach to constructing a large-scale, high-precision dataset from noisy web images. Within the pipeline, we focus on generating high-quality seed images for subsequent dataset growing. As we show, high-quality seeds are essential, yet previous work has paid relatively little attention to generating them automatically. In this work, we propose a density score based on rank-order distance to identify positive seed images. The basic idea is that images relevant to a concept are typically tightly clustered, while outliers are widely scattered. Through adaptive thresholding, we ensure that the selected seeds are as numerous and as accurate as possible. Starting from these high-quality seeds, we grow a high-quality dataset by dividing the seeds and conducting iterative negative and positive mining. Our system can automatically collect thousands of images for one concept/class with a precision of 95% or more. Comparisons with recent state-of-the-art methods also demonstrate our method's superior performance.
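The idea of scoring images by how tightly they sit among their neighbors can be illustrated with a small sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: the rank-order distance follows the commonly cited symmetric, normalized form from the face-tagging literature, the density score is taken as the inverse of the mean rank-order distance to the `k` nearest neighbors, and a fixed, hypothetical `threshold` stands in for the adaptive thresholding step, which is not reproduced here. The function names are likewise hypothetical.

```python
import numpy as np

def neighbor_ranks(features):
    """Return (order, rank): order[a, i] is the i-th nearest neighbor of a,
    and rank[a, b] is the position of b in a's neighbor list."""
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, -1.0)  # force every point to rank itself first
    order = np.argsort(dist, axis=1, kind="stable")
    n = features.shape[0]
    rank = np.empty_like(order)
    rank[np.arange(n)[:, None], order] = np.arange(n)[None, :]
    return order, rank

def rank_order_distance(order, rank, a, b):
    """Symmetric, normalized rank-order distance between points a != b
    (assumed form; not taken from the abstract)."""
    # Sum, over a's neighbors up to and including b, of their ranks in b's list.
    d_ab = rank[b, order[a, : rank[a, b] + 1]].sum()
    d_ba = rank[a, order[b, : rank[b, a] + 1]].sum()
    return (d_ab + d_ba) / min(rank[a, b], rank[b, a])

def density_scores(features, k=5):
    """Hypothetical density score: inverse of the mean rank-order distance
    to the k nearest neighbors. Higher means more tightly clustered."""
    order, rank = neighbor_ranks(features)
    n = features.shape[0]
    scores = np.empty(n)
    for a in range(n):
        neighbors = order[a, 1 : k + 1]  # skip position 0, which is a itself
        mean_rod = np.mean(
            [rank_order_distance(order, rank, a, b) for b in neighbors]
        )
        scores[a] = 1.0 / mean_rod
    return scores

def select_seeds(scores, threshold):
    """Keep indices whose score exceeds a fixed threshold
    (a stand-in for adaptive thresholding, which is not shown here)."""
    return np.flatnonzero(scores > threshold)
```

Given an `(n, d)` array of image features, `density_scores(features, k)` yields one score per image and `select_seeds(scores, threshold)` returns the indices kept as seeds. The pairwise distance matrix makes this sketch quadratic in memory, so it suits small illustrative inputs only.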


Keywords: Synthetic Dataset, Convolutional Neural Network, Seed Image, Deep Neural Network, Adaptive Thresholding
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Yan Xia (1)
  • Xudong Cao (2)
  • Fang Wen (2)
  • Jian Sun (2)
  1. University of Science and Technology of China, China
  2. Microsoft Research Asia, Beijing, China
