Robust visual object clustering and its application to sightseeing spot assessment

  • Min Ge
  • Chenyi Zhuang
  • Qiang Ma


In this paper, we propose a robust visual object clustering approach based on bounding box ranking that discovers the characteristics of objects in real-world datasets containing a large number of noisy images, and we apply it to sightseeing spot assessment. The goal is to develop diverse sightseeing resources from images available on social network services (SNS). Objects that appear frequently in images captured in a given city may represent a characteristic of that city (local culture, architecture, and so on). Such knowledge can be used to discover sightseeing resources from the perspective of the user rather than that of the provider (e.g., a travel agency). However, because the quality of images on SNS varies widely, conventional object discovery methods struggle to identify objects common to several images; the proposed approach is designed for this setting. Extensive experiments on standard and extended benchmarks verified its effectiveness. We also tested the proposed method in an application in which the characteristics of a city (i.e., cultural elements) were discovered from a set of images of that city. Moreover, using the objects discovered from images on SNS, we propose an object-level assessment framework that ranks sightseeing spots by assigning scores, and we verify its performance.


Keywords: Object discovery, Clustering, Sightseeing spot assessment




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Kyoto University, Kyoto, Japan
  2. National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
