Realtime Hierarchical Clustering Based on Boundary and Surface Statistics

  • Dominik Alexander Klein
  • Dirk Schulz
  • Armin Bernd Cremers
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10111)


Visual grouping is a key mechanism in human scene perception. There, it belongs to the subconscious, early processing and is a key prerequisite for other high-level tasks such as recognition. In this paper, we introduce an efficient, realtime-capable algorithm that likewise agglomerates a valuable hierarchical clustering of a scene while using purely local appearance statistics.

To speed up processing, we first subdivide the image into meaningful, atomic segments using a fast Watershed transform. Starting from there, our rapid agglomerative clustering algorithm prunes and maintains the connectivity graph between clusters so that it contains only those pairs that directly touch in the image domain and are reciprocal nearest neighbors (RNN) with respect to a distance metric. The core of this approach is our novel cluster distance: it combines boundary and surface statistics, both in terms of appearance and of spatial linkage. This yields state-of-the-art performance, as we demonstrate in conclusive experiments conducted on the BSDS500 and Pascal-Context datasets.
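The merging scheme described above can be illustrated with a minimal sketch. It assumes the atomic segments and their adjacency are already available (for example from a Watershed transform) and repeatedly merges pairs of touching clusters that are reciprocal nearest neighbors. The names `Region`, `ward_distance` and `rnn_agglomerate` are hypothetical, and the Ward-style distance on mean intensity is only a stand-in: the abstract does not specify the paper's combined boundary and surface distance, so this is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Region:
    size: int    # number of pixels in the cluster
    mean: float  # mean intensity (placeholder for richer appearance statistics)


def ward_distance(a, b):
    # Placeholder cluster distance (Ward-style); NOT the paper's distance.
    return a.size * b.size / (a.size + b.size) * (a.mean - b.mean) ** 2


def merge_regions(a, b):
    size = a.size + b.size
    return Region(size, (a.size * a.mean + b.size * b.mean) / size)


def rnn_agglomerate(regions, adjacency, distance=ward_distance):
    """Merge touching clusters that are reciprocal nearest neighbors.

    regions:   dict mapping cluster id -> Region
    adjacency: dict mapping cluster id -> set of touching cluster ids
    Returns a list of merges (id_a, id_b, new_id, distance) in merge order.
    """
    regions = dict(regions)
    adjacency = {k: set(v) for k, v in adjacency.items()}
    next_id = max(regions) + 1
    merges = []
    while True:
        # Nearest touching neighbor of each cluster; ties broken by smaller id.
        nearest = {}
        for i, nbrs in adjacency.items():
            if nbrs:
                nearest[i] = min(
                    nbrs, key=lambda j: (distance(regions[i], regions[j]), j)
                )
        # Keep only pairs that point at each other.
        pairs = [(i, j) for i, j in nearest.items()
                 if i < j and nearest.get(j) == i]
        if not pairs:
            break
        for i, j in pairs:
            d = distance(regions[i], regions[j])
            k, next_id = next_id, next_id + 1
            neighbours = (adjacency[i] | adjacency[j]) - {i, j}
            # Rewire the neighbors of i and j to the merged cluster k.
            for n in neighbours:
                adjacency[n] -= {i, j}
                adjacency[n].add(k)
            adjacency[k] = neighbours
            regions[k] = merge_regions(regions[i], regions[j])
            for old in (i, j):
                del adjacency[old]
                del regions[old]
            merges.append((i, j, k, d))
    return merges


# Toy example: four segments in a chain 0-1-2-3, two dark and two bright.
regions = {0: Region(10, 10.0), 1: Region(10, 11.0),
           2: Region(10, 50.0), 3: Region(10, 52.0)}
adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
merges = rnn_agglomerate(regions, adjacency)
for m in merges:
    print(m)
```

In this toy chain, the two similar dark segments and the two similar bright segments are merged first, and the resulting clusters are joined last; the recorded merge order forms the hierarchy. Merging every reciprocal pair found in one pass is a simplification chosen for brevity.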


Keywords: Gradient Magnitude, Agglomerative Cluster, Object Candidate, Adjacency List, Runtime Complexity
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Dominik Alexander Klein (1)
  • Dirk Schulz (1)
  • Armin Bernd Cremers (2)

  1. Department of Cognitive Mobile Systems, Fraunhofer FKIE, Wachtberg, Germany
  2. Bonn-Aachen International Center for Information Technology (B-IT), Bonn, Germany
