Adding Discriminative Power to Hierarchical Compositional Models for Object Class Detection

  • Matej Kristan
  • Marko Boben
  • Domen Tabernik
  • Ales Leonardis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7944)


In recent years, hierarchical compositional models have been shown to possess many appealing properties for object class detection, such as coping with a potentially large number of object categories. The reason is that they encode categories by hierarchical vocabularies of parts that are shared among the categories. On the downside, the sharing and the purely reconstructive nature of these models cause problems when categorizing visually similar categories and separating them from the background. In this paper we propose a novel approach that preserves the appealing properties of the generative hierarchical models while improving their discriminative properties. We achieve this by introducing a network of discriminative nodes on top of the existing generative hierarchy. The discriminative nodes are sparse linear combinations of activated generative parts. We show in the experiments that the discriminative nodes consistently improve a state-of-the-art hierarchical compositional model. The results show that our approach considers only a fraction of all nodes in the vocabulary (less than 10%), which also makes the system computationally efficient.
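The idea of a discriminative node as a sparse linear combination of activated generative parts can be illustrated with a small sketch. The abstract does not say how the sparse weights are learned, so the example below substitutes an assumed stand-in: L1-regularized logistic regression fitted by proximal gradient descent (ISTA). All names (`train_sparse_node`, `node_response`, `soft_threshold`) and the parameters `lam`, `lr`, and `iters` are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero, clipping small values to 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def node_response(w, b, x):
    """Response of one discriminative node: weighted sum of part activations plus bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_sparse_node(X, y, lam=0.05, lr=0.5, iters=500):
    """Fit sparse weights over part activations (assumed stand-in, not the paper's method).

    X: list of activation vectors, one entry per generative part.
    y: list of 0/1 labels (1 = target category, 0 = other category or background).
    lam: strength of the L1 penalty that drives most weights to exactly zero.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        # Residuals of the logistic model on every training sample.
        r = [sigmoid(node_response(w, b, x)) - t for x, t in zip(X, y)]
        grad_w = [sum(ri * x[j] for ri, x in zip(r, X)) / n for j in range(d)]
        grad_b = sum(r) / n
        # Gradient step followed by soft-thresholding (the bias is not penalized).
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


if __name__ == "__main__":
    # Toy activations for 6 parts: part 0 fires only for the target category,
    # part 1 is shared by both categories, parts 2-5 never fire.
    X = [[1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
         [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]]
    y = [1, 1, 0, 0]
    w, b = train_sparse_node(X, y)
    print("weights:", w)
    print("parts used:", sum(1 for v in w if v != 0.0), "of", len(w))
```

In this toy setting the node can only separate the two categories through part 0, the one part that is not shared, and parts that never activate keep a weight of exactly zero. Evaluating such a node only touches the parts with non-zero weight, which is one plausible reading of why using a small fraction of the vocabulary keeps the system efficient.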


compositional models, hierarchical models, categorization, discriminative parts



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Matej Kristan (1)
  • Marko Boben (1)
  • Domen Tabernik (1)
  • Ales Leonardis (2, 1)

  1. Faculty of Computer and Information Science, University of Ljubljana, Slovenia
  2. CN-CR Centre, School of Computer Science, University of Birmingham, UK