Learning to Rank Using High-Order Information

  • Puneet Kumar Dokania
  • Aseem Behl
  • C. V. Jawahar
  • M. Pawan Kumar
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8692)


The problem of ranking a set of visual samples according to their relevance to a query plays an important role in computer vision. The traditional approach for ranking is to train a binary classifier such as a support vector machine (svm). Binary classifiers suffer from two main deficiencies: (i) they do not optimize a ranking-based loss function, for example, the average precision (ap) loss; and (ii) they cannot incorporate high-order information such as the a priori correlation between the relevance of two visual samples (for example, two persons in the same image tend to perform the same action). We propose two novel learning formulations that allow us to incorporate high-order information for ranking. The first framework, called high-order binary svm (hob-svm), allows for a structured input. The parameters of hob-svm are learned by minimizing a convex upper bound on a surrogate 0-1 loss function. In order to obtain the ranking of the samples that form the structured input, hob-svm sorts the samples according to their max-marginals. The second framework, called high-order average precision svm (hoap-svm), also allows for a structured input and uses the same ranking criterion. However, in contrast to hob-svm, the parameters of hoap-svm are learned by minimizing a difference-of-convex upper bound on the ap loss. Using a standard, publicly available dataset for the challenging problem of action classification, we show that both hob-svm and hoap-svm outperform the baselines that ignore high-order information.


Support Vector Machine Feature Vector Loss Function Action Class Negative Class 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Behl, A., Jawahar, C.V., Kumar, M.P.: Optimizing average precision using weakly supervised data. In: CVPR (2014)Google Scholar
  2. 2.
    Bourdev, L., Malik, J.: Poselets: Body part detectors trained using 3D human pose annotations. IJCV (2009)Google Scholar
  3. 3.
    Delaitre, V., Laptev, I., Sivic, J.: Recognizing human actions in still images: a study of bag-of-features and part-based representations. In: BMVC (2010)Google Scholar
  4. 4.
    Everingham, M., Gool, L., Williams, C., Winn, J., Zisserman, A.: The pascal visual object classes (voc) challenge. IJCV (2010)Google Scholar
  5. 5.
    Finley, T., Joachims, T.: Training structural SVMs when exact inference is intractable. In: ICML (2008)Google Scholar
  6. 6.
    Franc, V., Savchynskyy, B.: Discriminative learning of max-sum classifiers. JMLR (2008)Google Scholar
  7. 7.
    Horst, R., Thoai, N.: DC programming overview. Journal of Optimization Theory and Applications (1999)Google Scholar
  8. 8.
    Joachims, T., Finley, T., Yu, C.: Cutting-plane training of structural SVMs. Machine Learning (2009)Google Scholar
  9. 9.
    Kohli, P., Torr, P.: Measuring uncertainty in graph cut solutions efficiently computing min-marginal energies using dynamic graph cuts. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. Part II. LNCS, vol. 3952, pp. 30–43. Springer, Heidelberg (2006)Google Scholar
  10. 10.
    Kohli, P., Torr, P.: Dynamic graph cuts for efficient inference in Markov random fields. PAMI (2007)Google Scholar
  11. 11.
    Kolmogorov, V.: Convergent tree-reweighted message passing for energy minimization. PAMI (2006)Google Scholar
  12. 12.
    Kolmogorov, V., Zabih, R.: What energy functions can be minimized via graph cuts. PAMI (2004)Google Scholar
  13. 13.
    Komodakis, N.: Efficient training for pairwise or higher order crfs via dual decomposition. In: CVPR (2011)Google Scholar
  14. 14.
    Komodakis, N., Paragios, N., Tziritas, G.: MRF energy minimization and beyond via dual decomposition. PAMI (2011)Google Scholar
  15. 15.
    Kulesza, A., Pereira, F.: Structured learning with approximate inference. In: NIPS (2007)Google Scholar
  16. 16.
    Maji, S., Bourdev, L., Malik, J.: Action recognition from a distributed representation of pose and appearance. In: CVPR (2011)Google Scholar
  17. 17.
    Oliva, A., Torralba, A.: Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV (2001)Google Scholar
  18. 18.
    Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann (1988)Google Scholar
  19. 19.
    Rosenfeld, N., Meshi, O., Globerson, A., Tarlow, D.: Learning structured models with the AUC loss and its generalizations. In: AISTAT (2014)Google Scholar
  20. 20.
    Taskar, B., Guestrin, C., Koller, D.: Max-margin Markov networks. In: NIPS (2003)Google Scholar
  21. 21.
    Taskar, B., Lacoste-Julien, S., Jordan, M.I.: Structured prediction, dual extragradient and bregman projections. JMLR (2006)Google Scholar
  22. 22.
    Tsochantaridis, I., Hofmann, T., Joachims, T., Altun, Y.: Support vector machine learning for interdependent and structured output spaces. In: ICML (2004)Google Scholar
  23. 23.
    Vapnik, V.: Statistical learning theory. Wiley (1998)Google Scholar
  24. 24.
    Yue, Y., Finley, T., Radlinski, F., Joachims, T.: A support vector method for optimizing average precision. In: SIGIR (2007)Google Scholar
  25. 25.
    Yuille, A., Rangarajan, A.: The concave-convex procedure. Neural Computation (2003)Google Scholar

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Puneet Kumar Dokania
    • 1
  • Aseem Behl
    • 2
  • C. V. Jawahar
    • 2
  • M. Pawan Kumar
    • 1
  1. 1.Ecole Centrale de ParisINRIASaclayFrance
  2. 2.IIITHyderabadIndia

Personalised recommendations