Learning Balanced Trees for Large Scale Image Classification

  • Tien-Dung Mai
  • Thanh Duc Ngo
  • Duy-Dinh Le
  • Duc Anh Duong
  • Kiem Hoang
  • Shin’ichi Satoh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9280)


The label tree is a popular approach to large-scale multi-class image classification, where the number of class labels is large, for example several tens of thousands. In the learning stage, class labels are organized into a hierarchical tree: each internal node is associated with a subset of class labels and a classifier that determines which branch to follow, and each leaf node is associated with a single class label. In the testing stage, a test example travels from the root of the tree to a leaf node, which reduces test time significantly compared with applying multiple binary one-versus-all classifiers. A balanced tree structure is essential to the label tree approach. Previous methods for learning the tree structure use clustering techniques such as k-means or spectral clustering to group easily confused labels into clusters associated with the nodes. However, the resulting tree might not be balanced. We propose a method for learning an effective and balanced tree structure by jointly optimizing the balance constraint and the confusion constraint. Experimental results on datasets such as Caltech-256, SUN-397, and ImageNet-1K show that the proposed approach achieves higher classification accuracy than other state-of-the-art methods.
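
The root-to-leaf traversal described above can be illustrated with a minimal sketch. It is not the method proposed in the paper: the tree here is balanced simply by cutting the label list into near-equal contiguous chunks, with no confusion constraint, and the per-branch classifier is replaced by a hypothetical `score` callable supplied by the caller. The names `Node`, `build_balanced_tree`, and `predict` are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple


@dataclass
class Node:
    labels: List[int]  # class labels reachable from this node
    children: List["Node"] = field(default_factory=list)


def build_balanced_tree(labels: Sequence[int], branching: int = 2) -> Node:
    """Recursively split labels into near-equal groups (balance only).

    Assumption: labels are grouped by position, not by confusion, so this
    shows the balance constraint alone.
    """
    node = Node(list(labels))
    if len(labels) <= 1:
        return node  # leaf: a single class label
    k = min(branching, len(labels))
    size, extra = divmod(len(labels), k)  # chunk sizes differ by at most one
    start = 0
    for i in range(k):
        end = start + size + (1 if i < extra else 0)
        node.children.append(build_balanced_tree(labels[start:end], branching))
        start = end
    return node


def predict(root: Node, score: Callable[[Node], float]) -> Tuple[int, int]:
    """Follow the highest-scoring child from the root to a leaf.

    `score` is a hypothetical stand-in for the classifier attached to each
    branch. Returns the predicted label and the number of score evaluations.
    """
    node, evals = root, 0
    while node.children:
        scores = [score(child) for child in node.children]
        evals += len(scores)
        node = node.children[scores.index(max(scores))]
    return node.labels[0], evals


# Toy usage: an oracle scorer that always prefers the branch holding label 700.
root = build_balanced_tree(list(range(1024)), branching=2)
label, evals = predict(root, lambda child: 1.0 if 700 in child.labels else 0.0)
print(label, evals)  # 700 20
```

With 1024 labels and a branching factor of 2, the balanced tree has depth 10, so a prediction needs 20 score evaluations, whereas a one-versus-all scheme would evaluate one classifier per label. An unbalanced tree over the same labels could be much deeper along some paths, which is one way to see why balance matters for test time.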


Keywords: Leaf node, Class label, Child node, Spectral cluster, Balance constraint



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Tien-Dung Mai (1)
  • Thanh Duc Ngo (1)
  • Duy-Dinh Le (1, 2)
  • Duc Anh Duong (1)
  • Kiem Hoang (1)
  • Shin’ichi Satoh (2)
  1. University of Information Technology, VNU-HCM, Ho Chi Minh City, Vietnam
  2. National Institute of Informatics, Chiyoda-ku, Tokyo, Japan
