Learning Balanced Trees for Large Scale Image Classification

Mai, Tien-Dung; Ngo, Thanh Duc; Le, Duy-Dinh; Duong, Duc Anh; Hoang, Kiem; Satoh, Shin’ichi

doi:10.1007/978-3-319-23234-8_1

Conference paper
First Online: 01 January 2015

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9280)

Abstract

The label tree is a popular approach to large-scale multi-class image classification, in which the number of class labels is large, for example several tens of thousands. In the learning stage, class labels are organized into a hierarchical tree: each node is associated with a subset of class labels and a classifier that determines which branch to follow, and each leaf node is associated with a single class label. In the testing stage, a test example travels from the root of the tree to a leaf node, which reduces the test time significantly compared with using multiple binary one-versus-all classifiers. The balance of the learned tree structure is essential to the label tree approach. Previous methods for learning the tree structure use clustering techniques such as k-means or spectral clustering to group easily confused labels into clusters associated with the nodes. However, the output tree might not be balanced. We propose a method for learning an effective and balanced tree structure by jointly optimizing the balance constraint and the confusion constraint. Experimental results on datasets such as Caltech-256, SUN-397, and ImageNet-1K show that the proposed approach achieves higher classification accuracy than other state-of-the-art methods.
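
The test-time saving described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the tree is built by simply halving an ordered label list (a stand-in for the learned, balanced grouping), each class is represented by a hypothetical one-dimensional centroid, and the branch classifier is replaced by a nearest-mean score. The names `Node`, `build_balanced_tree`, `branch_score`, and `predict` are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    labels: List[int]                          # subset of class labels at this node
    children: List["Node"] = field(default_factory=list)


def build_balanced_tree(labels: List[int]) -> Node:
    """Recursively split the label set into two halves of (near) equal size.

    Assumption: the label order already places confusable classes next to
    each other; a learned method would choose the grouping itself.
    """
    node = Node(labels=list(labels))
    if len(labels) > 1:
        mid = len(labels) // 2
        node.children = [build_balanced_tree(labels[:mid]),
                         build_balanced_tree(labels[mid:])]
    return node


def branch_score(x: float, child: Node, centroids: Dict[int, float]) -> float:
    """Stand-in for a learned branch classifier: negative distance from x
    to the mean centroid of the child's label subset."""
    mean = sum(centroids[c] for c in child.labels) / len(child.labels)
    return -abs(x - mean)


def predict(root: Node, x: float, centroids: Dict[int, float]) -> Tuple[int, int]:
    """Travel from the root to a leaf; return (label, number of score evaluations)."""
    node, evaluations = root, 0
    while node.children:
        scores = [branch_score(x, child, centroids) for child in node.children]
        evaluations += len(scores)
        node = node.children[scores.index(max(scores))]
    return node.labels[0], evaluations


# Toy setup (hypothetical data): class k has centroid k on the real line.
num_classes = 1024
centroids = {k: float(k) for k in range(num_classes)}
root = build_balanced_tree(list(range(num_classes)))

label, evaluations = predict(root, 700.2, centroids)
print(label, evaluations, "vs one-versus-all:", num_classes)
```

Under these assumptions the balanced binary tree has 10 levels, so a test example triggers 20 branch-score evaluations, whereas a one-versus-all scheme would evaluate all 1024 classifiers. An unbalanced tree over the same labels could be much deeper along some paths, which is why balance matters for test time.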

Author information

Authors and Affiliations

University of Information Technology, VNU-HCM, Ho Chi Minh City, Vietnam
Tien-Dung Mai, Thanh Duc Ngo, Duy-Dinh Le, Duc Anh Duong & Kiem Hoang
National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, Japan
Duy-Dinh Le & Shin’ichi Satoh

Corresponding author

Correspondence to Tien-Dung Mai.

Editor information

Editors and Affiliations

Pattern Analysis and Computer Vision, Istituto Italiano di Tecnologia (IIT), Genoa, Italy
Vittorio Murino
Università di Genova, Genoa, Italy
Enrico Puppo

About this paper

Cite this paper

Mai, TD., Ngo, T.D., Le, DD., Duong, D.A., Hoang, K., Satoh, S. (2015). Learning Balanced Trees for Large Scale Image Classification. In: Murino, V., Puppo, E. (eds) Image Analysis and Processing — ICIAP 2015. ICIAP 2015. Lecture Notes in Computer Science, vol 9280. Springer, Cham. https://doi.org/10.1007/978-3-319-23234-8_1

DOI: https://doi.org/10.1007/978-3-319-23234-8_1
Published: 21 August 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23233-1
Online ISBN: 978-3-319-23234-8
eBook Packages: Computer Science, Computer Science (R0)
