In order to improve the accuracy of image classification problem, this paper proposes a new classification method based on image feature extraction and neural network. The method consists of two stages: image feature extraction and neural network classification. At the stage of feature extraction, spatial pyramid matching (SPM) feature, local position feature and global contour feature are extracted. The utilization of spatial information in SPM feature is effectively improved by combining the above three features. At the stage of neural network classification, a multi-hidden layer feedforward Histogram Intersection Kernel Weighted Learning Network (HWLN) is proposed to take advantage of three features to improve the classification accuracy. In the structure of network, the hidden layer output features are used as the bias of the input features to weight the coefficients of the features. The input layer and the hidden layers are directly connected with the output layer to realize the combination of linear mapping and nonlinear mapping. And the Histogram Intersection Kernel is used instead of random initialization of the input weight matrix. Taking combined features as input information, HWLN can realize the mapping relationship between input information and target category, so as to complete the image classification task. Extensive experiments are performed on Caltech 101, Caltech 256 and MSRC databases respectively. The experimental results show that the proposed method further utilizes the spatial information, and thus improves the accuracy of image classification.
This is a preview of subscription content, access via your institution.
Buy single article
Instant access to the full article PDF.
Tax calculation will be finalised during checkout.
Subscribe to journal
Immediate online access to all issues from 2019. Subscription will auto renew annually.
Tax calculation will be finalised during checkout.
Ahmed KT, Irtaza A, Iqbal MA (2017) Fusion of local and global features for effective image extraction. Appl Intell 47(2):526–543
Anwar H, Zambanini S, Kampel M (2014) Encoding spatial arrangements of visual words for rotation-invariant image classification. In: German Conference on Pattern Recognition, 443–452
Avila S, Thome N, Cord M et al (2013) Pooling in image representation:the visual Codeword point of view. Comput Vis Image Underst 117(5):453–465. https://doi.org/10.1016/j.cviu.2012.09.007
Boiman O, Shechtman E, Irani M (2008) In defense of nearest-neighbor based image classification. In: IEEE Conference on Computer Vision and Pattern Recognition. CVPR. 1–8
Bosch A, Zisserman A, Munoz X (2007) Image classification using random forests and ferns. In: Computer Vision, ICCV, pp 1–8
Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2(3):389–396. https://doi.org/10.1145/1961189.1961199
Csurka G, Dance CR, Fan L, Willamowski J, Bray C (2004) Visual categorization with bags of keypoints. In Workshop on statistical learning in computer vision, ECCV, 1–22
Cunha ALD, Zhou JP, Do MN (2006) The nonsubsampled contourlet transform: theory, design, and applications. IEEE Trans Image Process 15(10):3089–3101. https://doi.org/10.1109/TIP.2006.877507
Deng WY, Ong YS, Zheng QH (2016) A fast reduced kernel extreme learning machine. Neural Netw 76:29–38
Frome A, Singer Y, Malik J (2007) Image retrieval and classification using local distance functions. In: Advances in neural information processing systems, 417–424
Frome A, Singer Y, Sha F, Malik J (2007) Learning globally-consistent local distance functions for shape-based image retrieval and classification. IEEE International Conference on Computer Vision
Goh H, Thome N, Cord M, Lim JH (2014) Learning deep hierarchical visual feature coding. IEEE Trans Neural Netw Learn Syst 25(12):2212–2225
Grauman K, Darrell T (2005) The pyramid match kernel: Discriminative classification with sets of image features. International Conference on Computer Vision. 1458–1465
Gui J, Liu T, Tao D, Tan T (2016) Representative vector machines: a unified framework for classical classifiers. IEEE Trans Cybernet 46(8):1877–1888
Hu J, Shen L, Sun G (2017) Squeeze-and-Excitation Networks. arXiv preprint arXiv:1709.01507
Huang GB, Zhu QY, Siew CK (2004) Extreme learning machine: a new learning scheme of feedforward neural networks. In: Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on, 2004. IEEE, 985–990
Huang GB, Zhu QY, Siew CK (2006) Extreme learning machine: theory and applications. Neurocomputing 70(1–3):489–501
Huang FJ, Boureau YL, LeCun Y (2007) Unsupervised learning of invariant feature hierarchies with applications to object recognition. In: Computer Vision and Pattern Recognition, CVPR, 1–8
Huang GB, Zhou H, Ding X, Zhang R (2012) Extreme learning machine for regression and multiclass classification. IEEE Trans Syst Man Cybern Part B Cybern 42(2):513–529
Jégou H, Douze M, Schmid C, Pérez P (2010) Aggregating local descriptors into a compact image representation. In Computer Vision and Pattern Recognition. CVPR. 3304–3311
Juneja M, Vedaldi A, Jawahar CV, Zisserman A (2013) Blocks that shout: Distinctive parts for scene classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 923–930
Khan R, Barat C, Muselet D, Ducottet C (2015) Spatial histograms of soft pairwise similar patches to improve the bag-of-visual-words model. Comput Vision Image Understand 132:102–112. https://doi.org/10.1016/j.cviu.2014.09.005
Koniusz P, Yan F, Gosselin P, Mikolajczyk K (2017) Higher-order occurrence pooling for bags-of-words: visual concept detection. IEEE Trans Pattern Anal Mach Intell 39(2):313–326. https://doi.org/10.1109/TPAMI.2016.2545667
Lazebnik S, Schmid C, Ponce J (2006) Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. IEEE Computer Society Conference on Computer Vision and Pattern Recognition New York, 2169–2178
Li G, Niu P, Duan X, Zhang X (2014) Fast learning network: a novel artificial neural network with a fast learning speed. Neural Comput & Applic 24(7–8):1683–1695
Li WS, Dong P, Xiao B, Zhou L (2016) Object recognition based on the region of interest and optimal bag of words model. Neurocomputing 172(8):271–280. https://doi.org/10.1016/j.neucom.2015.01.083
Li YQ, Wu C, Li HB (2018) Image classification method combining local position feature with global contour feature[J]. Acta Electron Sin 46(7):1726–1731. https://doi.org/10.3969/j.issn.0372-2112.2018.07.026
Li Q, Peng Q, Chen J, Yan C (2018) Improving image classification accuracy with ELM and CSIFT. Comput Sci Eng 99:1–1
Liu LQ, Wang L, Liu XW (2011) In defense of soft-assignment coding. Proceedings of the International Conference on Computer Vision. 2486–2493. https://doi.org/10.1109/CVPR.2010.5540039
Mansourian L, Abdullah MT, Abdullah LN, Azman A, Mustaffa MR, Applications (2018) An effective fusion model for image retrieval. Multimed Tools Appl, 77 (13):16131–16154
Microsoft Research Cambridge Object Recognition Image Database, https://www.microsoft.com/en-us/download/details.aspx?id=52644
Nilsback ME, Zisserman A (2008) Automated flower classification over a large number of classes. In: Computer Vision, Graphics & Image Processing, 722–729
Perronnin F, Dance C (2007) Fisher kernels on visual vocabularies for image categorization. In: IEEE conference on computer vision and pattern recognition, 1–8
Perronnin F, Sánchez J, Mensink T (2010) Improving the fisher kernel for large-scale image classification. In: European conference on computer vision, 143–156
Sánchez J, Perronnin F, Mensink T, Verbeek J (2013) Image classification with the fisher vector: theory and practice. Int J Comput Vis 105(3):222–245
Van Gemert JC, Veenman CJ, Smeulders AW, Geusebroek JM (2009) Visual word ambiguity. IEEE Trans Patt Anal Mach Intell 32(7):1271–1283
Wang JY, Yang JC, Yu K, et al (2010) Locality-constrained linear coding for image classification. IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 3360–3367
Wang S, Lu J, Gu X, Yang J (2016) Semi-supervised linear discriminant analysis for dimension reduction and classification. Pattern Recogn 57:179–189
Xiong W, Zhang L, Du B, Tao D (2017) Combining local and global: rich and robust feature pooling for visual recognition. Pattern Recogn 62:225–235
Zafar B, Ashraf R, Ali N, Ahmed M, Jabbar S, Chatzichristofis SA (2018) Image classification by addition of spatial information based on histograms of orthogonal vectors. PLoS One 13(6):e0198175
Zhu QH, Wang ZZ, Mao XJ, Yang YB (2017) Spatial locality-preserving feature coding for image classification. Appl Intell 47(1):148–157
Zou J, Li W, Chen C, Du Q (2016) Scene classification using local and global features with collaborative representation fusion. Inf Sci 348:209–226
This work is supported by National Natural Science Foundation of China (No. 51641609), Natural Science Foundation of Hebei Province of China (No. F2015203212).
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Wu, C., Li, Y., Zhao, Z. et al. Image classification method rationally utilizing spatial information of the image. Multimed Tools Appl 78, 19181–19199 (2019). https://doi.org/10.1007/s11042-019-7254-8
- Spatial pyramid matching
- Local position feature
- Global contour feature
- Neural network
- Histogram intersection kernel weighted learning network