Scene Classification Based on Local Binary Pattern and Improved Bag of Visual Words

  • Gholam Ali Montazer (corresponding author)
  • Davar Giveki
  • Mohammad Ali Soltanshahi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9094)


Image classification is one of the most important and challenging tasks in computer vision. This paper presents a new image classification method based on the bag-of-visual-words (BoVW) model and Local Binary Patterns (LBP). The BoVW model has proven very efficient for image classification and image retrieval. However, most proposals use local features extracted directly from the image and ignore additional information that the image could provide. To address this, we propose a novel image classification method that uses information extracted from the individual channels of the image as well as from its grayscale version. In this way, more discriminative information is extracted from the image, and the resulting BoVW model yields highly discriminative features that considerably increase classification performance. In this work, we embed features extracted with LBP into the BoVW model to construct the proposed scene classification model. We choose LBP as the image feature descriptor because most scene images contain textural information, which makes LBP a better fit than other popular image features such as the Scale Invariant Feature Transform (SIFT), which fails to capture image information in homogeneous areas or textual images. Experiments on the Oliva and Torralba (OT) dataset demonstrate the effectiveness of the proposed method.
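
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: compute LBP codes on each colour channel and on a grayscale version, describe small patches by their LBP histograms, and quantise those descriptors against a visual vocabulary. The basic 3x3 LBP variant, the patch size, the averaging used for grayscale, the pooling of all channels into one histogram, and every function name here are assumptions made for illustration, not the authors' method. The codebook is assumed to be given (in practice it would typically be learned by clustering training descriptors).

```python
from typing import List, Sequence

Channel = List[List[int]]  # one image channel as rows of pixel intensities

# The 8 neighbours of a pixel, clockwise from top-left (an arbitrary fixed order).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(channel: Channel) -> Channel:
    """Basic 3x3 LBP code (0..255) for every interior pixel of one channel."""
    h, w = len(channel), len(channel[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            centre = channel[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(NEIGHBOURS):
                # Set the bit when the neighbour is at least as bright as the centre.
                if channel[y + dy][x + dx] >= centre:
                    code |= 1 << bit
            row.append(code)
        out.append(row)
    return out


def patch_descriptors(channel: Channel, patch: int = 8) -> List[List[float]]:
    """Normalised 256-bin LBP histogram for each non-overlapping patch."""
    codes = lbp_codes(channel)
    if not codes or not codes[0]:
        return []
    descs = []
    for y in range(0, len(codes) - patch + 1, patch):
        for x in range(0, len(codes[0]) - patch + 1, patch):
            hist = [0.0] * 256
            for row in codes[y:y + patch]:
                for c in row[x:x + patch]:
                    hist[c] += 1.0
            descs.append([v / (patch * patch) for v in hist])
    return descs


def bovw_histogram(descriptors: Sequence[Sequence[float]],
                   codebook: Sequence[Sequence[float]]) -> List[float]:
    """Assign each descriptor to its nearest visual word and count the words."""
    counts = [0] * len(codebook)
    for d in descriptors:
        dists = [sum((a - b) ** 2 for a, b in zip(d, word)) for word in codebook]
        counts[dists.index(min(dists))] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


def describe_image(r: Channel, g: Channel, b: Channel,
                   codebook: Sequence[Sequence[float]],
                   patch: int = 8) -> List[float]:
    """BoVW histogram over LBP descriptors pooled from R, G, B and grayscale."""
    # Grayscale as a plain integer average of the channels (an assumption).
    gray = [[(rv + gv + bv) // 3 for rv, gv, bv in zip(rr, gr, br)]
            for rr, gr, br in zip(r, g, b)]
    descriptors = []
    for channel in (r, g, b, gray):
        descriptors.extend(patch_descriptors(channel, patch))
    return bovw_histogram(descriptors, codebook)
```

The vector returned by `describe_image` could then be passed to any standard classifier. Pooling the four channels into one histogram is only one option; concatenating one histogram per channel would be an equally plausible reading of the abstract.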


Image classification, Bag of visual words, LBP, Scene classification, Bag of features





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Gholam Ali Montazer (1, corresponding author)
  • Davar Giveki (2)
  • Mohammad Ali Soltanshahi (3)
  1. Department of Information Technology Engineering, School of Engineering, Tarbiat Modares University, Tehran, Iran
  2. Department of Information Technology, Iranian Research Institute for Information Science and Technology (Iran Doc), Tehran, Iran
  3. Department of Computer Science, University of Tehran, Tehran, Iran
