ColorNet: Investigating the Importance of Color Spaces for Image Classification

  • Shreyank N. Gowda
  • Chun Yuan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11364)


Image classification is a fundamental application in computer vision. Recently, deeper and more densely connected networks have shown state-of-the-art performance on image classification tasks. Most current datasets consist of color images, which are fed to the network as RGB images and classified without modification. We explore the importance of color spaces and show that color spaces (essentially transformations of the original RGB images) can significantly affect classification accuracy. Further, we show that certain classes of images are better represented in particular color spaces, and that for datasets with many varied classes, such as CIFAR and ImageNet, a model that considers multiple color spaces within the same model gives excellent levels of accuracy. We also show that such a model, where the input is preprocessed into multiple color spaces simultaneously, needs far fewer parameters to obtain high classification accuracy. For example, our model with 1.75M parameters significantly outperforms DenseNet 100-12, which has 12M parameters, and gives results comparable to DenseNet-BC-190-40, which has 25.6M parameters, on four competitive image classification datasets: CIFAR-10, CIFAR-100, SVHN and ImageNet. Our model takes an RGB image as input, simultaneously converts it into 7 different color spaces and uses these as inputs to individual DenseNets. We use small and wide DenseNets to reduce the computational overhead and the number of hyperparameters required. We also obtain significant improvements over current state-of-the-art results on these datasets.
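The preprocessing step described above, in which one RGB input is converted into several color spaces that each feed a separate network, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the helper name `to_color_spaces`, the nested-list image format, and the particular spaces chosen (those available in Python's standard `colorsys` module) are all assumptions. The abstract does not name its 7 color spaces, so the set below should not be read as the paper's.

```python
import colorsys

# Hypothetical helper: the name, the image format, and the set of color
# spaces are illustrative assumptions, not taken from the paper.
def to_color_spaces(image):
    """Convert an RGB image into several color spaces.

    image: list of rows, each row a list of (r, g, b) floats in [0, 1].
    Returns a dict mapping a color-space name to an image of the same shape.
    """
    converters = {
        "rgb": lambda r, g, b: (r, g, b),   # identity: keep the original input
        "hsv": colorsys.rgb_to_hsv,
        "hls": colorsys.rgb_to_hls,
        "yiq": colorsys.rgb_to_yiq,
    }
    return {
        name: [[convert(*pixel) for pixel in row] for row in image]
        for name, convert in converters.items()
    }


# A 1x2 image: one pure red pixel and one black pixel.
image = [[(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
spaces = to_color_spaces(image)
for name, converted in spaces.items():
    print(name, converted)
```

Each entry of the returned dictionary would then serve as the input to its own network. A practical pipeline would more likely operate on whole arrays with an image-processing library than pixel by pixel.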


Keywords: Color spaces, DenseNet, Fusion


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Computer Science Department, Tsinghua University, Beijing, China
  2. Graduate School at Shenzhen, Tsinghua University, Shenzhen, China