Tensor Deep Stacking Networks and Kernel Deep Convex Networks for Annotating Natural Scene Images

  • Niharjyoti Sarangi
  • C. Chandra Sekhar
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9493)


Image annotation is the task of assigning semantically relevant tags to an image. Many machine learning algorithms for this task rely on features such as color, texture, and shape, and their success depends on carefully handcrafted features. Deep learning models instead use multiple layers of processing to learn abstract, high-level representations from raw data. Deep belief networks, the most commonly used deep learning models, are formed by pre-training individual Restricted Boltzmann Machines in a layer-wise fashion, stacking them together, and then training the resulting network using error back-propagation. However, training such a model takes considerable time. To reduce training time, models have been proposed that eliminate back-propagation by using convex optimization and the kernel trick to obtain a closed-form solution for the connection weights. In this paper we explore two such models, the Tensor Deep Stacking Network and the Kernel Deep Convex Network, for automatic image annotation. We use a deep convolutional network to extract high-level features from different sub-regions of the images, and then use these features as inputs to these models. The performance of the proposed approach is evaluated on benchmark image datasets.
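The closed-form training idea behind deep stacking networks can be sketched as follows. This is a minimal illustration, not the paper's implementation: the hidden-layer weights `W` are drawn at random here (the models discussed typically initialize them from pre-trained RBMs), `n_hidden` and the ridge parameter `lam` are arbitrary illustrative choices, and the function names are hypothetical. Only the output weights `U` are learned, via a convex least-squares problem with a closed-form solution, so no back-propagation is required; modules are stacked by concatenating each module's predictions with the raw input.

```python
import numpy as np

def dsn_module(X, T, n_hidden=64, lam=1e-2, seed=0):
    """One stacking-network module: fixed hidden weights, closed-form output weights.

    X : (n_samples, n_features) input matrix
    T : (n_samples, n_classes) target matrix (e.g. one-hot tags)
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))   # fixed hidden-layer weights
    H = 1.0 / (1.0 + np.exp(-X @ W))                  # sigmoid hidden activations
    # Convex ridge-regression problem for the output weights,
    # solved in closed form: U = (H^T H + lam*I)^{-1} H^T T
    U = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ T)
    return H @ U                                      # module predictions

def stack_modules(X, T, n_modules=3):
    """Stack modules: each one sees the raw input plus the previous predictions."""
    inp = X
    preds = None
    for k in range(n_modules):
        preds = dsn_module(inp, T, seed=k)
        inp = np.hstack([X, preds])
    return preds
```

A kernel variant replaces the explicit hidden layer with a kernel matrix, so the module dimensionality no longer needs to be chosen by hand; the tensor variant instead uses a bilinear (two-branch) hidden layer. Both keep the same closed-form estimation of the output weights.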


Keywords: Image annotation · Tensor deep stacking networks · Kernel deep convex networks · Deep convolutional network · Deep learning



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, India
