Abstract
Recent advances in deep learning models have demonstrated remarkable accuracy in object classification. However, limitations of Convolutional Neural Networks, such as the need for large collections of labeled training data and a supervised learning process, have motivated work on enhanced feature representations and unsupervised models.
In this paper we propose a novel unsupervised sparsity-based model using Independent Subspace Analysis (ISA) to implement a hierarchical network for feature extraction. The results of our empirical evaluation demonstrate improved classification accuracy when max pooling is paired with square pooling within each layer. In addition to accuracy, we show that this pairing also reduces the data dimensionality within the layers, outperforming known sparsity-based models.
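The two pooling stages the abstract pairs can be sketched as follows. This is a minimal illustrative NumPy sketch, not the paper's implementation: the filter matrix `W` is assumed to be pre-learned by ISA, and the subspace size of 2 is an illustrative choice. Square pooling takes the L2 norm of filter responses within each independent subspace; max pooling then downsamples the resulting feature map spatially.

```python
import numpy as np

def isa_square_pool(x, W, subspace_size=2):
    """Square pooling: energy (L2 norm) within each ISA subspace.

    x: whitened input patch, shape (dim,)
    W: ISA filter matrix (assumed pre-learned), shape (n_units, dim),
       with n_units a multiple of subspace_size.
    Returns one non-negative energy per subspace.
    """
    s = W @ x  # simple-unit (linear filter) responses
    return np.sqrt((s.reshape(-1, subspace_size) ** 2).sum(axis=1))

def spatial_max_pool(fmap, k=2):
    """Max pooling over non-overlapping k x k spatial windows.

    fmap: feature map, shape (h, w, channels).
    """
    h, w, c = fmap.shape
    cropped = fmap[:h - h % k, :w - w % k]  # drop edge rows/cols not filling a window
    return cropped.reshape(h // k, k, w // k, k, c).max(axis=(1, 3))
```

In a hierarchical network these operations would be applied per layer: square pooling over each subspace's filter responses, then spatial max pooling before feeding the next layer.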
R. Nath—Research performed whilst the author was at the University of Reading.
Notes
- 1.
Tested on 10 object categories from the CalTech101 dataset [1]: airplane, bonsai, butterfly, car-side, chandelier, faces, ketch, leopards, motorbikes, and watch.
References
Fei-Fei, L., Fergus, R., Perona, P.: Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories. Comput. Vis. Image Underst. 106(1), 59–70 (2007). https://doi.org/10.1016/j.cviu.2005.09.012
Baddeley, R., et al.: Responses of neurons in primary and inferior temporal visual cortices to natural scenes. Proc. Biol. Sci. 264(1389), 1775–1783 (1997). http://www.jstor.org/stable/51114
Hu, X., Zhang, J., Li, J., Zhang, B.: Sparsity-regularized HMAX for visual recognition. PLoS One 9(1), e81813 (2014). http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0081813
Hyvärinen, A., Hoyer, P.: Emergence of phase- and shift-invariant features by decomposition of natural images into independent feature subspaces. Neural Comput. 12(7), 1705–1720 (2000)
Hyvärinen, A., Hoyer, P.O., Inki, M.: Topographic independent component analysis. Neural Comput. 13(7), 1527–1558 (2001). https://doi.org/10.1162/089976601750264992
Hyvärinen, A., Hurri, J., Hoyer, P.O.: Natural Image Statistics: A Probabilistic Approach to Early Computational Vision. Springer, Heidelberg (2009). https://doi.org/10.1007/978-1-84882-491-1
Hyvärinen, A., Köster, U.: Complex cell pooling and the statistics of natural images. Netw. Comput. Neural Syst. 18(2), 81–100 (2007). https://doi.org/10.1080/09548980701418942
Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. CoRR abs/1502.03167 (2015). http://arxiv.org/abs/1502.03167
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Pereira, F., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 25, pp. 1097–1105. Curran Associates Inc. (2012). http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
Le, Q.V., Zou, W.Y., Yeung, S.Y., Ng, A.Y.: Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In: Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2011, pp. 3361–3368. IEEE Computer Society, Washington, DC (2011). https://doi.org/10.1109/CVPR.2011.5995496
Le, Q., et al.: Building high-level features using large scale unsupervised learning (2012). http://research.google.com/pubs/pub38115.html
Lee, H., Grosse, R., Ranganath, R., Ng, A.Y.: Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In: Proceedings of the 26th Annual International Conference on Machine Learning, ICML 2009, pp. 609–616. ACM, New York (2009). https://doi.org/10.1145/1553374.1553453
Mutch, J., Lowe, D.G.: Object class recognition and localization using sparse features with limited receptive fields. Int. J. Comput. Vis. 80(1), 45–57 (2008). http://link.springer.com/article/10.1007/s11263-007-0118-0
Olshausen, B.A., Field, D.J.: Sparse coding with an overcomplete basis set: a strategy employed by V1? Vis. Res. 37(23), 3311–3325 (1997)
Riesenhuber, M., Poggio, T.: Hierarchical models of object recognition in cortex. Nat. Neurosci. 2(11), 1019–1025 (1999). https://doi.org/10.1038/14819. PMID: 10526343
Rolls, E.T.: Invariant visual object and face recognition: neural and computational bases, and a model, VisNet. Front. Comput. Neurosci. 6, 35 (2012). https://doi.org/10.3389/fncom.2012.00035. PMID: 22723777
Rolls, E.T., Treves, A.: The neuronal encoding of information in the brain. Prog. Neurobiol. 95(3), 448–490 (2011). http://www.sciencedirect.com/science/article/pii/S030100821100147X
Serre, T., Wolf, L., Poggio, T.: Object recognition with features inspired by visual cortex. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, vol. 2, pp. 994–1000 (2005). https://doi.org/10.1109/CVPR.2005.254
Serre, T.: Hierarchical models of the visual system. In: Jaeger, D., Jung, R. (eds.) Encyclopedia of Computational Neuroscience, pp. 1–12. Springer, New York (2014). https://doi.org/10.1007/978-1-4614-7320-6_345-1
Serre, T., Oliva, A., Poggio, T.: A feedforward architecture accounts for rapid categorization. Proc. Nat. Acad. Sci. 104(15), 6424–6429 (2007). http://www.pnas.org/content/104/15/6424
Serre, T., Riesenhuber, M.: Realistic modeling of simple and complex cell tuning in the HMAX model, and implications for invariant object recognition in cortex (2004)
Serre, T., Wolf, L., Bileschi, S., Riesenhuber, M., Poggio, T.: Robust object recognition with cortex-like mechanisms. IEEE Trans. Pattern Anal. Mach. Intell. 29(3), 411–426 (2007). https://doi.org/10.1109/TPAMI.2007.56
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014). http://jmlr.org/papers/v15/srivastava14a.html
Theriault, C., Thome, N., Cord, M.: HMAX-S: deep scale representation for biologically inspired image categorization. In: 2011 18th IEEE International Conference on Image Processing, pp. 1261–1264 (2011). https://doi.org/10.1109/ICIP.2011.6115663
Theriault, C., Thome, N., Cord, M.: Extended coding and pooling in the HMAX model. IEEE Trans. Image Process. 22(2), 764–777 (2013). https://doi.org/10.1109/TIP.2012.2222900
Xu, Y., Xiao, T., Zhang, J., Yang, K., Zhang, Z.: Scale-invariant convolutional neural networks. CoRR abs/1411.6369 (2014). http://arxiv.org/abs/1411.6369
Yu, K., Lin, Y., Lafferty, J.: Learning image representations from the pixel level via hierarchical sparse coding. In: 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1713–1720, June 2011. https://doi.org/10.1109/CVPR.2011.5995732
Zeiler, M.D., Taylor, G.W., Fergus, R.: Adaptive deconvolutional networks for mid and high level feature learning. In: Proceedings of the 2011 International Conference on Computer Vision, ICCV 2011, pp. 2018–2025. IEEE Computer Society, Washington, DC (2011). https://doi.org/10.1109/ICCV.2011.6126474
© 2019 Springer Nature Switzerland AG
Cite this paper
Nath, R., Manjunathaiah, M. (2019). Sparse Feature Extraction Model with Independent Subspace Analysis. In: Nicosia, G., Pardalos, P., Giuffrida, G., Umeton, R., Sciacca, V. (eds) Machine Learning, Optimization, and Data Science. LOD 2018. Lecture Notes in Computer Science(), vol 11331. Springer, Cham. https://doi.org/10.1007/978-3-030-13709-0_42
Print ISBN: 978-3-030-13708-3
Online ISBN: 978-3-030-13709-0