Perceptual Feature Extraction

Chang, Edward Y.

doi:10.1007/978-3-642-20429-6_2

Abstract

In this chapter,\(^\dagger\) we present a deep model-based and data-driven hybrid architecture (DMD) for feature extraction. First, we construct a deep learning pipeline that learns image features progressively, from simple to complex. We then combine this deep model-based pipeline with a data-driven pipeline, which extracts features from a large collection of unlabeled images. Sparse regularization is performed, in an unsupervised way, on the features extracted from both pipelines to obtain representative patches. Once these patches are obtained, a supervised learning algorithm is employed to conduct object prediction. We present how DMD works and explain, from the perspectives of both neuroscience and computational learning theory, why it works more effectively than traditional models.

\(^\dagger\)© ACM, 2010. This chapter is a minor revision of the author’s work with Zhiyu Wang and Dingyin Xia [1] published in VLS-MCMR’10. Permission to publish this chapter is granted under copyright license #2587600190581.
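
The flow described in the abstract can be illustrated with a toy sketch. This is not the chapter's implementation: every function name, the 2x2 "images", the two patches, and the threshold `lam` are hypothetical, soft-thresholding (the proximal step of an L1 penalty) stands in for the sparse regularization, and a nearest-centroid rule stands in for the supervised learner.

```python
import math

def model_based_features(image):
    # Simple stage: absolute horizontal differences act as crude edge detectors.
    edges = [[abs(row[j + 1] - row[j]) for j in range(len(row) - 1)] for row in image]
    # Complex stage: MAX pooling over each row gives some position tolerance.
    return [max(row) for row in edges]

def data_driven_features(image, patches):
    # Similarity of the image to each patch sampled from unlabeled images.
    flat = [p for row in image for p in row]
    return [math.exp(-sum((a - b) ** 2 for a, b in zip(flat, patch))) for patch in patches]

def soft_threshold(features, lam):
    # Shrinks every value toward zero and zeroes the small ones (assumed stand-in).
    return [math.copysign(max(abs(x) - lam, 0.0), x) for x in features]

def hybrid_features(image, patches, lam=0.1):
    # Concatenate both pipelines, then sparsify without using any labels.
    combined = model_based_features(image) + data_driven_features(image, patches)
    return soft_threshold(combined, lam)

def fit_centroids(vectors, labels):
    # Supervised step: one mean feature vector per class.
    sums, counts = {}, {}
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, vec):
    # Assign the class whose centroid is closest in squared Euclidean distance.
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(centroids[c], vec)))

# Toy data (hypothetical): a uniform image and an image with a vertical edge.
flat_img = [[0.0, 0.0], [0.0, 0.0]]
edge_img = [[0.0, 1.0], [0.0, 1.0]]
patches = [[0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 1.0]]

train = [hybrid_features(img, patches) for img in (flat_img, edge_img)]
centroids = fit_centroids(train, ["flat", "edge"])
query = hybrid_features([[0.1, 0.9], [0.0, 1.0]], patches)
print(predict(centroids, query))
```

The point of the sketch is only the ordering of the stages: two independent feature sources are merged first, sparsity is imposed without labels, and labels enter only at the final prediction step.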

Notes

1. Disclaimer: We do not claim these heuristic-based features to be novel. Other heuristic-based features [27] may also be useful. What we consider to be important is that these features can augment model-based features to improve diversity before a principled theory can be formulated by neuroscientists to model cortex feedback/feedforward recursive signals.

References

[1] Z. Wang, D. Xia, E.Y. Chang, A deep model-based and data-driven hybrid architecture for image annotation, in Proceedings of ACM International Workshop on Very-Large-Scale Multimedia Corpus, Mining and Retrieval, pp. 13–18, 2010
[2] D.H. Hubel, T.N. Wiesel, Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 195(1), 215–243 (1968)
[3] E. Miller, The prefrontal cortex and cognitive control. Nat. Rev. Neurosci. 1(1), 59–66 (2000)
[4] G. Potamianos, C. Neti, J. Luettin, I. Matthews, Audio–visual automatic speech recognition: an overview, in Issues in Visual and Audio–Visual Speech Processing (MIT Press, Cambridge, 2004)
[5] H. Lee, R. Grosse, R. Ranganath, A. Ng, Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations, in Proceedings of International Conference on Machine Learning (ICML), 2009
[6] T. Serre, Learning a dictionary of shape-components in visual cortex: comparison with neurons, humans and machines. Ph.D. Thesis, Massachusetts Institute of Technology, 2006
[7] M. Riesenhuber, T. Poggio, Are cortical models really bound by the binding problem? Neuron 24(1), 87–93 (1999)
[8] T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber, T. Poggio, Robust object recognition with cortex-like mechanisms. IEEE Trans. Pattern Anal. Mach. Intell. 29(3), 411–426 (2007)
[9] Y. Bengio, Learning Deep Architectures for AI (Now Publishers, 2009)
[10] G. Hinton, S. Osindero, Y. Teh, A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)
[11] Y. Bengio, Y. LeCun, Scaling learning algorithms towards AI, in Large-Scale Kernel Machines (MIT Press, Cambridge, 2007), pp. 321–360
[12] G. Loosli, S. Canu, L. Bottou, Training invariant support vector machines using selective sampling, in Large-Scale Kernel Machines (MIT Press, Cambridge, 2007), pp. 301–320
[13] M. Yasuda, T. Banno, H. Komatsu, Color selectivity of neurons in the posterior inferior temporal cortex of the macaque monkey. Cereb. Cortex 20(7), 1630–1646 (2009)
[14] K. Tsunoda, Y. Yamane, M. Nishizaki, M. Tanifuji, Complex objects are represented in macaque inferotemporal cortex by the combination of feature columns. Nat. Neurosci. 4, 832–838 (2001)
[15] I. Lampl, D. Ferster, T. Poggio, M. Riesenhuber, Intracellular measurements of spatial integration and the MAX operation in complex cells of the cat primary visual cortex. J. Neurophysiol. 92(5), 2704 (2004)
[16] T. Gawne, J. Martin, Responses of primate visual cortical V4 neurons to simultaneously presented stimuli. J. Neurophysiol. 88(3), 1128 (2002)
[17] D. Hubel, T. Wiesel, Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J. Physiol. 160(1), 106 (1962)
[18] M. Ranzato, F. Huang, Y. Boureau, Y. LeCun, Unsupervised learning of invariant feature hierarchies with applications to object recognition, in Proceedings of IEEE CVPR, 2007
[19] J. Jones, L. Palmer, An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. J. Neurophysiol. 58(6), 1233 (1987)
[20] T. Serre, M. Riesenhuber, Realistic modeling of simple and complex cell tuning in the HMAX model, and implications for invariant object recognition in cortex. MIT technical report, 2004
[21] C. Ekanadham, S. Reader, H. Lee, Sparse deep belief net models for visual area V2, in Proceedings of NIPS, 2008
[22] B. Olshausen, C. Anderson, D. Van Essen, A neurobiological model of visual attention and invariant pattern recognition based on dynamic routing of information. J. Neurosci. 13(11), 4700 (1993)
[23] D. Walther, T. Serre, T. Poggio, C. Koch, Modeling feature sharing between object detection and top-down attention. J. Vis. 5(8), 1041 (2005)
[24] S. Chikkerur, T. Serre, T. Poggio, A Bayesian inference theory of attention: neuroscience and algorithms. MIT technical report MIT-CSAIL-TR-2009-047, 2009
[25] E. Chang, B. Li, C. Li, Toward perception-based image retrieval. IEEE Content-Based Access of Image and Video Libraries, pp. 101–105, 2000
[26] S. Tong, E.Y. Chang, Support vector machine active learning for image retrieval, in Proceedings of ACM International Conference on Multimedia (ACM, New York, 2001), pp. 107–118
[27] R. Datta, D. Joshi, J. Li, J. Wang, Image retrieval: ideas, influences, and trends of the new age. ACM Comput. Surv. (CSUR) 40(2), 1–60 (2008)
[28] S. Coren, L.M. Ward, J.T. Enns, Sensation and Perception, 6th edn. (Wiley, New York, 2003)
[29] J. Leu, Computing a shape’s moments from its boundary. Pattern Recognit. 24(10), 949–957 (1991)
[30] H. Tamura, S. Mori, T. Yamawaki, Textural features corresponding to visual perception. IEEE Trans. Syst. Man Cybern. 8(6), 460–473 (1978)
[31] J. Smith, S. Chang, Automated image retrieval using color and texture. IEEE Trans. Pattern Anal. Mach. Intell. 1996
[32] P. Wu, B. Manjunath, S. Newsam, H. Shin, A texture descriptor for browsing and similarity retrieval. Sig. Process. Image Commun. 16(1–2), 33–43 (2000)
[33] W. Ma, H. Zhang, Benchmarking of image features for content-based retrieval, in Proceedings of Asilomar Conference on Signals Systems and Computers, pp. 253–260, 1998
[34] J. Deng, W. Dong, R. Socher, L. Li, K. Li, F.F. Li, Imagenet: a large-scale hierarchical image database, in Proceedings of IEEE CVPR, pp. 156–161, 2009
[35] A. Smeulders, M. Worring, S. Santini, A. Gupta, R. Jain, Content-based image retrieval at the end of the early years. IEEE Trans. Pattern Anal. Mach. Intell. 22(12), 1349–1380 (2000)
[36] Y. Ke, R. Sukthankar, PCA-SIFT: a more distinctive representation for local image descriptors, in Proceedings of IEEE CVPR, pp. 506–513, 2004
[37] O. Chapelle, P. Haffner, V. Vapnik, SVMs for histogram-based image classification. IEEE Trans. Neural Netw. 10(5), 1055 (1999)
[38] D. Blei, A. Ng, M. Jordan, Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
[39] T. Serre, L. Wolf, T. Poggio, Object recognition with features inspired by visual cortex, in Proceedings of IEEE CVPR, 2005
[40] C. Gross, Visual functions of inferotemporal cortex. Handbook of Sensory Physiology, vol. 7(3), 1973
[41] C. Gross, C. Rocha-Miranda, D. Bender, Visual properties of neurons in inferotemporal cortex of the macaque. J. Neurophysiol. 35(1), 96–111 (1972)
[42] R. Salakhutdinov, A. Mnih, G. Hinton, Restricted Boltzmann machines for collaborative filtering, in Proceedings of International Conference on Machine Learning (ICML), pp. 791–798, 2007

Author information

Authors and Affiliations

Google Inc., Mountain View, CA, 94306, USA
Edward Y. Chang

Corresponding author

Correspondence to Edward Y. Chang.

About this chapter

Cite this chapter

Chang, E.Y. (2011). Perceptual Feature Extraction. In: Foundations of Large-Scale Multimedia Information Management and Retrieval. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20429-6_2

DOI: https://doi.org/10.1007/978-3-642-20429-6_2
Published: 26 August 2011
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20428-9
Online ISBN: 978-3-642-20429-6
eBook Packages: Computer Science, Computer Science (R0)
