Efficient Object Localization with Variation-Normalized Gaussianized Vectors

Zhuang, Xiaodan; Zhou, Xi; Hasegawa-Johnson, Mark A.; Huang, Thomas S.

doi:10.1007/978-3-642-17554-1_5

Part of the book series: Studies in Computational Intelligence (SCI, volume 332)

Abstract

Effective object localization relies on an efficient and effective search method, together with a robust image representation and learning method. Recently, the Gaussianized vector representation has been shown to be effective in several computer vision applications, such as facial age estimation, image scene categorization, and video event recognition. However, all of these tasks are classification or regression problems defined on whole images. How this representation can be applied efficiently to object localization, which reveals the locations and sizes of objects, has not yet been explored. In this work, we present an efficient object localization approach for the Gaussianized vector representation, following the branch-and-bound search scheme introduced by Lampert et al. [5]. In particular, we design a quality bound for rectangle sets characterized by the Gaussianized vector representation, enabling fast hierarchical search. This bound can be obtained for any rectangle set in the image at little extra computational cost beyond calculating the Gaussianized vector representation for the whole image. Further, we propose incorporating a normalization approach that suppresses the variation within the object class and the background class. Experiments on a multi-scale car dataset show that the proposed object localization approach based on the Gaussianized vector representation outperforms previous work using the histogram-of-keywords representation. The within-class variation normalization approach further boosts the performance. This chapter is an extended version of our paper at the 1st International Workshop on Interactive Multimedia for Consumer Electronics at ACM Multimedia 2009 [16].
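
The branch-and-bound scheme of Lampert et al. [5] that the abstract builds on can be illustrated with a short sketch. This is not the chapter's method: it assumes the quality of a rectangle is simply the sum of per-pixel scores in a hypothetical `score_map` (as in the linear histogram-of-keywords case), and it uses the generic bound "positive scores in the largest rectangle plus negative scores in the smallest rectangle". The bound that the chapter derives for Gaussianized vectors is not stated in the abstract and is not reproduced here. The names `integral`, `box_sum`, and `ess` are illustrative only.

```python
import heapq
import numpy as np

def integral(img):
    # Zero-padded 2-D cumulative sum, so box sums need no edge cases.
    return np.pad(img, ((1, 0), (1, 0))).cumsum(0).cumsum(1)

def box_sum(ii, t, b, l, r):
    # Sum over rows t..b and columns l..r (inclusive); empty boxes give 0.
    if t > b or l > r:
        return 0.0
    return ii[b + 1, r + 1] - ii[t, r + 1] - ii[b + 1, l] + ii[t, l]

def ess(score_map):
    """Best-first branch-and-bound over rectangle sets (assumed additive score).

    A rectangle set is four inclusive index ranges: top, bottom, left, right.
    Returns (best_score, (top, bottom, left, right)).
    """
    h, w = score_map.shape
    pos = integral(np.maximum(score_map, 0.0))  # positive contributions only
    neg = integral(np.minimum(score_map, 0.0))  # negative contributions only

    def bound(s):
        (t0, t1), (b0, b1), (l0, l1), (r0, r1) = s
        # Upper bound: positives inside the largest rectangle of the set,
        # plus negatives inside the smallest rectangle of the set.
        return box_sum(pos, t0, b1, l0, r1) + box_sum(neg, t1, b0, l1, r0)

    start = ((0, h - 1), (0, h - 1), (0, w - 1), (0, w - 1))
    heap = [(-bound(start), start)]  # max-heap via negated bounds
    while heap:
        neg_q, s = heapq.heappop(heap)
        widths = [hi - lo for lo, hi in s]
        k = int(np.argmax(widths))
        if widths[k] == 0:
            # The set holds a single rectangle, so its bound is its exact score.
            (t, _), (b, _), (l, _), (r, _) = s
            return float(-neg_q), (t, b, l, r)
        lo, hi = s[k]
        mid = (lo + hi) // 2
        # Split the widest coordinate range in half and bound both children.
        for part in ((lo, mid), (mid + 1, hi)):
            child = s[:k] + (part,) + s[k + 1:]
            heapq.heappush(heap, (-bound(child), child))
```

Because the bound equals the true score once a set shrinks to one rectangle, the first single rectangle popped from the queue is a global maximizer under these assumptions. If every score is negative, the sketch may return an empty rectangle (top greater than bottom) with score 0.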

References

1. Agarwal, S., Awan, A., Roth, D.: Learning to detect objects in images via a sparse, part-based representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 26(11), 1475–1490 (2004)
2. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR, pp. 886–893 (2005)
3. Fei-Fei, L., Perona, P.: A Bayesian hierarchical model for learning natural scene categories. In: CVPR (2005)
4. Hatch, A., Stolcke, A.: Generalized linear kernels for one-versus-all classification: application to speaker recognition. In: ICASSP, vol. V, pp. 585–588 (2006)
5. Lampert, C., Blaschko, M., Hofmann, T.: Beyond sliding windows: Object localization by efficient subwindow search. In: Proc. of CVPR (2008)
6. Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: CVPR (2006)
7. Permuter, H., Francos, J., Jermyn, I.: Gaussian mixture models of texture and colour for image database retrieval. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2003), vol. 3, pp. III-569-72 (April 2003)
8. Reynolds, D.A., Quatieri, T.F., Dunn, R.B.: Speaker verification using adapted Gaussian mixture models. Digital Signal Processing 10, 19–41 (2000)
9. Rowley, H.A., Baluja, S., Kanade, T.: Human face detection in visual scenes. In: NIPS 8, pp. 875–881 (1996)
10. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: Proc. of CVPR (2001)
11. Yan, S., Zhou, X., Liu, M., Hasegawa-Johnson, M., Huang, T.S.: Regression from patch-kernel. In: CVPR (2008)
12. Zhou, X., Zhuang, X., Tang, H., Hasegawa-Johnson, M., Huang, T.S.: A Novel Gaussianized Vector Representation for Natural Scene Categorization. In: ICPR (2008)
13. Zhou, X., Zhuang, X., Yan, S., Chang, S., Hasegawa-Johnson, M., Huang, T.S.: SIFT-Bag Kernel for Video Event Analysis. In: ACM Multimedia (2008)
14. Zhu, Q., Avidan, S., Yeh, M.-c., Cheng, K.-t.: Fast human detection using a cascade of histograms of oriented gradients. In: CVPR, pp. 1491–1498 (2006)
15. Zhuang, X., Zhou, X., Hasegawa-Johnson, M., Huang, T.S.: Face Age Estimation Using Patch-based Hidden Markov Model Supervectors. In: ICPR (2008)
16. Zhuang, X., Zhou, X., Hasegawa-Johnson, M., Huang, T.S.: Efficient object localization with Gaussianized vector representation. In: IMCE: Proceedings of the 1st International Workshop on Interactive Multimedia for Consumer Electronics, pp. 89–96 (2009)

Author information

Authors and Affiliations

Beckman Inst., ECE Dept., UIUC, USA
Xiaodan Zhuang, Xi Zhou, Mark A. Hasegawa-Johnson & Thomas S. Huang

Editor information

Editors and Affiliations

School of Computing, University of Dundee, DD1 4HN, Dundee, Scotland, UK
Jianguo Zhang
Department of Electronic & Electrical Engineering, The University of Sheffield, S1 3JD, Sheffield, UK
Ling Shao
Microsoft Research Asia, 49 Zhichun Road, 100190, Beijing, P.R. China
Lei Zhang
Digital Imaging Research Centre, Faculty of Computing, Information Systems and Mathematics, Kingston University, Penrhyn Road, Kingston upon Thames, KT1 2EE, Surrey, UK
Graeme A. Jones

About this chapter

Cite this chapter

Zhuang, X., Zhou, X., Hasegawa-Johnson, M.A., Huang, T.S. (2011). Efficient Object Localization with Variation-Normalized Gaussianized Vectors. In: Zhang, J., Shao, L., Zhang, L., Jones, G.A. (eds) Intelligent Video Event Analysis and Understanding. Studies in Computational Intelligence, vol 332. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17554-1_5

DOI: https://doi.org/10.1007/978-3-642-17554-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17553-4
Online ISBN: 978-3-642-17554-1
eBook Packages: Engineering, Engineering (R0)
