Image Retrieval with Structured Object Queries Using Latent Ranking SVM

Lan, Tian; Yang, Weilong; Wang, Yang; Mori, Greg

doi:10.1007/978-3-642-33783-3_10

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7577)

Included in the following conference series:

European Conference on Computer Vision

Abstract

We consider image retrieval with structured object queries, that is, queries that specify the objects that should be present in the scene and their spatial relations. An example of such a query is “car on the road”. Existing image retrieval systems typically consider queries consisting of object classes (i.e. keywords). They train a separate classifier for each object class and combine the outputs heuristically. In contrast, we develop a learning framework that jointly considers object classes and their relations. Our method considers not only the objects in the query (“car” and “road” in the above example), but also related object categories that can be useful for retrieval. Since we do not have ground-truth labeling of object bounding boxes on the test images, we represent them as latent variables in our model. Our learning method is an extension of the ranking SVM with latent variables, which we call latent ranking SVM. We demonstrate image retrieval and ranking results on a dataset with more than a hundred object classes.
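
As a rough illustration of the idea in the abstract, the sketch below scores an image by maximizing a linear score over candidate latent configurations (for example, hypothesized object placements) and applies a pairwise ranking hinge loss. It is a minimal sketch under stated assumptions, not the authors' formulation: the feature vectors, the weight vector `w`, and the helper names `latent_score`, `pairwise_hinge_loss`, and `rank` are all hypothetical and do not come from the paper.

```python
# Hypothetical sketch of latent-variable ranking; names and data are
# illustrative assumptions, not taken from the paper.

def dot(w, phi):
    """Inner product of a weight vector and a feature vector."""
    return sum(wi * pi for wi, pi in zip(w, phi))

def latent_score(w, candidates):
    """Score an image as the best linear score over its latent configurations.

    `candidates` holds one feature vector per hypothesized configuration
    (e.g. one per candidate placement of the queried objects).
    """
    return max(dot(w, phi) for phi in candidates)

def pairwise_hinge_loss(w, relevant, irrelevant, margin=1.0):
    """Hinge loss asking a relevant image to outscore an irrelevant one by `margin`."""
    return max(0.0, margin - latent_score(w, relevant) + latent_score(w, irrelevant))

def rank(w, images):
    """Return image ids sorted by decreasing latent score."""
    return sorted(images, key=lambda name: latent_score(w, images[name]), reverse=True)

# Toy data: each image has one or more candidate configurations.
w = [1.0, 0.5]
images = {
    "a": [[0.25, 0.0], [1.0, 0.5]],  # best configuration scores 1.25
    "b": [[0.5, 0.5]],               # scores 0.75
    "c": [[0.0, 0.0], [0.5, 1.0]],   # best configuration scores 1.0
}
ranking = rank(w, images)
loss = pairwise_hinge_loss(w, images["a"], images["b"])
```

In this toy setup the maximization over configurations plays the role of the latent variables: no ground-truth bounding boxes are given, so each image is judged by its best-scoring hypothesis.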

Author information

Authors and Affiliations

Simon Fraser University, Canada
Tian Lan, Weilong Yang & Greg Mori
University of Manitoba, Canada
Yang Wang

Editor information

Editors and Affiliations

Microsoft Research Ltd., CB3 0FB, Cambridge, UK
Andrew Fitzgibbon
Dept. of Computer Science, University of North Carolina, 27599, Chapel Hill, NC, USA
Svetlana Lazebnik
California Institute of Technology, 91125, Pasadena, CA, USA
Pietro Perona
Institute of Industrial Science, The University of Tokyo, 153-8505, Tokyo, Japan
Yoichi Sato
INRIA, 38330, Montbonnot, France
Cordelia Schmid

About this paper

Cite this paper

Lan, T., Yang, W., Wang, Y., Mori, G. (2012). Image Retrieval with Structured Object Queries Using Latent Ranking SVM. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33783-3_10

DOI: https://doi.org/10.1007/978-3-642-33783-3_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33782-6
Online ISBN: 978-3-642-33783-3
eBook Packages: Computer Science, Computer Science (R0)
