Abstract
The rich user interaction capabilities of mobile devices can help improve the accuracy of mobile visual search systems. At query time, the mobile device camera can capture multiple views of an object from different viewing angles and at different scales. These views carry richer information about the object than a single view and can therefore yield more accurate results. Motivated by this, we propose a new multi-view visual query model on multi-view object image databases for mobile visual search. The mobile client processes the multi-view images of an object and sends the extracted local features to a server, which combines the query image representations using early or late fusion methods and returns the query results. We performed a comprehensive analysis of early and late fusion approaches using various similarity functions, on an existing single-view object image database and a new multi-view object image database. The experimental results show that multi-view search provides significantly better retrieval accuracy than traditional single-view search.
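The difference between early and late fusion of multi-view queries can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each image is already summarized as a fixed-length histogram (for example, a bag of visual words built from local features), uses histogram intersection as one possible similarity function, averages the query histograms for early fusion, and averages per-view scores for late fusion. All function names and the toy data are hypothetical.

```python
from typing import Dict, List

Histogram = List[float]


def normalize(h: Histogram) -> Histogram:
    """Scale a histogram so that its bins sum to 1 (L1 normalization)."""
    total = sum(h)
    return [x / total for x in h] if total > 0 else list(h)


def histogram_intersection(a: Histogram, b: Histogram) -> float:
    """Similarity in [0, 1] for two L1-normalized histograms."""
    return sum(min(x, y) for x, y in zip(a, b))


def early_fusion_scores(query_views: List[Histogram],
                        database: Dict[str, List[Histogram]]) -> Dict[str, float]:
    """Early fusion (assumed variant): merge all query views into one
    histogram first, then compare it with every database view."""
    n = len(query_views)
    fused = normalize([sum(bins) / n for bins in zip(*query_views)])
    return {
        obj: max(histogram_intersection(fused, normalize(v)) for v in views)
        for obj, views in database.items()
    }


def late_fusion_scores(query_views: List[Histogram],
                       database: Dict[str, List[Histogram]]) -> Dict[str, float]:
    """Late fusion (assumed variant): score each query view separately
    against its best-matching database view, then average the scores."""
    scores = {}
    for obj, views in database.items():
        per_view = [
            max(histogram_intersection(normalize(q), normalize(v)) for v in views)
            for q in query_views
        ]
        scores[obj] = sum(per_view) / len(per_view)
    return scores


def rank(scores: Dict[str, float]) -> List[str]:
    """Return object identifiers sorted from most to least similar."""
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy multi-view database: object id -> one histogram per stored view.
    db = {"mug": [[4, 0, 1, 0], [0, 4, 1, 0]], "shoe": [[0, 0, 1, 4]]}
    # Two query views of the same object, taken from different angles.
    query = [[3, 0, 1, 0], [0, 3, 1, 0]]
    print("early:", rank(early_fusion_scores(query, db)))
    print("late: ", rank(late_fusion_scores(query, db)))
```

In this toy setting both strategies rank the same object first, but late fusion keeps each query view intact and can match it against the most similar stored view, whereas early fusion blends the views before any comparison takes place. Other combination rules (maximum or weighted sum of scores) and other similarity functions could be substituted without changing the overall structure.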
Acknowledgments
The first author was supported by The Scientific and Technological Research Council of Turkey (TÜBİTAK) under BİDEB 2228-A Graduate Scholarship.
Cite this article
Çalışır, F., Baştan, M., Ulusoy, Ö. et al. Mobile multi-view object image search. Multimed Tools Appl 76, 12433–12456 (2017). https://doi.org/10.1007/s11042-016-3659-9
DOI: https://doi.org/10.1007/s11042-016-3659-9