A Survey of Landmark Recognition Using the Bag-of-Words Framework

Bhattacharya, Priyadarshi; Gavrilova, Marina

doi:10.1007/978-3-642-31745-3_13

Part of the book series: Studies in Computational Intelligence (SCI, volume 441)

Abstract

Recent years have seen an exponential increase in the use of mobile devices. Because many of these devices are equipped with a camera and connected to the internet, localization in urban environments using landmark images is gaining popularity. The idea is simple: a tourist photographs a nearby landmark with a mobile camera, and the image or images are transmitted to a server, where they are matched against a database of landmark images for that locality. If a match is found, relevant information is sent back, such as background on the landmark, nearby transit facilities, or other important landmarks in the vicinity. This type of application has tremendous potential as a mobile city guide or navigation aid. In this paper, we investigate the use of local invariant shape features and global features such as colour and texture for the recognition task, as reported in the literature, and present various retrieval techniques. A variety of descriptors for landmark recognition and scene classification are discussed. We also provide insights into vocabulary building and weighting schemes for representing landmark images, which can help boost recognition rates.
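
To make the matching step in the abstract concrete, the following is a minimal sketch of bag-of-words retrieval, not the chapter's own method. It assumes a visual vocabulary already exists (here four hypothetical 2-D centroids; real systems typically cluster high-dimensional local descriptors), quantizes each descriptor to its nearest visual word, applies tf-idf weighting as one example of a weighting scheme, and ranks database landmarks by cosine similarity. All names and toy values (vocab, database, query, rank_landmarks) are illustrative assumptions.

```python
import math
from collections import Counter


def nearest_word(desc, vocab):
    """Index of the visual word (centroid) closest to one descriptor."""
    return min(range(len(vocab)),
               key=lambda k: sum((d - c) ** 2 for d, c in zip(desc, vocab[k])))


def bow_histogram(descriptors, vocab):
    """Count how many descriptors of one image map to each visual word."""
    counts = Counter(nearest_word(d, vocab) for d in descriptors)
    return [counts.get(k, 0) for k in range(len(vocab))]


def idf_weights(histograms):
    """Per-word inverse document frequency: log(N / images containing the word)."""
    n = len(histograms)
    weights = []
    for k in range(len(histograms[0])):
        df = sum(1 for h in histograms if h[k] > 0)
        weights.append(math.log(n / df) if df else 0.0)
    return weights


def tfidf(hist, idf):
    """Normalised term frequency times idf for one histogram."""
    total = sum(hist) or 1
    return [(c / total) * w for c, w in zip(hist, idf)]


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def rank_landmarks(query_desc, database, vocab):
    """Return (score, name) pairs for all database landmarks, best match first."""
    names = list(database)
    hists = [bow_histogram(database[n], vocab) for n in names]
    idf = idf_weights(hists)
    q = tfidf(bow_histogram(query_desc, vocab), idf)
    scores = [(cosine(q, tfidf(h, idf)), n) for h, n in zip(hists, names)]
    return sorted(scores, reverse=True)


# Hypothetical toy data: 2-D "descriptors" stand in for real local features.
vocab = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
database = {
    "tower": [(0.5, 0.2), (9.8, 0.1), (10.2, -0.3)],
    "bridge": [(0.1, -0.4), (0.3, 9.7), (-0.2, 10.4)],
    "museum": [(-0.3, 0.1), (9.9, 10.2), (10.1, 9.6)],
}
query = [(0.2, 0.1), (0.4, 9.9)]

ranking = rank_landmarks(query, database, vocab)
print(ranking[0][1])
```

In this toy setup the first visual word occurs in every database image, so its idf weight is zero and it contributes nothing to the score; only the words that distinguish one landmark from another drive the ranking. Down-weighting such uninformative words is the purpose of a weighting scheme of this kind.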

Author information

Authors and Affiliations

Department of Computer Science, University of Calgary, 2500 University Dr. NW, Calgary, Alberta, Canada, T2N 1N4
Priyadarshi Bhattacharya & Marina Gavrilova

Corresponding author

Correspondence to Priyadarshi Bhattacharya.

Editor information

Editors and Affiliations

Laboratoire MSI, Université de Limoges, rue d'Isle 83, Limoges, 87000, France
Dimitri Plemenos
Technological Education Institution of Athens, Dept. Computer Science, Ag. Spyridonos Str., Athens, 122 10, Greece
Georgios Miaoulis

About this chapter

Cite this chapter

Bhattacharya, P., Gavrilova, M. (2013). A Survey of Landmark Recognition Using the Bag-of-Words Framework. In: Plemenos, D., Miaoulis, G. (eds) Intelligent Computer Graphics 2012. Studies in Computational Intelligence, vol 441. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31745-3_13

DOI: https://doi.org/10.1007/978-3-642-31745-3_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31744-6
Online ISBN: 978-3-642-31745-3
eBook Packages: Engineering, Engineering (R0)