Learning from #Barcelona Instagram Data What Locals and Tourists Post About Its Neighbourhoods

  • Raul Gomez
  • Lluis Gomez
  • Jaume Gibert
  • Dimosthenis Karatzas
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11134)


Mass tourism is becoming a serious problem for some cities, such as Barcelona, because it concentrates in a few neighborhoods. In this work we gather Instagram data related to Barcelona, consisting of image-caption pairs, and, using the text as a supervisory signal, we learn relations between images, words, and neighborhoods. Our goal is to learn which visual elements appear in photos when people post about each neighborhood. We treat the data separately by language and show that this separation can be extrapolated to a separate analysis of tourists and locals, and that tourism is reflected in social media at the neighborhood level. The presented pipeline makes it possible to analyze the differences between the images that tourists and locals associate with each neighborhood.

The proposed method, which can be extended to other cities or subjects, shows that Instagram data can be used to train multi-modal (image and text) machine learning models that are useful for analyzing publications about a city at the neighborhood level. We publish the collected dataset, InstaBarcelona, and the code used in the analysis.
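The language-separation step described above can be sketched as follows. This is a minimal, hypothetical illustration, not the authors' implementation: it assumes that captions in Spanish or Catalan come from locals and captions in other languages from tourists, and it stands in a naive stopword-overlap heuristic (`guess_language`, a name invented here) for a real language detector.

```python
# Hypothetical sketch of splitting (image_id, caption) pairs into
# "local" and "tourist" buckets by caption language, assuming
# Spanish/Catalan ~ locals and anything else ~ tourists.
# A tiny stopword-overlap heuristic replaces a real language detector.

STOPWORDS = {
    "es": {"el", "la", "de", "que", "y", "en", "un", "una", "con", "por"},
    "ca": {"el", "la", "de", "que", "i", "en", "un", "una", "amb", "per"},
    "en": {"the", "of", "and", "in", "a", "an", "with", "for", "is", "to"},
}

def guess_language(caption: str) -> str:
    """Return the language whose stopword set overlaps the caption most."""
    tokens = set(caption.lower().split())
    scores = {lang: len(tokens & sw) for lang, sw in STOPWORDS.items()}
    return max(scores, key=scores.get)

def split_by_origin(posts):
    """Split (image_id, caption) pairs into (locals, tourists) lists."""
    locals_, tourists = [], []
    for image_id, caption in posts:
        bucket = locals_ if guess_language(caption) in ("es", "ca") else tourists
        bucket.append((image_id, caption))
    return locals_, tourists
```

In the paper's setting, each bucket would then be used to train a separate image-text model, so that the visual elements tourists and locals associate with each neighborhood can be compared.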


Self-supervised learning · Webly supervised learning · Social media analysis · City tourism analysis



This work was supported by the Doctorats Industrials program from the Generalitat de Catalunya, the Spanish project TIN2017-89779-P, the H2020 Marie Skłodowska-Curie actions of the European Union, grant agreement No 712949 (TECNIOspring PLUS), and the Agency for Business Competitiveness of the Government of Catalonia (ACCIO).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Eurecat, Centre Tecnològic de Catalunya, Unitat de Tecnologies Audiovisuals, Barcelona, Spain
  2. Computer Vision Center, Universitat Autònoma de Barcelona, Barcelona, Spain
