Learning from #Barcelona Instagram Data What Locals and Tourists Post About Its Neighbourhoods
Mass tourism is becoming a serious problem for some cities, such as Barcelona, because it concentrates in a few neighborhoods. In this work we gather Instagram data related to Barcelona, consisting of image-caption pairs, and, using the text as a supervisory signal, learn relations between images, words and neighborhoods. Our goal is to learn which visual elements appear in photos when people post about each neighborhood. We treat the data separately by language and show that this separation can be extrapolated to a separate analysis of tourists and locals, and that tourism is reflected in social media at a neighborhood level. The presented pipeline makes it possible to analyze the differences between the images that tourists and locals associate with the different neighborhoods.
The proposed method, which can be extended to other cities or subjects, shows that Instagram data can be used to train multi-modal (image and text) machine learning models that are useful for analyzing publications about a city at a neighborhood level. We publish the collected dataset, InstaBarcelona, and the code used in the analysis.
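As a rough illustration of the language-separate treatment described above, the sketch below counts how often each neighborhood is mentioned in captions of each language. It is not the pipeline used in this work: the neighborhood list is a small hand-picked subset, the toy posts are invented, and the assumption that every post already carries a language label (the `lang` field) is hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical subset of neighborhood names (lower-cased, accents stripped).
NEIGHBORHOODS = ("barceloneta", "gracia", "raval")


def mentions_by_language(posts):
    """Count, per language, how many captions mention each neighborhood.

    `posts` is an iterable of (lang, caption) pairs; the language label is
    assumed to be known in advance.
    """
    counts = defaultdict(Counter)
    for lang, caption in posts:
        # Treat hashtags as plain words and match on whole tokens only.
        words = set(caption.lower().replace("#", " ").split())
        for name in NEIGHBORHOODS:
            if name in words:
                counts[lang][name] += 1
    return counts


def shares(counter):
    """Turn raw mention counts into fractions of all mentions."""
    total = sum(counter.values())
    return {name: n / total for name, n in counter.items()} if total else {}


# Invented toy posts, for illustration only.
POSTS = [
    ("en", "Sunset at #barceloneta beach #barcelona"),
    ("en", "Tapas in el raval"),
    ("es", "Fiestas de #gracia con amigos"),
    ("ca", "Passejant per gracia"),
    ("en", "#barceloneta again"),
]

if __name__ == "__main__":
    for lang, counter in sorted(mentions_by_language(POSTS).items()):
        print(lang, shares(counter))
```

Comparing the per-language shares is one simple way to see whether posts in different languages concentrate on different neighborhoods; learning which visual elements accompany those mentions would additionally require the image model that the abstract refers to.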
Keywords: Self-supervised learning, Webly supervised learning, Social media analysis, City tourism analysis
This work was supported by the Doctorats Industrials program from the Generalitat de Catalunya, the Spanish project TIN2017-89779-P, the H2020 Marie Skłodowska-Curie actions of the European Union, grant agreement No 712949 (TECNIOspring PLUS), and the Agency for Business Competitiveness of the Government of Catalonia (ACCIO).