MACNet: Multi-scale Atrous Convolution Networks for Food Places Classification in Egocentric Photo-Streams

  • Md. Mostafa Kamal Sarker
  • Hatem A. Rashwan
  • Estefania Talavera
  • Syeda Furruka Banu
  • Petia Radeva
  • Domenec Puig
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11133)


A first-person (wearable) camera continually captures unscripted interactions of the camera user with objects, people, and scenes, reflecting the user's personal and relational tendencies. One such tendency is how people interact with food events. Regulating food intake and its duration is of great importance in protecting against disease. Consequently, this work aims to develop a smart model that can determine how often a person visits food places during a day. The model is a deep end-to-end network for automatic food places recognition from egocentric photo-streams. In this paper, we apply multi-scale atrous convolution networks to extract the key features related to food places from the input images. The proposed model is evaluated on an in-house private dataset called “EgoFoodPlaces”. Experimental results show promising performance for food places classification in egocentric photo-streams.
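
The abstract names multi-scale atrous convolution without giving the architecture, so the following is only a minimal sketch of the underlying operation, not the MACNet model itself. It assumes a 1-D signal instead of an image, and the function names, the kernel, and the dilation rates (1, 2, 3) are hypothetical choices made for illustration. An atrous (dilated) convolution spaces the kernel taps `rate` samples apart, so the same small kernel covers a wider context as the rate grows.

```python
def atrous_conv1d(signal, kernel, rate):
    """'Valid' 1-D atrous (dilated) convolution.

    Kernel taps are spaced `rate` samples apart. As is usual in deep
    learning, the kernel is not flipped (this is cross-correlation).
    """
    span = (len(kernel) - 1) * rate  # distance between first and last tap
    return [
        sum(w * signal[i + k * rate] for k, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def multi_scale_features(signal, kernel, rates=(1, 2, 3)):
    """Apply one kernel at several dilation rates (rates are an assumption)."""
    return {rate: atrous_conv1d(signal, kernel, rate) for rate in rates}


# Toy input: a linear ramp and a simple difference kernel.
signal = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 0, -1]
features = multi_scale_features(signal, kernel)

for rate, response in features.items():
    receptive_field = (len(kernel) - 1) * rate + 1
    print(rate, receptive_field, response)
```

With a three-tap kernel, rates 1, 2, and 3 give receptive fields of 3, 5, and 7 samples with no extra weights, which is why responses at several rates can be combined to describe a scene at multiple scales.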


Deep learning · Food pattern classification · Egocentric photo-streams · Visual lifelogging



This research is funded by the program Marti Franques under the agreement between Universitat Rovira i Virgili and Fundació Catalunya La Pedrera. This work was partially funded by TIN2015-66951-C2, SGR 1742, ICREA Academia 2014, Marat TV3 (no. 20141510), and Nestore Horizon2020 SC1-PM-15-2017 (no. 769643).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. DEIM, Rovira i Virgili University, Tarragona, Spain
  2. ETSEQ, Rovira i Virgili University, Tarragona, Spain
  3. Department of Mathematics, University of Barcelona, Barcelona, Spain
