Customers’ Activity Recognition in Intelligent Retail Environments

  • Emanuele Frontoni
  • Paolo Raspa
  • Adriano Mancini
  • Primo Zingaretti
  • Valerio Placidi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8158)


This paper proposes an embedded intelligent system in which low-cost embedded vision systems analyze human behavior to provide interactivity and statistical data, mainly for customer behavior analysis. The project addresses the need for new in-shop services that involve consumers more directly and encourage them to increase their satisfaction and, as a consequence, their purchases. Technology is central to this goal: it turns interactions between customers and products, and between customers and the shop environment, into a rich source of marketing analysis.

We construct a novel system that uses a vertical RGBD sensor for people counting and shelf interaction analysis. The depth information is used to remove the effect of appearance variation and to evaluate customers’ activities inside the store and in front of the shelf, where they interact with products. Group interactions are also monitored and analyzed, with the main goal of gaining better knowledge of customers’ activities from real data in real time.
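
To illustrate why depth from a vertically mounted sensor sidesteps appearance variation, the sketch below counts people in a single top-view depth frame using geometry alone: pixels whose height above the floor exceeds a threshold are grouped into connected blobs, and sufficiently large blobs are counted. This is a minimal sketch under stated assumptions, not the method used in the paper. The function name `count_people`, the sensor mounting height, the height threshold, and the minimum blob area are all hypothetical values chosen for the example.

```python
from collections import deque


def count_people(depth_mm, sensor_height_mm=3000, min_height_mm=1200, min_area_px=4):
    """Count blobs taller than min_height_mm in a top-view depth frame.

    depth_mm is a 2D list of distances (in mm) from a ceiling-mounted sensor.
    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    rows, cols = len(depth_mm), len(depth_mm[0])
    # A pixel is "tall" if its height above the floor passes the threshold.
    # A depth of 0 is treated as a missing reading and ignored.
    tall = [[d > 0 and sensor_height_mm - d >= min_height_mm for d in row]
            for row in depth_mm]
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not tall[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over 4-connected tall pixels.
            area = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and tall[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Discard small blobs, which are more likely noise than a person.
            if area >= min_area_px:
                count += 1
    return count


# Synthetic frame: flat floor at 3000 mm, two head-sized blobs, one noisy pixel.
frame = [[3000] * 8 for _ in range(6)]
for y in (1, 2):
    for x in (1, 2):
        frame[y][x] = 1300
for y in (3, 4):
    for x in (5, 6):
        frame[y][x] = 1350
frame[0][7] = 1500  # isolated pixel, below the minimum area

people = count_people(frame)
print(people)
```

Because the decision depends only on measured height and blob size, clothing color, lighting, and floor texture do not enter the computation. A real deployment would additionally need tracking across frames and handling of people standing close together, which this sketch does not attempt.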

Although preliminary, the results are convincing. Above all, the general architecture is affordable for this specific application, robust, easy to install and maintain, and low cost.


RGBD, depth images, activity recognition, interactions, retail environments, people counting, shelf interaction map



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Emanuele Frontoni (1)
  • Paolo Raspa (1)
  • Adriano Mancini (1)
  • Primo Zingaretti (1)
  • Valerio Placidi (2)
  1. Dipartimento di Ingegneria dell’Informazione, Università Politecnica delle Marche, Ancona, Italy
  2. Grottini Lab, Porto Recanati, Italy
