Calorie Counter: RGB-Depth Visual Estimation of Energy Expenditure at Home

  • Lili Tao
  • Tilo Burghardt
  • Majid Mirmehdi
  • Dima Damen
  • Ashley Cooper
  • Sion Hannuna
  • Massimo Camplani
  • Adeline Paiement
  • Ian Craddock
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10116)


We present a new framework for vision-based estimation of calorific expenditure from RGB-D data, the first that is validated on physical gas exchange measurements and applied to daily living scenarios. Deriving a person’s energy expenditure from sensors is an important tool in tracking physical activity levels for health and lifestyle monitoring. Most existing methods use metabolic lookup tables (METs) for a manual estimate or systems with inertial sensors which ultimately require users to wear devices. In contrast, the proposed pose-invariant and individual-independent vision framework allows for a remote estimation of calorific expenditure. We introduce, and evaluate our approach on, a new dataset called SPHERE-calorie, for which visual estimates can be compared against simultaneously obtained, indirect calorimetry measures based on gas exchange. We conclude from our experiments that the proposed vision pipeline is suitable for home monitoring in a controlled environment, with calorific expenditure estimates above accuracy levels of commonly used manual estimations via METs. With the dataset released, our work establishes a baseline for future research for this little-explored area of computer vision.
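The manual MET-based baseline mentioned above can be illustrated with a short sketch. By the standard compendium convention, 1 MET corresponds to roughly 1 kcal per kilogram of body weight per hour, so an estimate multiplies a tabulated MET value by weight and duration. This is a minimal illustration of that convention, not the authors' vision pipeline; the table entries and function names below are illustrative assumptions, not values from the paper.

```python
# Illustrative MET values (assumed for this sketch, not taken from the paper).
MET_TABLE = {
    "sitting": 1.3,
    "walking": 3.5,
    "vacuuming": 3.3,
}

def met_calories(activity: str, weight_kg: float, minutes: float) -> float:
    """Estimate kcal expended using the MET lookup convention:
    kcal = MET * body weight (kg) * duration (hours)."""
    met = MET_TABLE[activity]
    return met * weight_kg * (minutes / 60.0)

# 30 minutes of walking for a 70 kg person:
print(round(met_calories("walking", 70.0, 30.0), 1))  # 3.5 * 70 * 0.5 = 122.5
```

Because such a lookup depends only on a coarse activity label and ignores individual movement intensity, it motivates the per-frame visual estimation the paper proposes.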





This work was performed under the SPHERE IRC project funded by the UK Engineering and Physical Sciences Research Council (EPSRC), Grant EP/K031910/1.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

All authors: SPHERE, Faculty of Engineering, University of Bristol, Bristol, UK
