CNN-Based Non-contact Detection of Food Level in Bottles from RGB Images

  • Yijun Jiang
  • Elim Schenck
  • Spencer Kranz
  • Sean Banerjee
  • Natasha Kholgade Banerjee (email author)
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11295)


In this paper, we present an approach that detects the level of food in store-bought containers using deep convolutional neural networks (CNNs) trained on RGB images captured with an off-the-shelf camera. Our approach addresses three challenges: the diversity in container geometry, the large variations in the shapes and appearances of labels on store-bought containers, and the variability in the color of container contents. To address them, we augment the CNN training data by attaching printed labels with synthetic textures to the training bottles, interchanging the contents of the training bottles, and randomly altering the intensities of blocks of pixels in the labels and at the bottle borders. Our approach achieves an average level detection accuracy of 92.4% under leave-one-out cross-validation on 10 store-bought bottles of varying geometries, label appearances, label shapes, and content colors.
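As a rough illustration of two ideas named in the abstract, the sketch below perturbs the intensities of random pixel blocks inside a chosen image region and enumerates leave-one-out splits over bottles. It is not the authors' implementation. The function names, the block size, the shift range, the per-block probability, the use of one uniform shift per block, and the reading of leave-one-out as holding out one whole bottle at a time are all assumptions made for this example.

```python
import random


def perturb_blocks(image, region, block=4, max_shift=40, prob=0.5, rng=None):
    """Return a copy of `image` with random blocks in `region` brightened or darkened.

    image  : list of rows; each row is a list of (r, g, b) tuples in 0..255
    region : (top, left, bottom, right), with bottom and right exclusive
    All parameter defaults are hypothetical, not values from the paper.
    """
    if rng is None:
        rng = random.Random()
    out = [list(row) for row in image]  # copy rows so the input stays untouched
    top, left, bottom, right = region
    for y0 in range(top, bottom, block):
        for x0 in range(left, right, block):
            if rng.random() >= prob:
                continue  # leave this block unchanged
            shift = rng.randint(-max_shift, max_shift)  # one shift per block (assumed)
            for y in range(y0, min(y0 + block, bottom)):
                for x in range(x0, min(x0 + block, right)):
                    # clamp each channel to the valid 0..255 range
                    out[y][x] = tuple(min(255, max(0, c + shift)) for c in out[y][x])
    return out


def leave_one_bottle_out(bottle_ids):
    """Yield (held_out, training_ids) pairs, holding out each bottle exactly once."""
    unique = sorted(set(bottle_ids))
    for held_out in unique:
        yield held_out, [b for b in unique if b != held_out]


if __name__ == "__main__":
    gray = [[(128, 128, 128)] * 8 for _ in range(8)]
    # Perturb only the left half, standing in for a label region.
    augmented = perturb_blocks(gray, region=(0, 0, 8, 4), rng=random.Random(0))
    print(augmented[0][:5])
    for held_out, training in leave_one_bottle_out(range(10)):
        print(held_out, training)
```

In a real pipeline the region would come from a label or bottle-border mask, and each held-out bottle would be evaluated with a model trained only on images of the other nine.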


Keywords: Food · Level detection · Deep convolutional neural networks · Training set augmentation



This work was partially supported by the National Science Foundation (NSF) grant #1730183.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Yijun Jiang (1)
  • Elim Schenck (1)
  • Spencer Kranz (1)
  • Sean Banerjee (1)
  • Natasha Kholgade Banerjee (1), email author

  1. Clarkson University, Potsdam, USA
