Food Category Recognition Using SURF and MSER Local Feature Representation

  • Mohd Norhisham Razali
  • Noridayu Manshor
  • Alfian Abdul Halin
  • Razali Yaakob
  • Norwati Mustapha
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10645)


Food object recognition has gained popularity in recent years, perhaps because of its potential applications in fields such as nutrition and fitness. Recognizing food in images is nevertheless challenging, since foods come in many shapes and sizes. Besides exhibiting unexpected deformations and textures, food images are captured under differing lighting conditions and camera viewpoints. From a computer vision perspective, training a supervised classifier on global image features might be unsuitable because of the complex nature of food images. Local features seem the better alternative, since they capture fine detail around interest points. In this paper, two local features, SURF (Speeded-Up Robust Features) and MSER (Maximally Stable Extremal Regions), are investigated for food object recognition. Both are computationally inexpensive and have been shown to be effective local descriptors for complex images. Each feature is first evaluated separately. The two are then fused to observe whether a combined representation better represents food images. Experimental evaluation using a Support Vector Machine classifier shows that feature fusion yields better recognition accuracy, at 86.6%.
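The abstract does not say how the two features are fused. As a minimal sketch, assuming a bag-of-features encoding (listed in the keywords) and fusion by simple concatenation of the two histograms, the representation step might look like the following. The function names, the toy codebooks, and the toy descriptors are all hypothetical; real SURF and MSER descriptors would come from a feature extractor and the codebooks from clustering training descriptors, neither of which is shown here.

```python
from math import dist

def bof_histogram(descriptors, codebook):
    """Encode local descriptors as an L1-normalised visual-word histogram.

    Each descriptor votes for its nearest codebook entry (Euclidean distance).
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        # No descriptors were detected in the image: return an all-zero histogram.
        return [0.0] * len(codebook)
    return [c / total for c in counts]

def fuse(hist_a, hist_b):
    """Assumed fusion rule: concatenate the two histograms into one vector."""
    return hist_a + hist_b

# Toy stand-ins (hypothetical): 2-D "SURF" descriptors with a 2-word codebook,
# 1-D "MSER" descriptors with a 3-word codebook.
surf_codebook = [(0.0, 0.0), (1.0, 1.0)]
mser_codebook = [(0.0,), (5.0,), (10.0,)]
surf_descriptors = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (0.2, 0.1)]
mser_descriptors = [(4.5,), (9.0,)]

fused = fuse(
    bof_histogram(surf_descriptors, surf_codebook),
    bof_histogram(mser_descriptors, mser_codebook),
)
print(fused)  # [0.5, 0.5, 0.0, 0.5, 0.5]
```

The fused vector would then be the input to a classifier such as the Support Vector Machine mentioned above. Normalising each histogram before concatenation keeps one feature from dominating merely because it produces more keypoints per image.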


Food category recognition, MSER, SURF, Bag of features



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Mohd Norhisham Razali (1, 2)
  • Noridayu Manshor (1), corresponding author
  • Alfian Abdul Halin (1)
  • Razali Yaakob (1)
  • Norwati Mustapha (1)
  1. Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Malaysia
  2. Faculty of Computing and Informatics, Universiti Malaysia Sabah, Kota Kinabalu, Malaysia
