Improving Image Categorization by Using Multiple Instance Learning with Spatial Relation

  • Thanh Duc Ngo
  • Duy-Dinh Le
  • Shin’ichi Satoh
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6978)

Abstract

Image categorization is challenging when a label is provided only for the entire training image rather than for the object region. To eliminate labeling ambiguity, image categorization and object localization should be performed simultaneously. Discriminative Multiple Instance Learning (MIL) can be used for this task by regarding each image as a bag and the sub-windows in the image as instances. Learning a discriminative MI classifier requires an iterative solution: in each round, the positive sub-windows for the next round must be selected. With standard approaches, selecting only one positive sub-window per positive bag may limit the search space for the global optimum, while selecting all temporarily positive sub-windows may add noise to learning. We select a subset of sub-windows per positive bag to avoid these limitations, using spatial relations between sub-windows as clues for selection. Experimental results demonstrate that our approach outperforms previous discriminative MIL approaches and standard categorization approaches.
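
The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which spatial relation or selection rule is used, so everything below is an assumption made for illustration only: the box layout `(x1, y1, x2, y2)`, the use of intersection-over-union as the spatial relation, the `min_overlap` threshold, and the function names `iou` and `select_positive_subwindows` are all hypothetical. The sketch keeps the top-scoring sub-window of a positive bag together with the positively scored sub-windows that overlap it, which sits between the two standard extremes of keeping one sub-window and keeping all of them.

```python
from typing import List, Sequence, Tuple

# Hypothetical box layout (an assumption, not from the paper): (x1, y1, x2, y2)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0  # boxes do not overlap
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def select_positive_subwindows(
    boxes: Sequence[Box],
    scores: Sequence[float],
    min_overlap: float = 0.5,
) -> List[int]:
    """Return indices of sub-windows kept as positives for the next round.

    Assumed rule (illustrative, not the authors' method): keep the
    top-scoring sub-window, plus every sub-window with a positive
    classifier score that overlaps it by at least `min_overlap`.
    """
    best = max(range(len(scores)), key=scores.__getitem__)
    selected = [best]
    for i, (box, score) in enumerate(zip(boxes, scores)):
        if i != best and score > 0 and iou(box, boxes[best]) >= min_overlap:
            selected.append(i)
    return sorted(selected)


# One positive bag: four sub-windows with current classifier scores.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 10, 9)]
scores = [2.0, 0.5, 0.8, -0.1]
print(select_positive_subwindows(boxes, scores))
```

In this toy bag, the distant sub-window is dropped despite its positive score, and the overlapping sub-window with a negative score is dropped as well. In an iterative scheme, the kept sub-windows would serve as positive instances when the classifier is retrained in the next round.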

Keywords

  • Image Categorization
  • Multiple Instance Learning
  • Spatial Relation

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Thanh Duc Ngo (1, 2)
  • Duy-Dinh Le (2, 1)
  • Shin’ichi Satoh (2, 1)

  1. The Graduate University for Advanced Studies (Sokendai), Japan
  2. National Institute of Informatics, Tokyo, Japan
