Multi-task multi-label multiple instance learning

Journal of Zhejiang University SCIENCE C

Abstract

Training reliable object classifiers for automatic object detection usually requires large amounts of labeled training images, which is costly because professionals must be hired to label large-scale image collections. As the number of object classes grows, obtaining enough labeled training images becomes an even more critical problem. There are three potential ways to reduce the image-labeling burden: (1) allowing people to provide object labels loosely at the image level rather than at the object level (e.g., loosely-tagged images that do not identify the exact object locations); (2) harnessing large-scale collaboratively-tagged images that are available on the Internet; and (3) developing new machine learning algorithms that can directly leverage large-scale collaboratively- or loosely-tagged images to train a large number of object classifiers more effectively. Based on these observations, this paper develops a multi-task multi-label multiple instance learning (MTML-MIL) algorithm that leverages both inter-object correlations and large-scale loosely-labeled images for object classifier training. By seamlessly integrating multi-task learning, multi-label learning, and multiple instance learning, the MTML-MIL algorithm can train a large number of inter-related object classifiers more accurately; an object network is constructed to determine the inter-related learning tasks directly in the feature space rather than in the label space. Experimental results show that the MTML-MIL algorithm achieves higher detection accuracy for automatic object detection.
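
The loosely-tagged setting in (1) is the one that multiple instance learning models: an image is treated as a bag of regions, and a label applies to the bag rather than to any single region. As a minimal sketch of the standard multiple-instance assumption (a bag is positive for a class if at least one of its instances is), and not of the MTML-MIL algorithm itself, the following Python fragment aggregates per-region scores into image-level multi-label predictions. The function name `bag_labels`, the scores, and the 0.5 threshold are illustrative assumptions that do not come from the paper.

```python
from typing import Dict, List


def bag_labels(region_scores: Dict[str, List[float]],
               threshold: float = 0.5) -> Dict[str, bool]:
    """Image-level labels from per-region scores under the standard MIL assumption.

    region_scores maps each object class to the scores of the image's regions
    (hypothetical classifier outputs in [0, 1]). A class is assigned to the
    image when its best-scoring region reaches the threshold; a class with no
    scored regions is treated as absent.
    """
    return {
        cls: bool(scores) and max(scores) >= threshold
        for cls, scores in region_scores.items()
    }


# One image segmented into three regions, scored for three classes
# (made-up numbers, for illustration only).
image = {
    "sky":   [0.9, 0.1, 0.2],
    "grass": [0.2, 0.7, 0.3],
    "car":   [0.1, 0.2, 0.3],
}
print(bag_labels(image))
```

In this example the image would be tagged with "sky" and "grass" but not "car", without any annotator marking which region contains which object.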

Author information

Corresponding author

Correspondence to Yi Shen.

About this article

Cite this article

Shen, Y., Fan, Jp. Multi-task multi-label multiple instance learning. J. Zhejiang Univ. - Sci. C 11, 860–871 (2010). https://doi.org/10.1631/jzus.C1001005

  • DOI: https://doi.org/10.1631/jzus.C1001005
