Multi-instance Classification

Herrera, Francisco; Ventura, Sebastián; Bello, Rafael; Cornelis, Chris; Zafra, Amelia; Sánchez-Tarragó, Dánel; Vluymans, Sarah

doi:10.1007/978-3-319-47759-6_3



Abstract

In the machine-learning community, the most widely used MIL paradigm is Multi-Instance Classification (MIC). Most contributions in MIL relate to this predictive task, and a considerable number of problems have been solved successfully. In Sects. 3.1 and 3.2, we introduce the MIC problem, give a formal definition, and describe the evaluation metrics. Section 3.3 recalls a general taxonomy and describes the main categories established within MIC. The methods in each category are studied in depth in later chapters. In Sects. 3.4 and 3.5, we discuss two specific design aspects of MIC algorithms. The former presents the different assumptions that can be used to relate the class labels of the instances within a bag to the class label of the bag itself. The latter describes the main distance metrics used to determine the similarity between bags. We conclude the chapter by listing common MIC case studies found in the literature in Sect. 3.6 and the relevant MIC software tools in Sect. 3.7.
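The two design aspects named in the abstract can be illustrated with a minimal sketch. It assumes the standard multi-instance assumption (a bag is positive if and only if at least one of its instances is positive) and uses the maximal Hausdorff distance as one example of a bag-level distance. These are common choices in the MIL literature and are not necessarily the definitions the chapter adopts. The function names and the toy bags are hypothetical.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def bag_label_standard(instance_labels):
    """Standard MI assumption: the bag is positive iff any instance is positive."""
    return any(instance_labels)


def directed_hausdorff(bag_a, bag_b):
    """Largest distance from an instance in bag_a to its nearest instance in bag_b."""
    return max(min(dist(a, b) for b in bag_b) for a in bag_a)


def hausdorff(bag_a, bag_b):
    """Maximal Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(bag_a, bag_b), directed_hausdorff(bag_b, bag_a))


# Toy bags (illustrative data only): each bag is a list of 2-D instances.
bag_x = [(0.0, 0.0), (1.0, 0.0)]
bag_y = [(0.0, 0.0), (4.0, 0.0)]

print(bag_label_standard([False, True, False]))  # one positive instance suffices
print(hausdorff(bag_x, bag_y))
```

Other assumptions (for example, threshold-based or count-based ones) and other bag distances (for example, minimal or average Hausdorff variants) would replace `any` and the outer `max` respectively; the sketch fixes one option of each only for concreteness.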



Author information

Authors and Affiliations

Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
Francisco Herrera
Department of Computer Science, University of Córdoba, Córdoba, Spain
Sebastián Ventura
Center of Information Studies, Central University “Marta Abreu” of Las Villas, Santa Clara, Villa Clara, Cuba
Rafael Bello
Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
Chris Cornelis & Sarah Vluymans
Department of Computer Science and Numerical Analysis, University of Córdoba, Córdoba, Spain
Amelia Zafra
Central University “Marta Abreu” of Las Villas, Santa Clara, Villa Clara, Cuba
Dánel Sánchez-Tarragó


Corresponding author

Correspondence to Francisco Herrera.


About this chapter

Cite this chapter

Herrera, F. et al. (2016). Multi-instance Classification. In: Multiple Instance Learning. Springer, Cham. https://doi.org/10.1007/978-3-319-47759-6_3


DOI: https://doi.org/10.1007/978-3-319-47759-6_3
Published: 09 November 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47758-9
Online ISBN: 978-3-319-47759-6
eBook Packages: Computer Science, Computer Science (R0)
