Soft Margin Trees

  • Jorge Díez
  • Juan José del Coz
  • Antonio Bahamonde
  • Oscar Luaces
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5781)


From a multi-class learning task it is possible to infer, in addition to a classifier, useful knowledge about the relationships between the classes involved. In this paper we propose a method to learn a hierarchical clustering of the set of classes. Such clusterings have been exploited in biomedical applications to uncover relationships between diseases or between populations of animals. The method proposed here defines a distance between classes based on the margin maximization principle, and then builds the hierarchy using a linkage procedure. To quantify the quality of the resulting hierarchies, we also define a measure. Finally, we present a set of experiments comparing the scores achieved by our approach with those of other methods.
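The second step of the method, building a hierarchy from pairwise class distances with a linkage procedure, can be illustrated with a minimal sketch. It assumes the class-to-class distances are already available (in the paper they come from margin maximization; the numbers below are made-up stand-ins, not results). The function name `linkage_hierarchy`, the dictionary layout, and the choice of complete linkage as the default are assumptions for illustration only, since the abstract does not say which linkage is used.

```python
from itertools import combinations


def linkage_hierarchy(classes, dist, link=max):
    """Agglomerative clustering over class labels (illustrative sketch).

    classes: list of hashable class labels
    dist:    dict mapping frozenset({a, b}) -> distance between classes a and b
    link:    max gives complete linkage, min gives single linkage
    Returns the merges as (cluster_a, cluster_b, height), in merge order.
    """
    clusters = [frozenset([c]) for c in classes]
    merges = []

    def cluster_distance(p, q):
        # Distance between two clusters, derived from class-level distances.
        return link(dist[frozenset((a, b))] for a in p for b in q)

    while len(clusters) > 1:
        # Pick the closest pair of current clusters and merge it.
        p, q = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        merges.append((p, q, cluster_distance(p, q)))
        clusters = [c for c in clusters if c not in (p, q)] + [p | q]
    return merges


# Hypothetical distances between four classes (stand-ins for margin-based values).
toy_dist = {
    frozenset("ab"): 1.0, frozenset("cd"): 1.5,
    frozenset("ac"): 4.0, frozenset("ad"): 5.0,
    frozenset("bc"): 3.0, frozenset("bd"): 6.0,
}
merges = linkage_hierarchy(list("abcd"), toy_dist)
for left, right, height in merges:
    print(sorted(left), sorted(right), height)
```

With these toy values, classes `a` and `b` are merged first, then `c` and `d`, and the two resulting clusters are joined last. Passing `link=min` switches to single linkage, which changes only the height of the final merge in this example.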


Support Vector Machine; Hierarchical Cluster; Agglomerative Hierarchical Cluster; Hierarchical Cluster Algorithm; Machine Learn Research
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Jorge Díez (1)
  • Juan José del Coz (1)
  • Antonio Bahamonde (1)
  • Oscar Luaces (1)
  1. Artificial Intelligence Center, University of Oviedo at Gijón, Asturias, Spain
