The performance of many clustering algorithms, such as k-means, depends strongly on the dissimilarity used to evaluate sample proximities. Choosing a good dissimilarity is difficult because each dissimilarity reflects different features of the data. Therefore, different dissimilarities should be integrated in order to reflect more accurately what is similar for the user and the problem at hand.

In many applications, user feedback or a priori knowledge about the problem provides pairs of similar and dissimilar examples. This side information may be used to learn a distance metric and to improve the clustering results. In this paper, we address the problem of learning a linear combination of dissimilarities using side information in the form of equivalence constraints. The minimization of the error function is based on a quadratic optimization algorithm. A smoothing term is included that penalizes the complexity of the family of distances and avoids overfitting.
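To make the idea of a learned linear combination concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes a squared-error objective with target 0 for similar pairs and 1 for dissimilar pairs, a ridge penalty `lam` standing in for the smoothing term, non-negative weights, and a simple projected gradient solver in place of a dedicated quadratic optimizer. The names `combine` and `learn_weights` and the toy matrices are hypothetical.

```python
def combine(dissims, weights, i, j):
    """Weighted sum of the base dissimilarities for the pair (i, j)."""
    return sum(w * d[i][j] for w, d in zip(weights, dissims))


def learn_weights(dissims, similar, dissimilar, lam=0.1, steps=2000):
    """Fit non-negative combination weights by projected gradient descent.

    Assumed objective (illustrative only):
        sum over constrained pairs of (combined - target)^2 + lam * ||w||^2
    with target 0 for similar pairs and 1 for dissimilar pairs.
    """
    pairs = [(i, j, 0.0) for i, j in similar]
    pairs += [(i, j, 1.0) for i, j in dissimilar]
    m = len(dissims)
    # One row per constrained pair, one column per base dissimilarity.
    rows = [[d[i][j] for d in dissims] for i, j, _ in pairs]
    targets = [t for _, _, t in pairs]
    # Step size from an upper bound on the gradient's Lipschitz constant.
    lipschitz = 2.0 * (sum(x * x for row in rows for x in row) + lam)
    step = 1.0 / lipschitz
    weights = [1.0 / m] * m
    for _ in range(steps):
        grad = [2.0 * lam * w for w in weights]
        for row, t in zip(rows, targets):
            residual = sum(w * x for w, x in zip(weights, row)) - t
            for k in range(m):
                grad[k] += 2.0 * residual * row[k]
        # Gradient step, then projection onto the non-negative orthant.
        weights = [max(0.0, w - step * g) for w, g in zip(weights, grad)]
    return weights


# Toy data: four samples, true groups {0, 1} and {2, 3}.
d_good = [[0.0, 0.1, 1.0, 1.0],
          [0.1, 0.0, 1.0, 1.0],
          [1.0, 1.0, 0.0, 0.1],
          [1.0, 1.0, 0.1, 0.0]]  # agrees with the groups
d_bad = [[0.0, 1.0, 0.1, 1.0],
         [1.0, 0.0, 1.0, 0.1],
         [0.1, 1.0, 0.0, 1.0],
         [1.0, 0.1, 1.0, 0.0]]   # contradicts the groups
dissims = [d_good, d_bad]
similar = [(0, 1), (2, 3)]       # equivalence constraints: same cluster
dissimilar = [(0, 2), (1, 3)]    # equivalence constraints: different clusters

weights = learn_weights(dissims, similar, dissimilar)
print("learned weights:", weights)
print("combined d(0,1):", combine(dissims, weights, 0, 1))
print("combined d(0,2):", combine(dissims, weights, 0, 2))
```

In this toy setting the weight of the dissimilarity that contradicts the constraints is driven to zero, so the combined dissimilarity could then be passed to a dissimilarity-based clustering step. Larger values of `lam` shrink all weights and act as a crude guard against overfitting a small set of constraints.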

The experimental results suggest that the proposed method outperforms a standard metric learning algorithm and improves on classical k-means clustering based on a single dissimilarity.





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Manuel Martín-Merino (1)
  1. Universidad Pontificia de Salamanca, Salamanca, Spain
