Abstract
Pattern Recognition algorithms depend strongly on the dissimilarity measure used to evaluate sample proximities. In real applications, several dissimilarities are often available, arising from different object representations or data sources. Each dissimilarity usually provides complementary information about the problem; therefore, they should be integrated to reflect the object proximities accurately.
In many applications, user feedback or a priori knowledge about the problem provides pairs of similar and dissimilar examples. In this paper, we address the problem of learning a linear combination of dissimilarities using side information in the form of equivalence constraints. The error function is minimized with a quadratic optimization algorithm, and a smoothing term is included that penalizes the complexity of the family of distances and avoids overfitting.
The experimental results suggest that the proposed method outperforms a standard metric learning algorithm and improves classification and clustering results obtained with a single dissimilarity or data source.
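The idea of the abstract can be sketched as follows. This is a minimal illustration, not the authors' exact algorithm: it fits non-negative weights for a linear combination of dissimilarity matrices by regularized least squares, using equivalence constraints as targets (small combined distance for similar pairs, large for dissimilar pairs) and a ridge term playing the role of the smoothing penalty. The function name and the choice of targets are assumptions made for this sketch.

```python
import numpy as np

def learn_dissimilarity_weights(D_list, similar, dissimilar, lam=0.1):
    """Learn non-negative weights w for the combined dissimilarity
    D(i, j) = sum_k w_k * D_k(i, j) from equivalence constraints.

    Illustrative sketch: similar pairs get target 0, dissimilar pairs
    target 1, and lam * ||w||^2 is a smoothing term that penalizes the
    complexity of the family of distances. The ridge problem is solved
    in closed form; weights are then clipped to the non-negative orthant.
    """
    pairs = similar + dissimilar
    # One row per constrained pair, one column per candidate dissimilarity.
    X = np.array([[D[i, j] for D in D_list] for (i, j) in pairs])
    y = np.array([0.0] * len(similar) + [1.0] * len(dissimilar))
    K = X.shape[1]
    # Regularized quadratic problem: min_w ||X w - y||^2 + lam ||w||^2.
    w = np.linalg.solve(X.T @ X + lam * np.eye(K), X.T @ y)
    return np.clip(w, 0.0, None)
```

As a toy usage: with one informative dissimilarity (absolute differences of well-separated points) and one uninformative one (constant off-diagonal), the learned combination puts its weight on the informative matrix, so constrained similar pairs end up closer than dissimilar pairs under the combined distance.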
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Martín-Merino, M. (2011). Fusing Heterogeneous Data Sources Considering a Set of Equivalence Constraints. In: Cabestany, J., Rojas, I., Joya, G. (eds) Advances in Computational Intelligence. IWANN 2011. Lecture Notes in Computer Science, vol 6691. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21501-8_12
Print ISBN: 978-3-642-21500-1
Online ISBN: 978-3-642-21501-8