Learning a Combination of Heterogeneous Dissimilarities from Incomplete Knowledge

Martín-Merino, Manuel

doi:10.1007/978-3-642-15825-4_7

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6354)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

The performance of many pattern recognition algorithms depends strongly on the dissimilarity used to evaluate sample proximities. Choosing a good dissimilarity is difficult because each one reflects different features of the data. Therefore, different dissimilarities and data sources should be integrated in order to reflect more accurately what is similar for the user and the problem at hand.

In many applications, user feedback or a priori knowledge about the problem provides pairs of similar and dissimilar examples. This side information may be used to learn a distance metric that reflects the sample proximities more accurately. In this paper, we address the problem of learning a linear combination of dissimilarities using side information in the form of equivalence constraints. The minimization of the error function is based on a quadratic optimization algorithm. A smoothing term is included that penalizes the complexity of the family of distances and avoids overfitting.

The experimental results suggest that the proposed method outperforms a standard metric learning algorithm and improves on classification and clustering results based on a single dissimilarity.
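
The idea of learning a linear combination of dissimilarities from similar and dissimilar pairs can be illustrated with a small sketch. The abstract does not give the error function or the solver, so everything below is an assumption made for illustration: the squared-error objective with targets 0 (similar) and 1 (dissimilar), the L2 penalty standing in for the smoothing term, the non-negativity of the weights, the projected gradient descent solver, and all names (`combined`, `learn_weights`, `D_GOOD`, `D_BAD`). It is not the formulation used in the paper.

```python
# Illustrative sketch only: a regularized least-squares stand-in for
# "learning a linear combination of dissimilarities from equivalence
# constraints". The objective and solver are assumptions, not the paper's.

def combined(dissims, weights, i, j):
    """Weighted sum of the base dissimilarities for the pair (i, j)."""
    return sum(w * d[i][j] for w, d in zip(weights, dissims))


def learn_weights(dissims, similar, dissimilar, lam=0.01, lr=0.1, steps=500):
    """Minimize mean squared error to the pair targets plus lam * ||w||^2,
    keeping the weights non-negative via projected gradient descent."""
    # Assumed targets: similar pairs should be close (0), dissimilar far (1).
    pairs = [(i, j, 0.0) for i, j in similar] + [(i, j, 1.0) for i, j in dissimilar]
    m = len(dissims)
    weights = [1.0 / m] * m  # start from a uniform combination
    for _ in range(steps):
        # Gradient of the L2 penalty (the assumed smoothing term).
        grad = [2.0 * lam * w for w in weights]
        # Gradient of the mean squared error over the constrained pairs.
        for i, j, target in pairs:
            err = combined(dissims, weights, i, j) - target
            for k, d in enumerate(dissims):
                grad[k] += 2.0 * err * d[i][j] / len(pairs)
        # Gradient step followed by projection onto w >= 0.
        weights = [max(0.0, w - lr * g) for w, g in zip(weights, grad)]
    return weights


# Toy data: four samples, classes {0, 1} and {2, 3}.
# D_GOOD agrees with the classes; D_BAD contradicts them.
D_GOOD = [[0.0, 0.1, 1.0, 1.0],
          [0.1, 0.0, 1.0, 1.0],
          [1.0, 1.0, 0.0, 0.1],
          [1.0, 1.0, 0.1, 0.0]]
D_BAD = [[0.0, 0.9, 0.2, 0.2],
         [0.9, 0.0, 0.2, 0.2],
         [0.2, 0.2, 0.0, 0.9],
         [0.2, 0.2, 0.9, 0.0]]
DISSIMS = [D_GOOD, D_BAD]

SIMILAR = [(0, 1), (2, 3)]      # pairs labelled as similar
DISSIMILAR = [(0, 2), (1, 3)]   # pairs labelled as dissimilar

weights = learn_weights(DISSIMS, SIMILAR, DISSIMILAR)
print("learned weights:", weights)
```

On this toy data the weight of the dissimilarity that agrees with the constraints should dominate, while the contradictory one is driven to zero. Because the objective is quadratic in the weights, a dedicated quadratic programming solver could replace the gradient loop; the loop is used here only to keep the sketch dependency-free.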

Author information

Authors and Affiliations

Universidad Pontificia de Salamanca, C/Compañía 5, 37002, Salamanca, Spain
Manuel Martín-Merino

Editor information

Editors and Affiliations

Department of Informatics, TEI of Thessaloniki, 57400, Sindos, Greece
Konstantinos Diamantaras
Department of Informatics, Nicolaus Copernicus University, School of Physics, Astronomy, and Informatics, ul. Grudziadzka 5, 87-100, Torun, Poland
Wlodek Duch
Department of Forestry and Management of the Environment and Natural Resources, Democritus University of Thrace, Pantazidou 193, 68200, Orestiada Thrace, Greece
Lazaros S. Iliadis

About this paper

Cite this paper

Martín-Merino, M. (2010). Learning a Combination of Heterogeneous Dissimilarities from Incomplete Knowledge. In: Diamantaras, K., Duch, W., Iliadis, L.S. (eds) Artificial Neural Networks – ICANN 2010. ICANN 2010. Lecture Notes in Computer Science, vol 6354. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15825-4_7

DOI: https://doi.org/10.1007/978-3-642-15825-4_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15824-7
Online ISBN: 978-3-642-15825-4
eBook Packages: Computer Science, Computer Science (R0)
