Abstract
Pattern Recognition algorithms depend strongly on the dissimilarity measure used to evaluate sample proximities. In real applications, several dissimilarities are often available, arising from different object representations or data sources. Each dissimilarity usually provides complementary information about the problem; therefore, they should be integrated to reflect the object proximities accurately.
In many applications, user feedback or a priori knowledge about the problem provides pairs of similar and dissimilar examples. In this paper, we address the problem of learning a linear combination of dissimilarities using side information in the form of equivalence constraints. The error function is minimized with a quadratic optimization algorithm, and a smoothing term is included that penalizes the complexity of the family of distances and avoids overfitting.
The experimental results suggest that the proposed method outperforms a standard metric learning algorithm and improves classification and clustering results obtained with a single dissimilarity or data source.
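The idea of the abstract can be sketched as follows. This is a minimal illustration, not the authors' exact algorithm: it fits non-negative weights for a linear combination of dissimilarity matrices by regularized least squares, using equivalence constraints as targets (small combined distance for similar pairs, large for dissimilar pairs) and a ridge term playing the role of the smoothing penalty. The function name and the choice of targets are assumptions made for this sketch.

```python
import numpy as np

def learn_dissimilarity_weights(D_list, similar, dissimilar, lam=0.1):
    """Learn non-negative weights w for the combined dissimilarity
    D(i, j) = sum_k w_k * D_k(i, j) from equivalence constraints.

    Illustrative sketch: similar pairs get target 0, dissimilar pairs
    target 1, and lam * ||w||^2 is a smoothing term that penalizes the
    complexity of the family of distances. The ridge problem is solved
    in closed form; weights are then clipped to the non-negative orthant.
    """
    pairs = similar + dissimilar
    # One row per constrained pair, one column per candidate dissimilarity.
    X = np.array([[D[i, j] for D in D_list] for (i, j) in pairs])
    y = np.array([0.0] * len(similar) + [1.0] * len(dissimilar))
    K = X.shape[1]
    # Regularized quadratic problem: min_w ||X w - y||^2 + lam ||w||^2.
    w = np.linalg.solve(X.T @ X + lam * np.eye(K), X.T @ y)
    return np.clip(w, 0.0, None)
```

As a toy usage: with one informative dissimilarity (absolute differences of well-separated points) and one uninformative one (constant off-diagonal), the learned combination puts its weight on the informative matrix, so constrained similar pairs end up closer than dissimilar pairs under the combined distance.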
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Martín-Merino, M. (2011). Fusing Heterogeneous Data Sources Considering a Set of Equivalence Constraints. In: Cabestany, J., Rojas, I., Joya, G. (eds) Advances in Computational Intelligence. IWANN 2011. Lecture Notes in Computer Science, vol 6691. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21501-8_12
Print ISBN: 978-3-642-21500-1
Online ISBN: 978-3-642-21501-8