Abstract
Similarity measurement is crucial for both unsupervised and semi-supervised learning: unsupervised methods need a similarity measure to perform clustering, and semi-supervised algorithms need one to take advantage of unlabeled data. In this paper, we develop a boosted similarity learning algorithm in which the ensemble similarity is the weighted sum of a few component similarities. Each component similarity is learned from a graph G(V, E), where the vertices \(V=\{x_1, x_2,\ldots ,x_n\}\) represent the data points and the edges E represent the distances (or similarities) between them. For a given graph, we propose the “within graph-cluster scatter \(S_{w}\)” and the “between graph-cluster scatter \(S_{b}\)” to analyze how discriminative the graph is. The contributions of this paper are: (i) a boosting similarity learning strategy that takes advantage of several graphs rather than only one; (ii) the “within graph-cluster scatter \(S_{w}\)” and “between graph-cluster scatter \(S_{b}\)” as measures of the discriminative power of a graph. Experimental results on both synthetic and publicly available data sets show that the proposed method outperforms state-of-the-art methods.
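As a rough illustration of the statement that the ensemble similarity is a weighted sum of component similarities, the following minimal Python sketch combines several similarity matrices with fixed weights. Everything specific in it is an assumption made for illustration only: the Gaussian component similarities, the bandwidths, the weight values, and the function names `gaussian_similarity` and `ensemble_similarity` are hypothetical. The abstract does not say how the components are learned from the graphs or how the boosting procedure chooses the weights, so this should not be read as the paper's algorithm.

```python
import math


def gaussian_similarity(points, sigma):
    """Hypothetical component similarity: a Gaussian kernel on squared
    Euclidean distances. The paper learns each component from a graph;
    this stand-in only gives us something concrete to combine."""
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            sim[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return sim


def ensemble_similarity(components, weights):
    """Entry-wise weighted sum of component similarity matrices."""
    if len(components) != len(weights):
        raise ValueError("need exactly one weight per component similarity")
    n = len(components[0])
    return [
        [sum(w * comp[i][j] for w, comp in zip(weights, components))
         for j in range(n)]
        for i in range(n)
    ]


# Toy data: two nearby points and one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (3.0, 3.0)]

# Three assumed components (different bandwidths) and assumed weights.
components = [gaussian_similarity(points, s) for s in (0.5, 1.0, 2.0)]
weights = [0.5, 0.3, 0.2]

S = ensemble_similarity(components, weights)
```

With these assumed inputs, the combined matrix `S` stays symmetric and assigns the nearby pair a higher similarity than the distant pair. In the paper's setting the weights would come from the boosting procedure rather than being fixed by hand.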
Acknowledgements
The authors would like to thank the anonymous reviewers for their insightful comments which helped to improve the paper. They also gratefully thank Dr. Weifu Chen, Hao Fu and Xin Tang for helpful and informative discussion on the experiments.
The authors would like to thank the Scientific Research Foundation of Hebei Education Department (No. QN2015080, No. QN2018126), Hebei University of Economics and Business Foundation (No. 2015KYQ07), NSF of China (No. 11401163, No. 61602148), Hebei province high level talent support program (Post doctoral research projects merit aid, No. B2014003013), the Doctoral Fund of Hebei Normal University, China (No. L2012B01, No. L2012B02), the Postdoctoral Fund of Hebei Normal University, and the Natural Science Foundation of Hebei Province (No. F2017207010) for funding this project. The authors would also like to thank the support program for youth top talent of Hebei province.
Cite this article
Wang, Q., Lu, M. Discriminative Graph Based Similarity Boosting. Neural Process Lett 50, 1303–1319 (2019). https://doi.org/10.1007/s11063-018-9918-1