Multi-view local linear KNN classification: theoretical and experimental studies on image classification

Jiang, Zhibin; Bian, Zekang; Wang, Shitong

doi:10.1007/s13042-019-00992-9

Original Article
Published: 02 August 2019

Volume 11, pages 525–543, (2020)
International Journal of Machine Learning and Cybernetics

Zhibin Jiang¹, Zekang Bian¹ & Shitong Wang¹

Abstract

In special multi-view scenarios where the data from every view share the same features, two serious challenges may arise: (1) samples of the same class from different views can be less similar than samples of different classes from the same view, which sometimes happens locally in the training and/or testing phases; (2) training an explicit prediction model becomes unreliable, or even infeasible, for test samples in such scenarios. In this study, we adopt the philosophy of the k nearest neighbor (KNN) method to circumvent the second challenge. Without training an explicit prediction model directly from the multi-view data, we develop a new multi-view local linear k nearest neighbor method (MV-LLKNN) that circumvents both challenges when predicting the label of each test sample. MV-LLKNN rests on two assumptions. The first, which can be justified both theoretically and experimentally, is that any test sample can be well approximated by a linear combination of its neighbors in the multi-view training dataset. The second is that these neighbors should exhibit a clustering property with respect to a certain commonality-based similarity measure between the multi-view test sample and its multi-view neighbors, which avoids the first challenge. MV-LLKNN predicts the label of a multi-view test sample cheaply by using the readily available fast iterative shrinkage-thresholding algorithm (FISTA) together with KNN. Our theoretical analysis and experimental results on real multi-view face datasets indicate the effectiveness of MV-LLKNN.

References

1. Cleuziou G, Exbrayat M, Martin L, Sublemontier J-H (2009) CoFKM: a centralized method for multiple-view clustering. In: Proceedings 9th IEEE ICDM, pp 752–757
2. Huang X, Lei Z, Fan M, Wang X, Li SZ (2013) Regularized discriminative spectral regression method for heterogeneous face matching. IEEE Trans Image Process 22(1):353–362
3. Kan M, Shan S, Zhang H et al (2016) Multi-view discriminant analysis. IEEE Trans Pattern Anal Mach Intell 38(1):188–194
4. Ding Z, Fu Y (2014) Low-rank common subspace for multi-view learning. In: Proceedings IEEE ICDM, pp 110–119
5. Yu J, Rui Y, Tang YY, Tao D (2014) High-order distance-based multiview stochastic learning in image classification. IEEE Trans Cybern 44:2431–2442
6. Jiang Y, Chung F-L, Wang S, Deng Z, Wang J, Qian P (2015) Collaborative fuzzy clustering from multiple weighted views. IEEE Trans Cybern 45(4):688–701
7. Farquhar J, Hardoon D, Meng H, Shawe-Taylor J, Szedmak S (2006) Two view learning: SVM-2K, theory and practice. Adv Neural Inf Process Syst 18:355–362
8. Sun S (2013) Multi-view Laplacian support vector machines. Lect Notes Artif Intell 41(4):209–222
9. Zhu F, Shao L, Lin M (2013) Multi-view action recognition using local similarity random forests and sensor fusion. Pattern Recognit Lett 34(1):20–24
10. Xu Z, Sun S (2010) An algorithm on multi-view AdaBoost. In: Proceedings of 17th international conference on neural information processing, pp 355–362
11. Peng J, Luo P, Guan Z, Fan J (2017) Graph-regularized multi-view semantic subspace learning. Int J Mach Learn Cybern 3(4):1–17
12. Xia T, Tao D, Mei T, Zhang Y (2010) Multiview spectral embedding. IEEE Trans Syst Man Cybern B Cybern 40(6):1438–1446
13. Tzortzis GF, Likas AC (2012) Kernel-based weighted multi-view clustering. In: Proceedings of the 2012 IEEE 12th international conference on data mining, pp 675–684
14. Tzortzis G, Likas A (2009) Convex mixture models for multi-view clustering. In: Proceedings of 19th international conference artificial neural networks, pp 205–214
15. Zong L, Zhang X, Zhao L, Yu H, Zhao Q (2017) Multi-view clustering via multi-manifold regularized non-negative matrix factorization. Neural Netw 88:74–89
16. Kakade SM, Foster DP (2007) Multi-view regression via canonical correlation analysis. In: Proceedings of 20th annual conference on learning theory 2007, pp 82–96
17. Merugu S, Rosset S, Perlich C (2006) A new multi-view regression approach with an application to customer wallet estimation. In: Proceedings 12th ACM SIGKDD international conference on knowledge discovery and data mining, pp 656–661
18. Kusakunniran W, Wu Q, Zhang J, Li H (2010) Support vector regression for multi-view gait recognition based on local motion feature selection. In: Proceedings IEEE conference CVPR, pp 974–981
19. Zhao J, Xie X, Xu X, Sun S (2017) Multi-view learning overview: recent progress and new challenges. Inform Fusion 38:43–54
20. Zhang Z, Xu Y, Yang J, Li X, Zhang D (2015) A survey of sparse representation: algorithms and applications. IEEE Access 3:490–530
21. Wright J, Yang AY, Ganesh A, Sastry S, Ma Y (2009) Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell 31(2):210–227
22. Liu Q, Liu C (2017) A novel locally linear KNN method with applications to visual recognition. IEEE Trans Neural Netw Learn Syst 28(9):2010–2021
23. Zheng H, Zhu J, Yang Z, Jin Z (2017) Effective micro-expression recognition using relaxed K-SVD algorithm. Int J Mach Learn Cybern 8(6):2043–2049
24. Candès E, Romberg J (2007) Sparsity and incoherence in compressive sampling. Inverse Prob 23(3):969
25. Lu X, Wu H, Yuan Y, Yan P, Li X (2013) Manifold regularized sparse NMF for hyperspectral unmixing. IEEE Trans Geosci Remote Sens 51(5):2815–2826
26. Mao W, Wang J, Xue Z (2017) An ELM-based model with sparse-weighting strategy for sequential data imbalance problem. Int J Mach Learn Cybern 8(4):1333–1345
27. Yang J, Yu K, Gong Y, Huang T (2009) Linear spatial pyramid matching using sparse coding for image classification. In: Proceedings IEEE conference CVPR, pp 1794–1801
28. Wang J, Yang J, Yu K et al (2010) Locality-constrained linear coding for image classification. In: Proceedings of IEEE conference on CVPR, pp 3360–3367
29. Gao S, Tsang IW-H, Chia L-T (2013) Laplacian sparse coding, hypergraph Laplacian sparse coding, and applications. IEEE Trans Pattern Anal Mach Intell 35(1):92–104
30. Deng W, Hu J, Guo J (2012) Extended SRC: undersampled face recognition via intraclass variant dictionary. IEEE Trans Pattern Anal Mach Intell 34(9):1864–1870
31. Deng W, Hu J, Guo J (2013) In defense of sparsity based face recognition. In: Proceedings of IEEE conference CVPR, pp 399–406
32. Zhang Q, Li B (2010) Discriminative K-SVD for dictionary learning in face recognition. In: Proceedings of IEEE conference on CVPR, pp 2691–2698
33. Nigam K, Ghani R (2000) Analyzing the effectiveness and applicability of co-training. In: Proceedings of 9th ACM conference CIKM, pp 86–93
34. Muslea I, Minton S, Knoblock C (2006) Active learning with multiple views. J Artif Intell Res 27:203–233
35. Sun S, Jin F (2011) Robust co-training. Int J Pattern Recognit Artif Intell 25(07):1113–1126
36. Huang C, Chung F-L, Wang S (2016) Multi-view L2-SVM and its multiview core vector machine. Neural Netw 75:110–125
37. Sun S, Chao G (2013) Multi-view maximum entropy discrimination. In: Proceedings of 23rd IJCAI, pp 1706–1712
38. Chao G, Sun S (2016) Alternative multi-view maximum entropy discrimination. IEEE Trans Neural Netw Learn Syst 27(07):1445–1456
39. Chao G, Sun S (2016) Consensus and complementarity based maximum entropy discrimination for multi-view classification. Inf Sci 367:296–310
40. Xu C, Tao D, Xu C (2013) A survey on multi-view learning. Computer Science
41. Xia T, Tao D, Mei T, Zhang Y (2010) Multiview spectral embedding. IEEE Trans Syst Man Cybern Part B 40(6):1438–1446
42. Xie B, Mu Y, Tao D, Huang K (2011) m-SNE: multiview stochastic neighbor embedding. IEEE Trans Syst Man Cybern Part B 41(4):1088–1096
43. Han Y, Wu F, Tao D et al (2012) Sparse unsupervised dimensionality reduction for multiple view data. IEEE Trans Circuits Syst Video 22(10):1485
44. Jiang Z, Lin Z, Davis LS (2013) Label consistent K-SVD: learning a discriminative dictionary for recognition. IEEE Trans Pattern Anal Mach Intell 35(11):2651–2664
45. Yang M, Zhang L, Feng X, Zhang D (2014) Sparse representation based Fisher discrimination dictionary learning for image classification. Int J Comput Vis 109(3):209–232
46. Duda RO, Hart PE, Stork DG (2000) Pattern classification, 2nd edn. Wiley, New York
47. Nesterov Y (2004) Introductory lectures on convex optimization: a basic course. Springer, New York
48. Halldorsson GH, Benediktsson JA, Sveinsson JR (2003) Support vector machines in multisource classification. In: Proceedings IGARSS, Toulouse, France, Jul. 2003, pp 2054–2056
49. Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imaging Sci 2(1):183–202
50. Sim T, Baker S, Bsat M (2002) The CMU pose, illumination, and expression (PIE) database. In: Proceedings of the fifth IEEE international conference on automatic face gesture recognition. IEEE, pp 46–51
51. Cai D, He X, Han J (2007) Spectral regression for efficient regularized subspace learning. In: Proceedings of the 11th international conference on computer vision. IEEE, pp 1–8
52. https://www.cl.cam.ac.uk/Research/DTG/attarchive/facedatabase.html
53. Samaria FS, Harter AC (1994) Parameterisation of a stochastic model for human face identification. In: Proceedings of 1994 IEEE workshop on applications of computer vision, pp 138–142
54. Sharmanska V, Quadrianto N, Lampert CH (2013) Learning to rank using privileged information. In: Proceedings of 14th IEEE ICCV, pp 825–832
55. Motiian S, Piccirilli M, Adjeroh DA, Doretto G (2016) Information bottleneck learning using privileged information for visual recognition. In: Proceedings of international conference on computer vision and pattern recognition, June 2016, pp 1496–1505
56. Parambath SP, Usunier N, Grandvalet Y (2014) Optimizing F-measures by cost-sensitive classification. In: Proceedings NIPS, pp 2123–2131
57. Jiang Y, Deng Z, Chung F-L, Wang S (2017) Realizing two-view TSK fuzzy classification system by using collaborative learning. IEEE Trans Syst Man Cybern 47(1):145–160
58. Jiang Y, Deng Z, Chung F-L, Wang G, Qian P, Choi K-S, Wang S (2017) Recognition of epileptic EEG signals using a novel multiview TSK fuzzy system. IEEE Trans Fuzzy Syst 25(1):3–20
59. Wang X, Lu S, Zhai J (2008) Fast fuzzy multicategory SVM based on support vector domain description. Int J Pattern Recognit 22(1):109–120
60. Turk M, Pentland A (1991) Eigenfaces for recognition. J Cognit Neurosci 3(1):71–86
61. Comaniciu D, Meer P (1999) Mean shift analysis and applications. In: Proceedings of 7th IEEE ICCV, pp 1197–1203

Author information

Zhibin Jiang and Zekang Bian contributed equally to this study.

Authors and Affiliations

School of Digital Media, Jiangnan University, Wuxi, Jiangsu, People’s Republic of China
Zhibin Jiang, Zekang Bian & Shitong Wang

Corresponding author

Correspondence to Shitong Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

Proof of Theorem 1.

Proof:

If ${\mathbf{w}}^{k}$ is the representation obtained by MV-LLKNN on the multi-view data, then $J\left( {{\mathbf{w}}^{k} } \right) \le J\left( {\mathbf{0}} \right)$, that is,

$$\left\| {{\mathbf{x}}^{k} - {\mathbf{A}}^{k} {\mathbf{w}}^{k} } \right\|_{2}^{2} + \alpha \left\| {{\mathbf{w}}^{k} } \right\|_{1} + \beta \left\| {{\mathbf{w}}^{k} - \eta {\mathbf{s}}} \right\|_{2}^{2} \le \left\| {{\mathbf{x}}^{k} } \right\|^{2} + \beta \left\| {\eta {\mathbf{s}}} \right\|^{2}$$

(39)

Since

$$\left\| {{\mathbf{w}}^{k} - \eta {\mathbf{s}}} \right\|_{2} \le \left( {\frac{1}{\beta } + \left\| {\eta {\mathbf{s}}} \right\|^{2} } \right)^{{\frac{1}{2}}}$$

(40)

we know that $\left\| {{\mathbf{w}}^{k} - \eta {\mathbf{s}}} \right\|_{2}$ is actually bounded by a small positive constant. That is to say, ${\mathbf{w}}^{k} \approx \eta {\mathbf{s}} + {\mathbf{const}}$.
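
The step from Eq. (39) to Eq. (40) can be spelled out as follows. This is a sketch that assumes the test sample is normalized so that $\left\| {\mathbf{x}}^{k} \right\|^{2} = 1$, the normalization used in Appendix 2; the first two terms of Eq. (39) are nonnegative and can be dropped from the left-hand side.

$$\begin{aligned} \beta \left\| {\mathbf{w}}^{k} - \eta {\mathbf{s}} \right\|_{2}^{2} & \le \left\| {\mathbf{x}}^{k} - {\mathbf{A}}^{k} {\mathbf{w}}^{k} \right\|_{2}^{2} + \alpha \left\| {\mathbf{w}}^{k} \right\|_{1} + \beta \left\| {\mathbf{w}}^{k} - \eta {\mathbf{s}} \right\|_{2}^{2} \\ & \le \left\| {\mathbf{x}}^{k} \right\|^{2} + \beta \left\| \eta {\mathbf{s}} \right\|^{2} = 1 + \beta \left\| \eta {\mathbf{s}} \right\|^{2} \\ \end{aligned}$$

Dividing both sides by $\beta$ and taking the square root gives $\left\| {\mathbf{w}}^{k} - \eta {\mathbf{s}} \right\|_{2} \le \left( \frac{1}{\beta} + \left\| \eta {\mathbf{s}} \right\|^{2} \right)^{\frac{1}{2}}$.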

The transformations in Eqs. (12) and (13) guarantee that each entry of ${\mathbf{w}}^{k}$ satisfies $0 \le w_{i}^{k} \le 1$ and $\sum\nolimits_{i = 1}^{m} {w_{i}^{k} } = 1$. It is worth noting that these transformations do not affect the classification results. Based on the similarity measure between the test sample and the training samples, MV-LLKNN$_{+}$ and MV-LLKNN$_{*}$ are designed.

For MV-LLKNN$_{+}$, the decision rule can be approximated as follows:

$$\begin{aligned} c^{ * } & = \mathop {\arg \max }\limits_{c} \sum\limits_{k = 1}^{K} {\sum\limits_{{{\mathbf{a}}_{i}^{k} \in {\mathbf{A}}_{c}^{k} }} {w_{i}^{k} } } \\ & \approx \mathop {\arg \max }\limits_{c} \sum\limits_{k = 1}^{K} {\sum\limits_{{{\mathbf{a}}_{i}^{k} \in {\mathbf{A}}_{c}^{k} }} {\left( {\eta s_{i} + \text{const}} \right)} } \\ & \propto \mathop {\arg \max }\limits_{c} \sum\limits_{k = 1}^{K} {\sum\limits_{{{\mathbf{a}}_{i}^{k} \in {\mathbf{A}}_{c}^{k} }} {\left( {1 - \frac{{\sum\limits_{l = 1}^{K} {\gamma_{l} \left\| {{\mathbf{x}}^{l} - {\mathbf{a}}_{i}^{l} } \right\|^{2} } }}{{2\sigma^{2} }}} \right)} } \\ \end{aligned}$$

(41)

In this study, let us consider the Epanechnikov kernel [61]: $h\left( u \right) = \frac{3}{4}\left( {1 - u^{2} } \right)$. Then

$$\begin{aligned} c^{ * } & = \mathop {\arg \max }\limits_{c} \sum\limits_{k = 1}^{K} {\sum\limits_{{{\mathbf{a}}_{i}^{k} \in {\mathbf{A}}_{c}^{k} }} {w_{i}^{k} } } \\ & \propto \mathop {\arg \max }\limits_{c} \sum\limits_{k = 1}^{K} {\sum\limits_{{{\mathbf{a}}_{i}^{k} \in {\mathbf{A}}_{c}^{k} }} {\sum\limits_{l = 1}^{K} {h\left( {\frac{{{\mathbf{x}}^{l} - {\mathbf{a}}_{i}^{l} }}{\sigma }} \right)} } } \\ \end{aligned}$$

(42)

where $\sum\nolimits_{{{\mathbf{a}}_{i}^{k} \in {\mathbf{A}}_{c}^{k} }} {\sum\nolimits_{l = 1}^{K} {h\left( {\frac{{{\mathbf{x}}^{l} - {\mathbf{a}}_{i}^{l} }}{\sigma }} \right)} }$ is a kernel density estimate of the conditional probability $p\left( {{\mathbf{x}}^{k} \left| c \right.} \right)\left( {k = 1,2, \ldots ,K} \right)$.
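
The inner double sum of Eq. (42) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `epanechnikov` and `class_kernel_score` are hypothetical, the kernel is applied to a vector argument by reading $u^{2}$ as the squared Euclidean norm, and the formula is used exactly as written above, without restricting it to $\|u\| \le 1$.

```python
import numpy as np

def epanechnikov(u):
    """h(u) = 3/4 * (1 - ||u||^2), applied row-wise to an array of vectors."""
    u = np.atleast_2d(u)
    return 0.75 * (1.0 - np.sum(u * u, axis=1))

def class_kernel_score(x_views, a_views, sigma):
    """Inner double sum of Eq. (42) for one class c.

    x_views: list of K arrays; x_views[l] has shape (d_l,), the test sample in view l
    a_views: list of K arrays; a_views[l] has shape (n_c, d_l), the class-c
             training samples in view l (row i is a_i^l)
    sigma:   kernel bandwidth
    """
    total = 0.0
    for x_l, a_l in zip(x_views, a_views):
        # Broadcasting subtracts the test sample from every training row.
        total += epanechnikov((x_l - a_l) / sigma).sum()
    return total
```

A class whose training samples lie close to the test sample in every view receives a larger score, which is the sense in which the sum behaves like a density estimate.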

Each view is assumed to be classified separately and independently. Therefore, if the prior probability $p\left( c \right)$ is the same for all the classes, then

$$\begin{aligned} c^{ * } & = \mathop {\arg \max }\limits_{c} \sum\limits_{k = 1}^{K} {\sum\limits_{{{\mathbf{a}}_{i}^{k} \in {\mathbf{A}}_{c}^{k} }} {w_{i}^{k} } } \\ & \approx \mathop {\arg \max }\limits_{c} \sum\limits_{k = 1}^{K} {p\left( {{\mathbf{x}}^{k} \left| c \right.} \right)} \\ & \propto \mathop {\arg \max }\limits_{c} \sum\limits_{k = 1}^{K} {p\left( {c\left| {{\mathbf{x}}^{k} } \right.} \right)} \quad \left( {\text{i.e., Bayes classifier}} \right) \\ \end{aligned}$$

(43)

Similarly, we have the following derivation for MV-LLKNN$_{*}$:

$$\begin{aligned} c^{ * } & = \mathop {\arg \max }\limits_{c} \prod\limits_{k = 1}^{K} {\sum\limits_{{{\mathbf{a}}_{i}^{k} \in {\mathbf{A}}_{c}^{k} }} {w_{i}^{k} } } \\ & \approx \mathop {\arg \max }\limits_{c} \prod\limits_{k = 1}^{K} {\sum\limits_{{{\mathbf{a}}_{i}^{k} \in {\mathbf{A}}_{c}^{k} }} {\left( {\eta s_{i} + \text{const}} \right)} } \\ & \propto \mathop {\arg \max }\limits_{c} \prod\limits_{k = 1}^{K} {\sum\limits_{{{\mathbf{a}}_{i}^{k} \in {\mathbf{A}}_{c}^{k} }} {\prod\limits_{l = 1}^{K} {\left( {1 - \frac{{\left\| {{\mathbf{x}}^{l} - {\mathbf{a}}_{i}^{l} } \right\|^{2} }}{{\sigma^{2} }}} \right)^{{\gamma_{l} }} } } } \\ \end{aligned}$$

(44)

Then, considering the Epanechnikov kernel again, we have

$$\begin{aligned} c^{ * } & = \mathop {\arg \max }\limits_{c} \prod\limits_{k = 1}^{K} {\sum\limits_{{{\mathbf{a}}_{i}^{k} \in {\mathbf{A}}_{c}^{k} }} {w_{i}^{k} } } \\ & \propto \mathop {\arg \max }\limits_{c} \prod\limits_{k = 1}^{K} {\sum\limits_{{{\mathbf{a}}_{i}^{k} \in {\mathbf{A}}_{c}^{k} }} {\prod\limits_{l = 1}^{K} {h\left( {\frac{{{\mathbf{x}}^{l} - {\mathbf{a}}_{i}^{l} }}{\sigma }} \right)} } } \\ \end{aligned}$$

(45)

Therefore, if the prior probability $p\left( c \right)$ is the same for all the classes, then

$$\begin{aligned} c^{ * } & = \mathop {\arg \max }\limits_{c} \prod\limits_{k = 1}^{K} {\sum\limits_{{{\mathbf{a}}_{i}^{k} \in {\mathbf{A}}_{c}^{k} }} {w_{i}^{k} } } \\ & \approx \mathop {\arg \max }\limits_{c} \prod\limits_{k = 1}^{K} {p\left( {{\mathbf{x}}^{k} \left| c \right.} \right)} \\ & \propto \mathop {\arg \max }\limits_{c} \prod\limits_{k = 1}^{K} {p\left( {c\left| {{\mathbf{x}}^{k} } \right.} \right)} \quad \left( {\text{i.e., Bayes classifier}} \right) \\ \end{aligned}$$

(46)
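
The last step of Eq. (46) is Bayes' rule applied view by view. Written out, with equal priors $p\left( c \right)$ and using the fact that $p\left( {\mathbf{x}}^{k} \right)$ does not depend on $c$:

$$\prod\limits_{k = 1}^{K} p\left( c \left| {\mathbf{x}}^{k} \right. \right) = \prod\limits_{k = 1}^{K} \frac{p\left( {\mathbf{x}}^{k} \left| c \right. \right) p\left( c \right)}{p\left( {\mathbf{x}}^{k} \right)} = \frac{p\left( c \right)^{K}}{\prod\nolimits_{k = 1}^{K} p\left( {\mathbf{x}}^{k} \right)} \prod\limits_{k = 1}^{K} p\left( {\mathbf{x}}^{k} \left| c \right. \right)$$

The leading factor is the same for every class, so the arg max over $c$ is unchanged.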

In summary, from the perspective of density estimation, MV-LLKNN$_{+}$ and MV-LLKNN$_{*}$ approximate the Bayes decision rule for minimum error, and the approximation error mainly comes from ${\mathbf{w}}^{k} \approx \eta {\mathbf{s}} + {\mathbf{const}}$ and from the kernel density estimation error.
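
The two decision rules that open Eqs. (41) and (44) differ only in how the per-view class scores are combined: a sum over views for MV-LLKNN$_{+}$ and a product over views for MV-LLKNN$_{*}$. The following Python sketch illustrates this. The function names are hypothetical, and the weights are assumed to have already been computed and normalized as described for Eqs. (12) and (13).

```python
import numpy as np

def class_scores(weights, labels, n_classes):
    """Per-view, per-class sums of representation weights.

    weights: array (K, m); weights[k, i] is w_i^k
    labels:  int array (m,); class index of training sample i
    returns: array (K, n_classes); entry [k, c] is the sum of w_i^k over
             the training samples i that belong to class c
    """
    weights = np.asarray(weights, dtype=float)
    labels = np.asarray(labels)
    scores = np.zeros((weights.shape[0], n_classes))
    for c in range(n_classes):
        scores[:, c] = weights[:, labels == c].sum(axis=1)
    return scores

def predict_sum_rule(weights, labels, n_classes):
    """Sum the class scores over views, as in the first line of Eq. (41)."""
    return int(np.argmax(class_scores(weights, labels, n_classes).sum(axis=0)))

def predict_product_rule(weights, labels, n_classes):
    """Multiply the class scores over views, as in the first line of Eq. (44)."""
    return int(np.argmax(class_scores(weights, labels, n_classes).prod(axis=0)))

# Two views, three training samples, one per class.
w = np.array([[0.8, 0.2, 0.0],
              [0.0, 0.5, 0.5]])
y = np.array([0, 1, 2])
print(predict_sum_rule(w, y, 3))      # 0: class 0 has the largest summed score (0.8)
print(predict_product_rule(w, y, 3))  # 1: only class 1 is supported by both views
```

In this toy example the sum rule follows the single view with the strongest evidence, whereas the product rule prefers the class that every view supports.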

Appendix 2

Proof of Theorem 2.

Proof

Observe that Eq. (3) is equivalent to

$$\begin{aligned} \mathop {\arg \min }\limits_{{\left[ {w_{1}^{k} ,w_{2}^{k} , \ldots ,w_{m}^{k} } \right]^{T} }} J\left( {\left[ {w_{1}^{k} ,w_{2}^{k} , \ldots ,w_{m}^{k} } \right]^{T} } \right) & = \sum\limits_{k = 1}^{K} {\left\| {{\mathbf{x}}^{k} - \sum\limits_{i = 1}^{m} {w_{i}^{k} {\mathbf{a}}_{i}^{k} } } \right\|_{2}^{2} } \\ & \quad + \alpha \sum\limits_{k = 1}^{K} {\sum\limits_{i = 1}^{m} {\left| {w_{i}^{k} } \right|} } + \beta \sum\limits_{k = 1}^{K} {\sum\limits_{i = 1}^{m} {\left( {w_{i}^{k} - \eta s_{i} } \right)^{2} } } \\ \end{aligned}$$

(47)

Let ${\tilde{\mathbf{w}}}^{k} = \left[ {\tilde{w}_{1}^{k} ,\tilde{w}_{2}^{k} , \ldots ,\tilde{w}_{m}^{k} } \right]^{T}$ be the representation obtained by MV-LLKNN on the multi-view data. We take the derivatives of $J$ with respect to $\tilde{w}_{i}^{k}$ and $\tilde{w}_{j}^{k}$:

$$\begin{aligned} \frac{\partial J}{{\partial \tilde{w}_{i}^{k} }} & = - 2\left( {{\mathbf{a}}_{i}^{k} } \right)^{T} \left( {{\mathbf{x}}^{k} - {\mathbf{A}}^{k} {\tilde{\mathbf{w}}}^{k} } \right) + \alpha \,{\text{sign}}\left( {\tilde{w}_{i}^{k} } \right) \\ & \quad + 2\beta \left( {\tilde{w}_{i}^{k} - \eta s_{i} } \right) \\ \end{aligned}$$

(48)

$$\begin{aligned} \frac{\partial J}{{\partial \tilde{w}_{j}^{k} }} & = - 2\left( {{\mathbf{a}}_{j}^{k} } \right)^{T} \left( {{\mathbf{x}}^{k} - {\mathbf{A}}^{k} {\tilde{\mathbf{w}}}^{k} } \right) + \alpha \,{\text{sign}}\left( {\tilde{w}_{j}^{k} } \right) \\ & \quad + 2\beta \left( {\tilde{w}_{j}^{k} - \eta s_{j} } \right) \\ \end{aligned}$$

(49)

Set the above two derivatives to zero. Since ${\text{sign}}(\tilde{w}_{i}^{k} ) = {\text{sign}}(\tilde{w}_{j}^{k} )$, the $\alpha$ terms cancel in the difference $\frac{\partial J}{{\partial \tilde{w}_{i}^{k} }} - \frac{\partial J}{{\partial \tilde{w}_{j}^{k} }} = 0$, which gives:

$$\begin{aligned} \beta \left( {\tilde{w}_{i}^{k} - \tilde{w}_{j}^{k} } \right) & = \left( {\left( {{\mathbf{a}}_{i}^{k} } \right)^{T} - \left( {{\mathbf{a}}_{j}^{k} } \right)^{T} } \right)\left( {{\mathbf{x}}^{k} - {\mathbf{A}}^{k} {\tilde{\mathbf{w}}}^{k} } \right) \\ & \quad + \beta \eta \left( {s_{i} - s_{j} } \right) \\ \end{aligned}$$

(50)
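
For reference, the subtraction that leads to Eq. (50) can be written out term by term, with both derivatives in Eqs. (48) and (49) set to zero and the two signs equal:

$$\begin{aligned} 0 & = \frac{\partial J}{\partial \tilde{w}_{i}^{k}} - \frac{\partial J}{\partial \tilde{w}_{j}^{k}} \\ & = - 2\left( {\mathbf{a}}_{i}^{k} - {\mathbf{a}}_{j}^{k} \right)^{T} \left( {\mathbf{x}}^{k} - {\mathbf{A}}^{k} {\tilde{\mathbf{w}}}^{k} \right) + \alpha \left( {\text{sign}}\left( \tilde{w}_{i}^{k} \right) - {\text{sign}}\left( \tilde{w}_{j}^{k} \right) \right) + 2\beta \left( \tilde{w}_{i}^{k} - \tilde{w}_{j}^{k} \right) - 2\beta \eta \left( s_{i} - s_{j} \right) \\ & = - 2\left( {\mathbf{a}}_{i}^{k} - {\mathbf{a}}_{j}^{k} \right)^{T} \left( {\mathbf{x}}^{k} - {\mathbf{A}}^{k} {\tilde{\mathbf{w}}}^{k} \right) + 2\beta \left( \tilde{w}_{i}^{k} - \tilde{w}_{j}^{k} \right) - 2\beta \eta \left( s_{i} - s_{j} \right) \\ \end{aligned}$$

Dividing by 2 and moving the first and last terms to the other side yields Eq. (50).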

Using $J\left( {{\tilde{\mathbf{w}}}^{k} } \right) \le J\left( {\mathbf{0}} \right)$ and $\left\| {{\mathbf{x}}^{k} } \right\|^{2} = 1$, we obtain:

$$\begin{aligned} \left| {\tilde{w}_{i}^{k} - \tilde{w}_{j}^{k} } \right| & = \frac{1}{\beta }\left| {\left( {\left( {{\mathbf{a}}_{i}^{k} } \right)^{T} - \left( {{\mathbf{a}}_{j}^{k} } \right)^{T} } \right)\left( {{\mathbf{x}}^{k} - {\mathbf{A}}^{k} {\tilde{\mathbf{w}}}^{k} } \right) + \beta \eta \left( {s_{i} - s_{j} } \right)} \right| \\ & \le \frac{1}{\beta }\left| {\left( {\left( {{\mathbf{a}}_{i}^{k} } \right)^{T} - \left( {{\mathbf{a}}_{j}^{k} } \right)^{T} } \right)\left( {{\mathbf{x}}^{k} - {\mathbf{A}}^{k} {\tilde{\mathbf{w}}}^{k} } \right)} \right| + \eta \left| {s_{i} - s_{j} } \right| \\ & \le \frac{1}{\beta }\left\| {{\mathbf{a}}_{i}^{k} - {\mathbf{a}}_{j}^{k} } \right\|_{2} \left\| {{\mathbf{x}}^{k} - {\mathbf{A}}^{k} {\tilde{\mathbf{w}}}^{k} } \right\|_{2} + \eta \left| {s_{i} - s_{j} } \right| \\ & = \frac{1}{\beta }\sqrt {2\left( {1 - \delta_{k} } \right)} \left\| {{\mathbf{x}}^{k} - {\mathbf{A}}^{k} {\tilde{\mathbf{w}}}^{k} } \right\|_{2} + \eta \left| {s_{i} - s_{j} } \right| \\ \end{aligned}$$

(51)

Then, since

$$\left\| {{\mathbf{x}}^{k} - {\mathbf{A}}^{k} {\tilde{\mathbf{w}}}^{k} } \right\|_{2}^{2} \le \left\| {{\mathbf{x}}^{k} } \right\|^{2} + \beta \left\| {\eta {\mathbf{s}}} \right\|^{2}$$

(52)

we have

$$\begin{aligned} \left| {\tilde{w}_{i}^{k} - \tilde{w}_{j}^{k} } \right| & \le \frac{1}{\beta }\sqrt {2\left( {1 - \delta_{k} } \right)\left( {\left\| {{\mathbf{x}}^{k} } \right\|^{2} + \beta \eta^{2} \left\| {\mathbf{s}} \right\|^{2} } \right)} + \eta \left| {s_{i} - s_{j} } \right| \\ & = \frac{G}{\beta }\sqrt {2\left( {1 - \delta_{k} } \right)} + \eta \left| {s_{i} - s_{j} } \right| \\ \end{aligned}$$

(53)

where $G = \sqrt {1 + \beta \eta^{2} \left\| {\mathbf{s}} \right\|^{2} }$.
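
The bounds in Eqs. (39), (40) and (53) can be checked numerically on a single view. The sketch below minimizes the per-view objective of Eq. (47) with a plain FISTA loop [49] and then evaluates the three inequalities. Several things here are assumptions made only for illustration: the data are synthetic, the columns of `A` and the sample `x` are scaled to unit norm, $\delta_{k}$ is read as the inner product of the two unit-norm training samples, the similarity vector `s` is a Gaussian placeholder rather than the paper's similarity measure, and all function names are hypothetical.

```python
import numpy as np

def objective(w, x, A, s, alpha, beta, eta):
    """Single-view J(w) = ||x - A w||^2 + alpha*||w||_1 + beta*||w - eta*s||^2."""
    r = x - A @ w
    d = w - eta * s
    return r @ r + alpha * np.abs(w).sum() + beta * (d @ d)

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista(x, A, s, alpha, beta, eta, n_iter=2000):
    """Minimize J with FISTA, starting from w = 0."""
    # Lipschitz constant of the gradient of the smooth part of J.
    L = 2.0 * (np.linalg.norm(A, 2) ** 2 + beta)
    w = np.zeros(A.shape[1])
    y = w.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ y - x) + 2.0 * beta * (y - eta * s)
        w_next = soft_threshold(y - grad / L, alpha / L)
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = w_next + ((t - 1.0) / t_next) * (w_next - w)
        w, t = w_next, t_next
    return w

def pairwise_bound_gap(w, A, s, beta, eta):
    """Largest |w_i - w_j| minus the right-hand side of Eq. (53), taken over
    pairs of nonzero entries with equal sign; non-positive if the bound holds."""
    G = np.sqrt(1.0 + beta * eta ** 2 * (s @ s))
    gap = -np.inf
    idx = np.flatnonzero(w)
    for p in range(len(idx)):
        for q in range(p + 1, len(idx)):
            i, j = idx[p], idx[q]
            if np.sign(w[i]) != np.sign(w[j]):
                continue
            delta = A[:, i] @ A[:, j]  # assumed meaning of delta_k
            rhs = (G / beta) * np.sqrt(2.0 * (1.0 - delta)) + eta * abs(s[i] - s[j])
            gap = max(gap, abs(w[i] - w[j]) - rhs)
    return gap

# Synthetic single-view data (illustrative only).
rng = np.random.default_rng(0)
d, m = 30, 20
A = rng.normal(size=(d, m))
A /= np.linalg.norm(A, axis=0)                    # unit-norm training samples
x = A[:, :3] @ np.array([0.5, 0.3, 0.2]) + 0.01 * rng.normal(size=d)
x /= np.linalg.norm(x)                            # ||x||^2 = 1
s = np.exp(-np.sum((A - x[:, None]) ** 2, axis=0) / 2.0)  # placeholder similarity
alpha, beta, eta = 0.01, 1.0, 0.5

w = fista(x, A, s, alpha, beta, eta)
print(objective(w, x, A, s, alpha, beta, eta) <= objective(np.zeros(m), x, A, s, alpha, beta, eta))
print(np.linalg.norm(w - eta * s) <= np.sqrt(1.0 / beta + np.sum((eta * s) ** 2)))
print(pairwise_bound_gap(w, A, s, beta, eta) <= 0.0)
```

The three printed checks correspond to $J\left( {\tilde{\mathbf{w}}}^{k} \right) \le J\left( {\mathbf{0}} \right)$, Eq. (40) and Eq. (53), respectively. Such a check only illustrates the inequalities on one random instance and is no substitute for the proof.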

About this article

Cite this article

Jiang, Z., Bian, Z. & Wang, S. Multi-view local linear KNN classification: theoretical and experimental studies on image classification. Int. J. Mach. Learn. & Cyber. 11, 525–543 (2020). https://doi.org/10.1007/s13042-019-00992-9

Received: 27 March 2018
Accepted: 23 July 2019
Published: 02 August 2019
Issue Date: March 2020
DOI: https://doi.org/10.1007/s13042-019-00992-9
