Tree-Based Structural Twin Support Tensor Clustering with Square Loss Function

Rastogi, Reshma; Sharma, Sweta

doi:10.1007/978-3-319-69900-4_4

Conference paper
First Online: 01 November 2017

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10597)

Abstract

Most real-life applications involving images, videos, etc. deal with matrix data (second-order tensor space). Tensor-based clustering models can be used to identify patterns in matrix data because they exploit structural information in a multi-dimensional framework and also reduce computational overhead. Despite these advantages, tensor clustering remains a relatively unexplored research area. In this paper, we propose a novel clustering technique, termed Tree-based Structural Least Squares Twin Support Tensor Clustering (Tree-SLSTWSTC), that builds a cluster model as a binary tree in which each node comprises the proposed Structural Least Squares Twin Support Tensor Machine (S-LSTWSTM) classifier, which performs structural risk minimization alongside a symmetric L2-norm loss function. The proposed approach results in time-efficient learning. An initialization framework based on tensor k-means is proposed and implemented to overcome the instability caused by random initialization. To validate the efficacy of the proposed framework, computational experiments have been performed against relevant tensor-based models on face recognition and optical digit recognition datasets.

1 Introduction

In machine learning applications, particularly image processing, computer vision and bioinformatics, data is often represented in matrix form (second-order tensor space). For example, a gray-scale image is an order-2 tensor and a video is an order-3 tensor. Here, one of the critical tasks is to identify the hidden patterns in the training data [1]. Most commonly, such data are converted to vector form so that vector-based clustering or classification models can be applied. This arrangement, however, suffers from under-representation of the structure and from high dimensionality (and sometimes over-fitting), leading to high training time complexity [2, 3].

Clustering is a powerful technique that aims to group similar elements in the same cluster while maximizing the separation between dissimilar elements. Point-based clustering methods are limited when the data is not distributed around a few cluster points, so plane-based clustering methods such as Maximum Margin Clustering (MMC) and Twin Support Vector Clustering (TWSVC) [4] have recently attracted considerable research interest. Taking motivation from TWSVC, we propose a tree-based framework for clustering second-order tensor data. The main contributions of the paper are the following. First, we propose a modification of the tensor-based LS-TWSTM [5], named Structural LSTWSTM (S-LSTWSTM), a classifier whose convex optimization problems reduce to systems of linear equations and which accounts for the structural risk associated with the data. Then, S-LSTWSTM is extended to a clustering framework based on a binary decision structure, termed Tree-SLSTWSTC, which leads to fast and efficient cluster assignment in the tensor framework. Finally, to make Tree-SLSTWSTC more robust and stable, an initialization technique based on Tensor k-means is proposed.

Experiments on popular image datasets show that the proposed algorithm significantly outperforms other vector- and tensor-based clustering techniques.

The rest of the paper is organized as follows. Section 2 gives the background for our proposed approach. Section 3 discusses our proposed work. Experimental results are presented in Sect. 4. Finally, Sect. 5 concludes our work and states possible directions for future work.

2 Related Work

Let $X=\{X_1, X_2, \ldots, X_m\}$ be a training set of $m$ data samples in second-order tensor space, i.e. $X_i \in \mathbb{R}^{n_1 \times n_2}$. Let $I_1$ denote the set of indices with label $y_i=1$, and $I_2$ the set of indices with label $y_i=-1$.

2.1 Least Squares Twin Support Tensor Machine

Working on the tensor generalization of the Twin Support Vector Machine [7], Zhao et al. [5] proposed the Least Squares Twin Support Tensor Machine (LS-TWSTM), which aims to find a pair of non-parallel hyperplanes given by $f_1(X)=u_1^TXv_1+b_1$ and $f_2(X)=u_2^TXv_2+b_2$, where $u_1, u_2 \in \mathbb{R}^{n_1}$, $v_1, v_2 \in \mathbb{R}^{n_2}$ and $b_1, b_2 \in \mathbb{R}$. The following two quadratic programming problems (QPPs) are solved to find the corresponding non-parallel hyperplanes:

$$\begin{aligned} \text{(LS-TWSTM 1)}\quad &\underset{u_1, v_1, b_1, \xi_2}{\min} \quad \frac{1}{2} \sum_{i \in I_1}(u_1^TX_iv_1+b_1)^2+ c_1\sum_{j\in I_2}\xi_{2j}^2 \\ &\text{subject to} \quad -(u_1^TX_jv_1+b_1)+\xi_{2j}=1, \quad j \in I_2. \end{aligned}$$

$$\begin{aligned} \text{(LS-TWSTM 2)}\quad &\underset{u_2, v_2, b_2, \xi_1}{\min} \quad \frac{1}{2} \sum_{j \in I_2}(u_2^TX_jv_2+b_2)^2+ c_2\sum_{i\in I_1}\xi_{1i}^2 \\ &\text{subject to} \quad (u_2^TX_iv_2+b_2)+\xi_{1i}= 1, \quad i \in I_1. \end{aligned}$$

Since the hyperplane parameters are interdependent, the problems are solved using the alternating projection method [6]. A test point is assigned a label according to its proximity to the two hyperplanes. Please refer to [5] for details.
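To make the bilinear decision functions concrete, the following minimal sketch evaluates $f_k(X)=u_k^TXv_k+b_k$ and assigns a label by proximity. It assumes that proximity is measured by the unnormalised value $|f_k(X)|$; the exact rule used in [5] may differ (for instance by a normalising factor). The function names `bilinear_score` and `assign_label` are hypothetical.

```python
def bilinear_score(u, X, v, b):
    """Return f(X) = u^T X v + b, with X given as a list of rows."""
    return sum(u[i] * X[i][j] * v[j]
               for i in range(len(u))
               for j in range(len(v))) + b


def assign_label(X, plane1, plane2):
    """Return +1 if X is nearer to plane 1, otherwise -1.

    Each plane is a tuple (u, v, b). Assumption: proximity is the
    unnormalised |f_k(X)|; the rule in the cited work may normalise it.
    """
    u1, v1, b1 = plane1
    u2, v2, b2 = plane2
    d1 = abs(bilinear_score(u1, X, v1, b1))
    d2 = abs(bilinear_score(u2, X, v2, b2))
    return 1 if d1 <= d2 else -1
```

Note that each plane has only $n_1+n_2+1$ parameters, compared with $n_1n_2+1$ for a linear function of the vectorised matrix.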

3 Proposed Work

In this work, we first propose a novel tensor classifier termed Structural Least Squares Twin Support Tensor Machine (S-LSTWSTM), which we then use in an unsupervised framework to obtain a binary tree-based clustering approach termed Tree-SLSTWSTC.

3.1 Structural Least Squares Twin Support Tensor Machine

In the spirit of Least Squares Twin Support Tensor Machine (LS-TWSTM) [5], the proposed S-LSTWSTM seeks two non-parallel hyperplanes by considering the following optimization problems:

$$\begin{aligned} \text{(S-LSTWSTM 1)}\quad &\underset{u_1, v_1, b_1, \xi_2}{\min} \quad \frac{1}{2}\sum_{i \in I_1} (u_1^TX_iv_1 + b_1)^2 + c_1 \sum_{j \in I_2}\xi_{2j}^2 + c_2 (u_1^Tu_1+v_1^Tv_1+b_1^2) \\ &\text{subject to} \quad u_1^TX_jv_1 + b_1 = 1-\xi_{2j}, \quad j \in I_2, \end{aligned} \tag{1}$$

$$\begin{aligned} \text{(S-LSTWSTM 2)}\quad &\underset{u_2, v_2, b_2, \xi_1}{\min} \quad \frac{1}{2} \sum_{j \in I_2} (u_2^TX_jv_2 + b_2)^2 + c_1 \sum_{i \in I_1} \xi_{1i}^2 + c_2 (u_2^Tu_2+v_2^Tv_2+b_2^2) \\ &\text{subject to} \quad u_2^TX_iv_2 + b_2 = 1-\xi_{1i}, \quad i \in I_1, \end{aligned} \tag{2}$$

where $\xi_1$ and $\xi_2$ are error variables, and $e_1$ and $e_2$ denote vectors of ones of appropriate dimensions (used in the closed-form solutions below). The first term of Eqs. (1) and (2) measures the empirical risk of the data. Minimizing this term keeps the hyperplane close to the data matrices of its own class, while the constraints require the hyperplane to be at unit distance from the other class, up to the error variables. Further, S-LSTWSTM implements structural risk minimization (SRM) by introducing the term $c_2(u_k^Tu_k+v_k^Tv_k+b_k^2)$, $k=1,2$, in the objective function, which improves the generalization ability. This term also guards against the ill-conditioning that might otherwise arise during matrix inversion.

Proceeding along the lines of LS-TWSTM [5] for Eq. (1), setting the gradient of the objective function with respect to ($u_1$, $v_1$, $b_1$) to zero shows that $u_1$, $v_1$ and $b_1$ are interdependent and hence cannot be solved for independently. Therefore, we use the alternating projection method [6].

For any given non-zero vector $u_k \in \mathbb{R}^{n_1}$, let $x_i^T=u_k^TX_i$ and $x_j^T=u_k^TX_j$. We then solve the following modified optimization problem (obtained by substituting the value of $\xi_{2j}$ into the objective function):

$$\begin{aligned} \underset{v_k, b_k}{\min} \quad \frac{1}{2} \sum_{i \in I_1} (x_i^Tv_k + b_k)^2 + c_1 \sum_{j \in I_2} \left(1-(x_j^Tv_k + b_k)\right)^2 + c_2 (v_k^Tv_k+b_k^2). \end{aligned} \tag{3}$$

Setting the derivatives of (3) with respect to $v_k$ and $b_k$ to zero leads to the following system of linear equations:

$$\begin{aligned} \left[ \begin{array}{c} v_k \\ b_k \end{array}\right] =-\left[ \frac{1}{c_1}H_1^TH_1+G_1^TG_1+c_2I\right]^{-1}G_1^Te_2, \end{aligned} \tag{4}$$

where $H_1$ and $G_1$ are the matrices whose rows are the projected points $x_i^T$ ($i \in I_1$) and $x_j^T$ ($j \in I_2$), respectively, each augmented with a column of ones, and $I$ is an identity matrix of appropriate dimensions.
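As a rough illustration of this half-step, the sketch below minimises objective (3) directly for a fixed $u_k$ by solving the normal equations obtained from its gradient, $(H_1^TH_1 + 2c_1G_1^TG_1 + 2c_2I)z = 2c_1G_1^Te_2$ with $z=[v_k; b_k]$. The scaling of $c_1$, $c_2$ and the sign convention therefore follow (3) as written and need not coincide with those of Eq. (4). The function name `solve_v_step`, the array layout and the use of NumPy are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def solve_v_step(X_pos, X_neg, u, c1, c2):
    """Minimise objective (3) over (v, b) for a fixed vector u.

    X_pos: array of shape (m1, n1, n2), matrices indexed by I_1.
    X_neg: array of shape (m2, n1, n2), matrices indexed by I_2.
    Returns v of shape (n2,) and the scalar bias b.
    """
    # Row i of each projection is x_i^T = u^T X_i.
    P_pos = np.einsum("a,iab->ib", u, X_pos)
    P_neg = np.einsum("a,iab->ib", u, X_neg)

    # Augment the projected points with a column of ones (H_1 and G_1).
    H = np.hstack([P_pos, np.ones((len(P_pos), 1))])
    G = np.hstack([P_neg, np.ones((len(P_neg), 1))])
    e2 = np.ones(len(G))

    # Gradient of (3) set to zero:
    # (H^T H + 2 c1 G^T G + 2 c2 I) z = 2 c1 G^T e2, with z = [v; b].
    A = H.T @ H + 2.0 * c1 * (G.T @ G) + 2.0 * c2 * np.eye(H.shape[1])
    z = np.linalg.solve(A, 2.0 * c1 * (G.T @ e2))
    return z[:-1], z[-1]
```

Because $c_2 > 0$ makes the coefficient matrix positive definite, `np.linalg.solve` can be used instead of forming an explicit inverse.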

Once a non-zero vector $v_k \in \mathbb{R}^{n_2}$ is obtained, let $\hat{x}_i=X_iv_k$ and $\hat{x}_j=X_jv_k$. We then solve the following modified optimization problem:

$$\begin{aligned} \underset{u_k, b_k}{\min} \quad \frac{1}{2} \sum_{i \in I_1} (\hat{x}_i^Tu_k + b_k)^2 + c_1 \sum_{j \in I_2} \left(1-(\hat{x}_j^Tu_k + b_k)\right)^2 + c_2 (u_k^Tu_k+b_k^2). \end{aligned} \tag{5}$$

Proceeding as above, we obtain $(u_k, b_k)$ as follows:

$$\begin{aligned} \left[ \begin{array}{c} u_k \\ b_k \end{array}\right] =-\left[ \frac{1}{c_1}H_2^TH_2+G_2^TG_2+c_2I\right]^{-1}G_2^Te_2, \end{aligned} \tag{6}$$

where $H_2$ and $G_2$ are the matrices whose rows are the points $\hat{x}_i^T$ and $\hat{x}_j^T$, each augmented with a column of ones. Equations (4) and (6) are solved alternately until $u_k$, $v_k$ and $b_k$ converge.
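The alternation between the two half-steps can be sketched as follows. This is a minimal illustration under stated assumptions: each half-step minimises the corresponding objective, (3) or (5), directly (so constants and sign follow those objectives rather than the printed closed forms), the starting vector is all ones, and convergence is declared when $u_k$ stops changing. The names `ridge_half_step` and `fit_plane`, and the defaults for `c1`, `c2`, `max_iter` and `tol`, are hypothetical.

```python
import numpy as np


def ridge_half_step(P_pos, P_neg, c1, c2):
    """Solve one half-step given projected points (one row per sample)."""
    H = np.hstack([P_pos, np.ones((len(P_pos), 1))])
    G = np.hstack([P_neg, np.ones((len(P_neg), 1))])
    A = H.T @ H + 2.0 * c1 * (G.T @ G) + 2.0 * c2 * np.eye(H.shape[1])
    # G^T e2 is simply the vector of column sums of G.
    z = np.linalg.solve(A, 2.0 * c1 * G.sum(axis=0))
    return z[:-1], z[-1]


def fit_plane(X_pos, X_neg, c1=1.0, c2=0.1, max_iter=50, tol=1e-6):
    """Alternate between the v-step and the u-step for one hyperplane.

    X_pos, X_neg: arrays of shape (m, n1, n2). The plane is fitted to lie
    close to X_pos while taking values near 1 on X_neg.
    """
    u = np.ones(X_pos.shape[1])  # assumption: all-ones starting vector
    for _ in range(max_iter):
        # v-step: project every matrix onto u (rows are u^T X).
        v, _b = ridge_half_step(np.einsum("a,iab->ib", u, X_pos),
                                np.einsum("a,iab->ib", u, X_neg), c1, c2)
        # u-step: project every matrix onto v (rows are (X v)^T).
        u_new, b = ridge_half_step(np.einsum("iab,b->ia", X_pos, v),
                                   np.einsum("iab,b->ia", X_neg, v), c1, c2)
        change = np.linalg.norm(u_new - u)
        u = u_new
        if change < tol:
            break
    return u, v, b
```

Each half-step can only lower the regularised objective of (1), so the sequence of objective values is non-increasing; the second hyperplane is obtained by swapping the roles of the two classes.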

The solution of (2) is obtained along similar lines. As in LS-TWSTM [5], a new test point is assigned a class label based on a proximity criterion.

3.2 Tree-based Structural Least Squares Twin Support Tensor Clustering

The Tree-SLSTWSTC algorithm creates a binary tree of clusters, partitioning the data at successive levels of the tree until the desired number of clusters is obtained. Unlike TWSVC [4], Tree-SLSTWSTC uses a symmetric squared loss function at each internal node, which addresses the premature convergence of the clustering framework. The proposed Tree-SLSTWSTC algorithm starts with initial labels ($+1$, $-1$). Using the initial labels, the data $X$ with $m$ data matrices is divided into two clusters, A and B, of size $(n_1 \times n_2 \times m_1)$ and $(n_1 \times n_2 \times m_2)$ respectively (where $m = m_1 + m_2$). Each group is then partitioned further on its own, taking inter-cluster relationships into account, which yields more stable results in less time. Tree-SLSTWSTC is summarized in Algorithm 1.
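The tree-growing loop described above can be sketched schematically as follows. This is only an outline under assumptions: the text does not state which leaf is split next, so the sketch splits the largest leaf, and the node-level S-LSTWSTM is replaced by a caller-supplied `split` function. The names `tree_cluster` and `make_threshold_split` are hypothetical, and the threshold splitter is a toy stand-in rather than the classifier of Sect. 3.1.

```python
def tree_cluster(n_samples, k, split):
    """Grow a binary tree of clusters until k leaves exist.

    split(indices) must return two lists of indices. In the paper this
    role is played by an S-LSTWSTM fitted at the node; here it is a
    placeholder supplied by the caller.
    """
    leaves = [list(range(n_samples))]
    while len(leaves) < k:
        leaves.sort(key=len)
        node = leaves.pop()  # assumption: split the largest leaf next
        if len(node) < 2:
            leaves.append(node)
            break
        left, right = split(node)
        if not left or not right:  # degenerate split: stop growing
            leaves.append(node)
            break
        leaves += [left, right]

    labels = [0] * n_samples
    for cluster_id, leaf in enumerate(leaves):
        for i in leaf:
            labels[i] = cluster_id
    return labels


def make_threshold_split(values):
    """Toy stand-in splitter: threshold scalar values at the node mean."""
    def split(indices):
        mean = sum(values[i] for i in indices) / len(indices)
        left = [i for i in indices if values[i] <= mean]
        right = [i for i in indices if values[i] > mean]
        return left, right
    return split
```

For $k$ clusters the loop performs at most $k-1$ binary splits, each on a subset of the data.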

3.3 Initialization

In conventional plane-based clustering, the initial cluster labels are obtained by randomization, which is a highly unstable and inefficient technique. Here, we propose a novel tensor-based initialization algorithm that uses the Frobenius norm to measure the distance between two order-2 tensors (matrices). For example, the distance between two data points $x^\alpha =x(n_1,n_2,1)$ and $x^\beta =x(n_1,n_2,2)$ is calculated as

$$\begin{aligned} d(x^\alpha, x^\beta)=\sqrt{\sum_{i=1}^{n_1}\sum_{j=1}^{n_2}(x^\alpha_{ij}-x^\beta_{ij})^2}. \end{aligned} \tag{7}$$
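Equation (7) translates directly into code. The following minimal sketch assumes matrices are given as lists of rows of equal shape; the function name `frobenius_distance` is hypothetical.

```python
import math


def frobenius_distance(A, B):
    """Frobenius distance between two equally sized matrices, as in Eq. (7)."""
    return math.sqrt(sum((a - b) ** 2
                         for row_a, row_b in zip(A, B)
                         for a, b in zip(row_a, row_b)))
```

For instance, for $A=\begin{bmatrix}1&2\\3&4\end{bmatrix}$ and $B=\begin{bmatrix}1&0\\0&4\end{bmatrix}$ the distance is $\sqrt{2^2+3^2}=\sqrt{13}$.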

We have implemented Tensor k-means (Tk-means), which takes tensor data as input and returns the corresponding cluster labels in the spirit of the vector-based k-means algorithm. As in traditional k-means, an iterative relocation algorithm is followed, which locally minimizes the mean squared error. At each iteration, the centroid of each cluster is updated, and the process is repeated until the labels converge, i.e. no further change in the labels is detected.
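A minimal sketch of such an iterative relocation scheme on matrix data is given below. It is not the authors' implementation: the text does not specify how the initial centroids are chosen, so the sketch assumes they are sampled from the data, and it assumes that an empty cluster keeps its previous centroid. The function name `tensor_kmeans` and its parameters are hypothetical.

```python
import numpy as np


def tensor_kmeans(X, k, max_iter=100, seed=0):
    """k-means on matrices using the Frobenius distance of Eq. (7).

    X: array of shape (m, n1, n2). Returns labels of shape (m,) and
    centroids of shape (k, n1, n2).
    """
    rng = np.random.default_rng(seed)
    # Assumption: initial centroids are k distinct data matrices.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # Frobenius distance from every sample to every centroid.
        diff = X[:, None, :, :] - centroids[None, :, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=(2, 3)))
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # labels converged: no assignment changed
        labels = new_labels
        for c in range(k):
            members = X[labels == c]
            if len(members):  # an empty cluster keeps its old centroid
                centroids[c] = members.mean(axis=0)
    return labels, centroids
```

With $k=2$, the two resulting labels can be mapped to $+1$ and $-1$ to serve as initial labels at a tree node.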

4 Experimental Results

To evaluate the performance of the proposed method, experiments were carried out on image datasets from face recognition and optical digit recognition systems. To assess the competence of the proposed work, we used Metric accuracy [4] and learning time as the performance criteria.
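For orientation, the sketch below computes a pair-counting agreement score. It assumes that Metric accuracy denotes the fraction of sample pairs on which the predicted and true clusterings agree about whether the two samples share a cluster (equivalent to the Rand index); the precise definition should be checked against [4]. The function name `pair_agreement_accuracy` is hypothetical.

```python
from itertools import combinations


def pair_agreement_accuracy(true_labels, pred_labels):
    """Fraction of sample pairs on which two clusterings agree.

    Assumption: this pair-counting score is what 'Metric accuracy' refers
    to. It is invariant to how the cluster labels are named.
    """
    pairs = list(combinations(range(len(true_labels)), 2))
    agree = sum(
        (true_labels[i] == true_labels[j]) == (pred_labels[i] == pred_labels[j])
        for i, j in pairs
    )
    return agree / len(pairs)
```

A score of this kind needs no matching between predicted and true cluster identifiers, which is convenient when clusters come out of a tree in arbitrary order.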

To compare the proposed approach with other algorithms, we implemented conventional k-means and the k-nearest neighbour graph (NNG) algorithm in the tensor framework. Further, to minimize the effect of randomization (in k-means) and of the value of k (in NNG), the experiments were performed multiple times, and the best results are reported.

Table 1. Clustering results on face recognition and optical digit recognition application

Table 1 summarizes the results of the experiments on the above-mentioned datasets. Tree-SLSTWSTC with k-means-based initialization outperforms the other methods in terms of both clustering performance and learning time. Table 1 also lists clustering results obtained by other approaches. The prediction accuracy of Tree-SLSTWSTC is significantly better than that of these methods, which, it should be noted, use a vector-based representation for clustering.

5 Conclusions

Building on the recently proposed LS-TWSTM, in this paper we have proposed a novel tree-based tensor clustering algorithm, namely Tree-based Structural Least Squares Twin Support Tensor Clustering (Tree-SLSTWSTC), which can deal directly with real-world matrix data (second-order tensor space), resulting in improved generalization and reduced computational complexity. Moreover, it also handles the premature convergence problem, as it considers the structural risk associated with the data. For initializing cluster labels, we have proposed a Tensor k-means algorithm, which helps to overcome the instability incurred by random initialization. Experimental comparisons of the proposed approach against other related approaches on face recognition and handwritten digit image datasets establish the suitability of the proposed algorithms for dealing with tensor-based data directly (as direct image input).

In future, the application of the proposed approach to more challenging real-world problems in higher-order tensor spaces, such as image segmentation and computer vision, can be explored.

References

1. Khemchandani, R., Pal, A., Chandra, S.: Fuzzy least squares twin support vector clustering. Neural Comput. Appl. 1, 1–11 (2016)
2. Cai, D., He, X., Wen, J.R., Han, J., Ma, W.Y.: Support tensor machines for text categorization (2006)
3. Zhang, X., Gao, X., Wang, Y.: Twin support tensor machines for MCs detection. J. Electron. (China) 26(3), 318–325 (2009)
4. Wang, Z., Shao, Y.H., Bai, L., Deng, N.Y.: Twin support vector machine for clustering. IEEE Trans. Neural Netw. Learn. Syst. 26(10), 2583–2588 (2015)
5. Zhao, X., Shi, H., Lv, M., Jing, L.: Least squares twin support tensor machine for classification. J. Inf. Comput. Sci. 11(12), 4175–4189 (2014)
6. Tao, D., Li, X., Wu, X., Hu, W., Maybank, S.J.: Supervised tensor learning. Knowl. Inf. Syst. 13(1), 1–42 (2007)
7. Jayadeva, Khemchandani, R., Chandra, S.: Twin support vector machines for pattern classification. IEEE Trans. Pattern Anal. Mach. Intell. 29(5), 905–910 (2007)
8. Wang, X., Yang, C., Zhou, J.: Clustering aggregation by probability accumulation. Pattern Recogn. 42(5), 668–675 (2009)
9. Peng, Y., Zheng, W.L., Lu, B.L.: An unsupervised discriminative extreme learning machine and its applications to data clustering. Neurocomputing 174, 250–264 (2016)

Author information

Authors and Affiliations

Department of Computer Science, Faculty of Mathematics and Computer Science, South Asian University, New Delhi, India
Reshma Rastogi & Sweta Sharma

Corresponding author

Correspondence to Reshma Rastogi.

Editor information

Editors and Affiliations

Indian Statistical Institute, Kolkata, India
B. Uma Shankar
Indian Statistical Institute, Kolkata, India
Kuntal Ghosh
Indian Statistical Institute, Kolkata, India
Deba Prasad Mandal
Indian Statistical Institute, Kolkata, India
Shubhra Sankar Ray
The Hong Kong Polytechnic University, Hong Kong, China
David Zhang
Indian Statistical Institute, Kolkata, India
Sankar K. Pal

Cite this paper

Rastogi, R., Sharma, S. (2017). Tree-Based Structural Twin Support Tensor Clustering with Square Loss Function. In: Shankar, B., Ghosh, K., Mandal, D., Ray, S., Zhang, D., Pal, S. (eds) Pattern Recognition and Machine Intelligence. PReMI 2017. Lecture Notes in Computer Science, vol 10597. Springer, Cham. https://doi.org/10.1007/978-3-319-69900-4_4

DOI: https://doi.org/10.1007/978-3-319-69900-4_4
Published: 01 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69899-1
Online ISBN: 978-3-319-69900-4
eBook Packages: Computer Science (R0)
