
1 Introduction

The general goal of data analysis is to extract previously unknown information from a given dataset. Many data analysis tasks, such as pattern recognition, classification, clustering, and prognosis, deal with real-world data presented in high-dimensional spaces, and the ‘curse of dimensionality’ is often an obstacle to applying many methods to these tasks.

Fortunately, in many applications, especially in pattern recognition, real high-dimensional data occupy only a very small part of the high-dimensional ‘observation space’ Rp; that is, the intrinsic dimension q of the data is small compared to the dimension p (usually q << p) [1, 2]. Various dimensionality reduction (feature extraction) algorithms, whose goal is to find a low-dimensional parameterization of such high-dimensional data, transform the data into low-dimensional representations (features) while preserving certain chosen, subject-driven data properties [3, 4].

The most popular model for high-dimensional data occupying a small part of the observation space Rp is the Manifold model, according to which the data lie on or near an unknown Data manifold (DM) of known lower dimensionality q < p embedded in the ambient high-dimensional space Rp (the Manifold assumption about high-dimensional data [5]). Typically, this assumption is satisfied for ‘real-world’ high-dimensional data obtained from ‘natural’ sources.

Dimensionality reduction under the manifold assumption about the processed data is usually referred to as Manifold learning [6, 7], whose goal is to construct a low-dimensional parameterization of the DM (global low-dimensional coordinates on the DM) from a finite dataset sampled from the DM. This parameterization produces an Embedding mapping from the DM to a low-dimensional Feature space that should preserve specific properties of the DM determined by a chosen optimized cost function, which defines an ‘evaluation measure’ for the dimensionality reduction and reflects the desired properties of the initial data that should be preserved in their features.

Most manifold learning algorithms include the solution of large-dimensional global optimization problems and, thus, are computationally expensive. Incremental versions of many popular algorithms (Locally Linear Embedding, Isomap, Laplacian Eigenmaps, Local Tangent Space Alignment, Hessian Eigenmaps, etc. [6, 7]), which reduce their computational complexity, have been developed [8–17].

Manifold learning algorithms are usually used as a first key step in the solution of machine learning tasks: the low-dimensional features are used in reduced learning procedures instead of the initial high-dimensional data, avoiding the curse of dimensionality [18]: ‘dimensionality reduction may be necessary in order to discard redundancy and reduce the computational cost of further operations’ [19]. If the low-dimensional features preserve only specific properties of the data, then substantial data losses are possible when the features are used instead of the initial data. To prevent these losses, the features should preserve as much as possible of the information contained in the high-dimensional data [20]; this means the possibility of recovering the initial data from their features with small reconstruction error. Such Manifold reconstruction algorithms result in both the parameterization and the recovery of the unknown DM [21].

Mathematically [22], ‘preserving the important information of the DM’ means that manifold learning algorithms should ‘recover the geometry’ of the DM, and ‘the information necessary for reconstructing the geometry of the manifold is embodied in its Riemannian metric (tensor)’ [23]. Thus, the learning algorithms should accurately recover the Riemannian data manifold, that is, the DM equipped with the Riemannian tensor.

A further requirement on the recovery follows from the necessity of providing a good generalization capability of manifold reconstruction algorithms and preserving the local structure of the DM: the algorithms should preserve the differential structure of the DM, providing proximity between the tangent spaces to the DM and to the Recovered data manifold (RDM) [24]. In Manifold theory [23, 25], the set composed of the manifold points equipped with the tangent spaces at these points is called the Tangent bundle of the manifold; thus, a reconstruction of the DM that also ensures accurate reconstruction of its tangent spaces is referred to as Tangent bundle manifold learning.

The earlier proposed, geometrically motivated Grassmann&Stiefel Eigenmaps algorithm (GSE) [24, 26] solves the Tangent bundle manifold learning problem and recovers the Riemannian tensor of the DM; thus, it solves the Riemannian manifold recovery problem.

The GSE, like most manifold learning algorithms, includes the solution of large-dimensional global optimization problems and, thus, is computationally expensive.

In this paper, we propose an incremental version of the GSE that reduces the solution of the computationally expensive global optimization problems to the solution of a sequence of local optimization problems solved in explicit form.

The rest of the paper is organized as follows. Section 2 contains a strict definition of the Tangent bundle manifold learning problem and describes the main ideas realized in its GSE solution. The proposed incremental version of the GSE is presented in Sect. 3.

2 Tangent Bundle Manifold Learning

2.1 Definitions and Assumptions

Consider an unknown q-dimensional Data manifold with known intrinsic dimension q

$$ \mathbf{M} = \{X = g(y) \in R^p: y \in \mathbf{Y} \subset R^q\} $$

covered by a single chart g and embedded in an ambient p-dimensional space Rp, q < p. The chart g is a one-to-one mapping from an open bounded Coordinate space \( \mathbf{Y} \subset R^q \) to the manifold M = g(Y) with differentiable inverse mapping hg(X) = g−1(X), whose values y = hg(X) ∈ Y give low-dimensional coordinates (representations, features) of the high-dimensional manifold-valued data X.

If the mappings hg(X) and g(y) are differentiable and Jg(y) is the p × q Jacobian matrix of the mapping g(y), then the q-dimensional linear space L(X) = Span(Jg(hg(X))) in Rp is the tangent space to the DM M at the point X ∈ M; hereinafter, Span(H) is the linear space spanned by the columns of an arbitrary matrix H.

The tangent spaces can be considered as elements of the Grassmann manifold Grass(p, q) consisting of all q-dimensional linear subspaces in Rp.

The standard inner product in Rp induces an inner product on the tangent space L(X) that defines the Riemannian metric (tensor) Δ(X) at each manifold point X ∈ M, smoothly varying from point to point; thus, the DM M is a Riemannian manifold (M, Δ).

Let \( \mathbf{X}_n = \{X_1, X_2, \ldots, X_n\} \) be a dataset randomly sampled from the DM M according to a certain (unknown) probability measure whose support coincides with M.

2.2 Tangent Bundle Manifold Learning Definition

The conventional manifold learning problem, usually called the Manifold embedding problem [6, 7], is to construct a low-dimensional parameterization of the DM from the given sample X n; it produces an Embedding mapping \( h: \mathbf{M} \subset R^p \to \mathbf{Y}_h = h(\mathbf{M}) \subset R^q \) from the DM M to the Feature space (FS) Y h ⊂ Rq, q < p, which preserves specific chosen properties of the DM.

A Manifold reconstruction algorithm, which additionally provides a possibility of accurate recovery of the original vectors X from their low-dimensional features y = h(X), includes constructing a Recovering mapping g(y) from the FS Y h to the Euclidean space Rp in such a way that the pair (h, g) ensures the approximate equalities

$$ r_{h,g}(X) \equiv g(h(X)) \approx X \quad \text{for all points } X \in \mathbf{M}. $$
(1)

The mappings (h, g) determine q-dimensional Recovered data manifold

$$ \mathbf{M}_{h,g} = r_{h,g}(\mathbf{M}) = \{r_{h,g}(X) \in R^p: X \in \mathbf{M}\} = \{X = g(y) \in R^p: y \in \mathbf{Y}_h \subset R^q\} $$
(2)

which is embedded in the ambient space Rp, covered by the single chart g, and consists of all recovered values rh,g(X) of manifold points X ∈ M. The proximities (1) imply the manifold proximity M h,g ≈ M, meaning a small Hausdorff distance dH(M h,g, M) between the DM M and the RDM M h,g due to the inequality \( d_H(\mathbf{M}_{h,g}, \mathbf{M}) \le \sup_{X \in \mathbf{M}} |r_{h,g}(X) - X| \).

Let G(y) = Jg(y) be the p × q Jacobian matrix of the mapping g(y), which determines the q-dimensional tangent space Lh,g(X) to the RDM M h,g at the point rh,g(X) ∈ M h,g:

$$ L_{h,g}(X) = \mathrm{Span}(G(h(X))) $$
(3)

The Tangent bundle manifold learning problem is to construct the pair (h, g) of mappings h and g from the given sample X n ensuring both the proximities (1) and the proximities

$$ L_{h,g}(X) \approx L(X) \quad \text{for all points } X \in \mathbf{M}; $$
(4)

the proximities (4) are defined with the use of a certain chosen metric on Grass(p, q).

The matrix G(y) also determines the metric tensor \( \Delta_{h,g}(X) = G^T(h(X)) \times G(h(X)) \) on the RDM M h,g, which is the q × q matrix consisting of the inner products {(Gi(h(X)), Gj(h(X)))} between the ith and jth columns Gi(h(X)) and Gj(h(X)) of the matrix G(h(X)). Thus, the pair (h, g) determines the Recovered Riemannian manifold (M h,g, Δh,g) that accurately approximates the initial Riemannian data manifold (M, Δ).
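As a small illustration of the last point, the recovered metric tensor is simply the Gram matrix of the Jacobian columns. The following minimal Python sketch (the function name and the assumption that a p × q Jacobian G is available are ours, not from the paper) computes Δh,g(X) = GT(h(X)) × G(h(X)):

```python
import numpy as np

def recovered_metric_tensor(G: np.ndarray) -> np.ndarray:
    """Return the q x q metric tensor Delta_{h,g}(X) = G^T G, i.e. the Gram
    matrix of inner products between the columns of the Jacobian G(h(X))."""
    return G.T @ G

# Usage (illustrative): if G has shape (p, q), recovered_metric_tensor(G) has shape (q, q).
```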

2.3 Grassmann&Stiefel Eigenmaps: An Approach

The Grassmann&Stiefel Eigenmaps algorithm gives a solution to the Tangent bundle manifold learning problem and consists of three successively performed parts: Tangent manifold learning, Manifold embedding, and Manifold recovery.

Tangent Manifold Learning Part.

A sample-based family H consisting of p × q matrices H(X) smoothly depending on X ∈ M is constructed to meet the relations

$$ L_H(X) \equiv \mathrm{Span}(H(X)) \approx L(X) \quad \text{for all } X \in \mathbf{M} $$
(5)

in a certain chosen metric on the Grassmann manifold. In the next steps, the mappings h and g will be built in such a way that both the equalities (1) and

$$ G(h(X)) \approx H(X) \quad \text{for all points } X \in \mathbf{M} $$
(6)

are fulfilled. Hence, the linear space LH(X) (5) approximates the tangent space Lh,g(X) (3) to the RDM M h,g at the point rh,g(X).

Manifold Embedding Part.

Given the family H already constructed, the embedding mapping y = h(X) is constructed as follows. The Taylor series expansions

$$ g(h(X')) - g(h(X)) \approx G(h(X)) \times (h(X') - h(X)) $$
(7)

of the mapping g at near points h(X′), h(X) ∈ Y h, under the desired approximate equalities (1) and (6) for the mappings h and g to be specified further, imply the equalities:

$$ X' - X \approx H(X) \times (h(X') - h(X)) $$
(8)

for near points X, X′ ∈ M. These equations, considered further as regression equations, allow constructing the embedding mapping h and the FS Y h = h(M).

Manifold Recovery Part.

Given the family H and the mapping h(X) already constructed, the expansion (7), under the desired proximities (1) and (6), implies the relation

$$ g(y) \approx X + H(X) \times (y - h(X)) $$
(9)

for near points y, h(X) ∈ Y h, which is used for constructing the mapping g.

2.4 Grassmann&Stiefel Eigenmaps: Some Details

Details of the GSE are presented below. The numbers {εi > 0} denote the algorithm’s parameters, whose values are chosen depending on the sample size n (εi = εi,n) and tend to zero as n → ∞ at the rate O(n−1/(q+2)).

Step S1: Neighborhoods (Construction and Description).

The necessary preliminary calculations are performed at the first step S1.

Euclidean Kernel.

Introduce the Euclidean kernel KE(X, X′) = I{|X′ – X| < ε1} on the DM at points X, X′ ∈ M; here I{·} is the indicator function.

Grassmann Kernel.

Applying Principal Component Analysis (PCA) [27] to the points from the set Un(X, ε1) = {X′ ∈ X n: |X′ – X| < ε1} ∪ {X} results in a p × q orthogonal matrix QPCA(X) whose columns are the PCA principal eigenvectors corresponding to the q largest PCA eigenvalues. These matrices determine q-dimensional linear spaces LPCA(X) = Span(QPCA(X)) in Rp which, under certain conditions, approximate the tangent spaces L(X):

$$ L_{\mathrm{PCA}}(X) \approx L(X). $$
(10)

In what follows, we assume that the sample size n is large enough to ensure a positive value of the qth PCA eigenvalue at the sample points and to provide the proximities (10). To provide a trade-off between the ‘statistical error’ (depending on the number n(X) of sample points in the set Un(X, ε1)) and the ‘curvature error’ (caused by the deviation of the manifold-valued points from the linear space assumed in the PCA) in (10), the ball radius ε1 should tend to 0 as n → ∞ at the rate O(n−1/(q+2)), providing, with high probability, the order O(n−1/(q+2)) for the error in (10) [28, 29]; here ‘an event occurs with high probability’ means that its probability exceeds the value (1 – Cα/nα) for any n and α > 0, where the constant Cα depends only on α.
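A minimal sketch of this local PCA step is given below; it assumes the sample is stored as an (n × p) array and uses an eigendecomposition of the local covariance matrix. The function and variable names are illustrative, not taken from any reference implementation, and centering at the local mean is one common choice rather than a prescription of the paper.

```python
import numpy as np

def local_pca_basis(X_n: np.ndarray, X: np.ndarray, eps1: float, q: int) -> np.ndarray:
    """Return a p x q orthonormal matrix Q_PCA(X) spanning the PCA-estimated
    tangent space L_PCA(X), built from the sample points in the eps1-ball around X."""
    U = X_n[np.linalg.norm(X_n - X, axis=1) < eps1]   # points of U_n(X, eps1)
    U = np.vstack([U, X[None, :]])                    # include X itself
    Uc = U - U.mean(axis=0)                           # local centering (one common choice)
    C = Uc.T @ Uc                                     # p x p local covariance (unnormalized)
    eigval, eigvec = np.linalg.eigh(C)                # eigenvalues in ascending order
    return eigvec[:, ::-1][:, :q]                     # q principal eigenvectors
```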

The Grassmann kernel KG(X, X′) on the DM at points X, X′ ∈ M is defined as

$$ K_G(X, X') = I\{d_{BC}(L_{\mathrm{PCA}}(X), L_{\mathrm{PCA}}(X')) < \varepsilon_2\} \times K_{BC}(L_{\mathrm{PCA}}(X), L_{\mathrm{PCA}}(X')) $$

using the Binet-Cauchy kernel KBC(LPCA(X), LPCA(X′)) = Det2[S(X, X′)] and the Binet-Cauchy metric dBC(LPCA(X), LPCA(X′)) = {1 − Det2[S(X, X′)]}1/2 on the Grassmann manifold Grass(p, q) [30, 31]; here S(X, X′) = \( Q_{\mathrm{PCA}}^T(X) \times Q_{\mathrm{PCA}}(X') \).

The p × p matrix \( \pi_{\mathrm{PCA}}(X) = Q_{\mathrm{PCA}}(X) \times Q_{\mathrm{PCA}}^T(X) \) is the orthogonal projector onto the linear space LPCA(X); it approximates the projection matrix π(X) onto the tangent space L(X).

Aggregate Kernel.

Introduce the kernel K(X, X′) = KE(X, X′) × KG(X, X′) as the product of the Euclidean and Grassmann kernels; it reflects not only the geometric nearness between the points X and X′ but also the nearness between the linear spaces LPCA(X) and LPCA(X′) (and thus, by (10), the nearness between the tangent spaces L(X) and L(X′)).
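The three kernels can be sketched in a few lines of Python; here Q1 and Q2 stand for the matrices QPCA(X) and QPCA(X′) produced by the local PCA step, and the function names are illustrative.

```python
import numpy as np

def euclidean_kernel(X, X2, eps1):
    """K_E(X, X') = I{|X' - X| < eps1}."""
    return float(np.linalg.norm(X2 - X) < eps1)

def grassmann_kernel(Q1, Q2, eps2):
    """Binet-Cauchy kernel K_BC = det^2(Q1^T Q2), thresholded by the
    Binet-Cauchy metric d_BC = sqrt(1 - det^2(Q1^T Q2))."""
    k_bc = np.linalg.det(Q1.T @ Q2) ** 2
    d_bc = np.sqrt(max(1.0 - k_bc, 0.0))
    return k_bc if d_bc < eps2 else 0.0

def aggregate_kernel(X, X2, Q1, Q2, eps1, eps2):
    """K(X, X') = K_E(X, X') * K_G(X, X')."""
    return euclidean_kernel(X, X2, eps1) * grassmann_kernel(Q1, Q2, eps2)
```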

Step S2: Tangent Manifold Learning.

The matrices H(X) will be constructed to meet the equalities LH(X) = LPCA(X) for all points X ∈ M, which implies the representation

$$ H(X) = Q_{\mathrm{PCA}}(X) \times v(X), $$
(11)

in which the q × q matrices v(X) should provide a smooth dependence of H(X) on the point X.

At first, the p × q matrices {Hi = QPCA(Xi) × vi} are constructed to minimize the form

$$ \Delta_{H,n} = \frac{1}{2}\sum\nolimits_{i,j=1}^{n} K(X_i, X_j) \times \|H_i - H_j\|_F^2 $$
(12)

over the q × q matrices v1, v2, …, vn, under the normalizing constraint

$$ \sum\nolimits_{i=1}^{n} K(X_i) \times (H_i^T \times H_i) = \sum\nolimits_{i=1}^{n} K(X_i) \times (v_i^T \times v_i) = K \times I_q $$
(13)

used to avoid a degenerate solution; here \( K(X) = \sum\nolimits_{j=1}^{n} K(X, X_j) \) and \( K = \sum\nolimits_{i=1}^{n} K(X_i) \).

The quadratic form (12) and the constraint (13) take the forms (K – Tr(VT × Φ × V)) and VT × F × V = K × Iq, respectively; here V is the (nq) × q matrix whose transpose consists of the consecutively written transposed q × q matrices v1, v2, …, vn, and Φ = ||Φij|| and F = ||Fij|| are nq × nq matrices consisting, respectively, of the q × q matrices

$$ \{\Phi_{ij} = K(X_i, X_j) \times S(X_i, X_j)\} \text{ and } \{F_{ij} = \delta_{ij} \times K(X_i) \times I_q\}. $$

Thus, the minimization of (12) under (13) is reduced to the generalized eigenvector problem

$$ \Phi \times V = \lambda \times F \times V, $$
(14)

and the (nq) × q matrix V, whose columns V1, V2, …, Vq ∈ Rnq are the orthonormal eigenvectors corresponding to the q largest eigenvalues in the problem (14), determines the required q × q matrices v1, v2, …, vn.
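A minimal sketch of this reduction is shown below, assuming the kernel matrix K[i, j] = K(Xi, Xj) and the blocks S[i][j] = S(Xi, Xj) have been precomputed; the dense construction, the function name, and the SciPy call are for illustration only, and the rescaling of the eigenvectors needed to satisfy (13) exactly is omitted.

```python
import numpy as np
from scipy.linalg import eigh

def solve_tangent_alignment(K: np.ndarray, S: list, q: int) -> list:
    """Build the nq x nq matrices Phi and F of problem (14) and return the
    q x q matrices v_1, ..., v_n taken from the top-q generalized eigenvectors."""
    n = K.shape[0]
    Kdiag = K.sum(axis=1)                              # K(X_i)
    Phi = np.zeros((n * q, n * q))
    F = np.zeros((n * q, n * q))
    for i in range(n):
        F[i*q:(i+1)*q, i*q:(i+1)*q] = Kdiag[i] * np.eye(q)
        for j in range(n):
            Phi[i*q:(i+1)*q, j*q:(j+1)*q] = K[i, j] * S[i][j]
    # eigh solves Phi v = lambda F v (Phi symmetric, F positive definite),
    # returning eigenvalues in ascending order; keep the q largest.
    w, V = eigh(Phi, F)
    V_top = V[:, -q:]
    return [V_top[i*q:(i+1)*q, :] for i in range(n)]   # v_1, ..., v_n
```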

The value H(X) (11) at an arbitrary point X ∈ M is chosen to minimize the form

$$ d_{H,n}(H) = \sum\nolimits_{j=1}^{n} K(X, X_j) \times \|Q_{\mathrm{PCA}}(X) \times v(X) - H_j\|_F^2 $$
(15)

over v(X) under the condition Span(H) = LPCA(X), whose solution is

$$ H(X) = Q_{\mathrm{PCA}}(X) \times v(X) = Q_{\mathrm{PCA}}(X) \times \frac{1}{K(X)}\sum\nolimits_{j=1}^{n} K(X, X_j) \times S(X, X_j) \times v_j. $$
(16)

It follows from the above formulas that the q × p matrix

$$ G_h(X) = H^{-}(X) \times \pi_{\mathrm{PCA}}(X) = v^{-1}(X) \times Q_{\mathrm{PCA}}^{T}(X) $$

estimates the Jacobian matrix Jh(X) of the Embedding mapping h(X) constructed afterwards; here \( H^{-}(X) \) is the q × p Moore-Penrose pseudoinverse of the p × q matrix H(X) [32].
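The out-of-sample formula (16) and the Jacobian estimate Gh(X) can be sketched as follows, reusing the illustrative helpers local_pca_basis and aggregate_kernel from the sketches above and assuming the lists Qs = {QPCA(Xi)} and vs = {vi} are already available; the function name is ours.

```python
import numpy as np

def H_and_Gh(X, X_n, Qs, vs, eps1, eps2, q):
    """Return H(X) by formula (16) and the Jacobian estimate G_h(X) = v^{-1}(X) Q_PCA^T(X)."""
    Q = local_pca_basis(X_n, X, eps1, q)                # Q_PCA(X)
    num, den = np.zeros((q, q)), 0.0
    for Xj, Qj, vj in zip(X_n, Qs, vs):
        k = aggregate_kernel(X, Xj, Q, Qj, eps1, eps2)  # K(X, X_j)
        num += k * (Q.T @ Qj) @ vj                      # K(X, X_j) * S(X, X_j) * v_j
        den += k
    v = num / den                                       # v(X); assumes den > 0
    H = Q @ v                                           # H(X), formula (16)
    Gh = np.linalg.inv(v) @ Q.T                         # G_h(X)
    return H, Gh
```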

Step S3: Manifold Embedding.

The embedding mapping h(X) with already known (estimated) Jacobian Gh(X) is constructed to meet the equalities (8) written for all pairs of near points X, X′ ∈ M, which can be considered as regression equations.

At first, the vector set {h1, h2, …, hn} ⊂ Rq is computed as the standard least squares solution of this regression problem by minimizing the residual

$$ \Delta_{h,n} = \sum\nolimits_{i,j=1}^{n} K(X_i, X_j) \times |X_j - X_i - H_i \times (h_j - h_i)|^2 $$
(17)

over the vectors h1, h2, …, hn under the normalizing condition h1 + h2 + … + hn = 0.
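A minimal (and deliberately naive, dense) sketch of this weighted least-squares problem is given below; it stacks one regression equation per kernel-connected pair and appends the centering condition as extra rows, which the least-squares solution satisfies exactly because the objective (17) is invariant to a common shift of all hi. The function name and the dense design matrix are illustrative only, not the efficient normal-equation solution used in practice.

```python
import numpy as np

def embed_sample(X_n: np.ndarray, Hs: list, K: np.ndarray, q: int) -> np.ndarray:
    """Return the n x q array of features h_1, ..., h_n minimizing (17)
    under h_1 + ... + h_n = 0."""
    n, p = X_n.shape
    rows, rhs = [], []
    for i in range(n):
        for j in range(n):
            if i != j and K[i, j] > 0:
                w = np.sqrt(K[i, j])
                A = np.zeros((p, n * q))
                A[:, j*q:(j+1)*q] = w * Hs[i]           # + H_i h_j
                A[:, i*q:(i+1)*q] = -w * Hs[i]          # - H_i h_i
                rows.append(A)
                rhs.append(w * (X_n[j] - X_n[i]))
    rows.append(np.tile(np.eye(q), (1, n)))             # centering: sum_i h_i = 0
    rhs.append(np.zeros(q))
    h, *_ = np.linalg.lstsq(np.vstack(rows), np.concatenate(rhs), rcond=None)
    return h.reshape(n, q)
```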

Then, considering the obtained vectors {hj} as preliminary values of the mapping h(X) at the sample points, choose the value

$$ h(X) = \frac{1}{K(X)}\sum\nolimits_{i=1}^{n} K(X, X_i) \times \{h_i + G_h(X) \times (X - X_i)\} $$
(18)

for an arbitrary point X ∈ M as the result of minimizing over h the residual

$$ d_{h,n}(h) = \sum\nolimits_{j=1}^{n} K(X, X_j) \times |X_j - X - H(X) \times (h_j - h)|^2. $$
(19)

The mapping (18) determines the Feature sample Y h,n = {yh,i = h(Xi), i = 1, 2, …, n}.
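The out-of-sample embedding (18) then becomes a kernel-weighted average, sketched below with the illustrative helpers from the previous sketches (H_and_Gh, local_pca_basis, aggregate_kernel) and the sample features hs = {hi}; names are ours.

```python
import numpy as np

def embed_point(X, X_n, Qs, vs, hs, eps1, eps2, q):
    """Return h(X) by formula (18)."""
    _, Gh = H_and_Gh(X, X_n, Qs, vs, eps1, eps2, q)     # estimated Jacobian G_h(X)
    Q = local_pca_basis(X_n, X, eps1, q)                # Q_PCA(X) for the kernel values
    num, den = np.zeros(q), 0.0
    for Xi, Qi, hi in zip(X_n, Qs, hs):
        k = aggregate_kernel(X, Xi, Q, Qi, eps1, eps2)  # K(X, X_i)
        num += k * (hi + Gh @ (X - Xi))
        den += k
    return num / den                                    # assumes den > 0
```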

Step S4: Manifold Recovery.

A kernel on the FS Y h and, then, the recovering mapping g(y) and its Jacobian matrix G(y) are constructed in this step.

Kernel on the Feature Space.

It follows from (8) that proximities

$$ |X - X_i| \approx d(y, y_{h,i}) = \{(y - y_{h,i})^T \times [H^T(X_i) \times H(X_i)] \times (y - y_{h,i})\}^{1/2} $$

hold true for near points y = h(X) and yh,i ∈ Y h,n. Let uE(y, ε1) = {yh,i: d(y, yh,i) < ε1} be a neighborhood of the feature y = h(X) consisting of the sample features that are images of the sample points from Un(X, ε1).

Applying the PCA to the set h−1(uE(y, ε1)) = {Xi: yh,i ∈ uE(y, ε1)} results in the linear space LPCA*(y) ∈ Grass(p, q), which meets the proximity LPCA*(h(X)) ≈ LPCA(X).

Introduce the feature kernel k(y, yh,i) = I{yh,i ∈ uE(y, ε1)} × KG(LPCA*(y), LPCA*(yh,i)), which meets the equalities k(h(X), h(X′)) ≈ K(X, X′) for near points X ∈ M and X′ ∈ X n.

Constructing the Recovering Mapping and its Jacobian.

The matrix G(y), which should meet both the conditions (6) and the constraint Span(G(y)) = LPCA*(y), is chosen by minimizing the quadratic form \( \sum\nolimits_{j=1}^{n} k(y, y_{h,j}) \times \|G(y) - H_j\|_F^2 \) over G; this results in

$$ G(y) = \pi^{*}(y) \times \frac{1}{k(y)}\sum\nolimits_{j=1}^{n} k(y, y_{h,j}) \times H_j, $$
(20)

here π*(y) is the projector onto the linear space LPCA*(y) and \( k(y) = \sum\nolimits_{j=1}^{n} k(y, y_{h,j}) \).

Based on the expansions (9) written for the features yh,j ∈ uE(y, ε1), g(y) is chosen by minimizing the quadratic form \( \sum\nolimits_{j=1}^{n} k(y, y_{h,j}) \times |X_j - g(y) - G(y) \times (y_{h,j} - y)|^2 \) over g; thus

$$ g(y) = \frac{1}{k(y)}\sum\nolimits_{j=1}^{n} k(y, y_{h,j}) \times \{X_j + G(y) \times (y - y_{h,j})\}. $$
(21)

The constructed mappings (18), (21) allow recovering the DM M and its tangent spaces L(X) by the formulas (2) and (4).
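A minimal sketch of formulas (20) and (21) is given below, assuming the feature-kernel values k[j] = k(y, yh,j), the matrices Hs[j] = Hj, the feature sample y_h, and the projector Pstar onto LPCA*(y) have been precomputed; the function and variable names are illustrative.

```python
import numpy as np

def recover_point(y, y_h, X_n, Hs, k, Pstar):
    """Return g(y) by formula (21) and G(y) by formula (20)."""
    ksum = k.sum()                                       # k(y); assumed positive
    # (20): G(y) = pi*(y) x (1/k(y)) sum_j k(y, y_hj) H_j
    G = Pstar @ sum(kj * Hj for kj, Hj in zip(k, Hs)) / ksum
    # (21): g(y) = (1/k(y)) sum_j k(y, y_hj) { X_j + G(y) (y - y_hj) }
    g = sum(kj * (Xj + G @ (y - yj)) for kj, Xj, yj in zip(k, X_n, y_h)) / ksum
    return g, G
```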

2.5 Grassmann&Stiefel Eigenmaps: Some Properties

As n → ∞, with ε1 = O(n−1/(q+2)), the relation dH(M h,g, M) = O(n−2/(q+2)) holds true with high probability [33]. This rate coincides with the asymptotically minimax lower bound for the Hausdorff distance dH(M h,g, M) [34]; thus, the RDM M h,g estimates the DM M with the optimal rate of convergence.

The main computational complexity of the GSE algorithm lies in the second and third steps, in which global high-dimensional optimization problems are solved.

The first problem is the generalized eigenvector problem (14) with nq × nq matrices F and Φ. This problem is usually solved using the Singular Value Decomposition (SVD) [32], whose computational complexity is O(n3) [35].

The second problem is the regression problem (17) for an nq-dimensional estimated vector. This problem is reduced to the solution of the linear least-squares normal equations with an nq × nq matrix, whose computational complexity is also O(n3) [32].

Thus, the GSE has total computational complexity O(n3) and is computationally expensive for large sample size n.

3 Incremental Grassmann&Stiefel Eigenmaps

The incremental version of the GSE divides the most computationally expensive generalized eigenvector and regression problems into n local optimization procedures, each of which (at step k) is solved in explicit form for only one new variable (the matrix Hk and the feature hk), k = 1, 2, …, n.

The proposed incremental version includes an additional preliminary step S1+, performed after step S1, in which a weighted undirected sample graph Г(X n) with the sample points {Xi} as nodes is constructed and the shortest paths between an arbitrary node chosen as the origin of the graph and all the other nodes are calculated.

The second and third steps S2 and S3 are replaced by a common incremental step S2–3 in which the matrices {Hk} and features {hk} are computed sequentially at the graph nodes, moving along the shortest paths starting from the chosen origin of the graph. Step S4 of the GSE remains unchanged in the incremental version.

3.1 Step S1+: Sample Graph

Introduce a weighted undirected sample graph Г(X n) with the sample points {Xi} as nodes. The edges of Г(X n) connect the nodes Xi and Xj if and only if K(Xi, Xj) > 0; the length of such an edge (Xi, Xj) equals |Xi – Xj|/K(Xi, Xj).

Choose an arbitrary node X(1) ∈ Г(X n) as the origin of the graph. Using Dijkstra’s algorithm [36], compute the shortest paths between the chosen node and all the other nodes X(2), X(3), …, X(n), written in ascending order of the lengths of the shortest paths from the origin X(1). Denote by Гk the subgraph consisting of the nodes {X(1), X(2), …, X(k)} and the edges connecting them.
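Step S1+ can be sketched with SciPy’s sparse-graph Dijkstra routine, assuming the aggregate kernel matrix K[i, j] = K(Xi, Xj) has already been computed; the dense weight matrix and the function name are illustrative.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

def order_nodes(X_n: np.ndarray, K: np.ndarray, origin: int = 0) -> np.ndarray:
    """Return the node indices X_(1), X_(2), ... ordered by shortest-path length
    from the chosen origin in the sample graph Gamma(X_n)."""
    n = X_n.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and K[i, j] > 0:                  # edge exists iff K(X_i, X_j) > 0
                W[i, j] = np.linalg.norm(X_n[i] - X_n[j]) / K[i, j]   # edge length
    dist = dijkstra(csr_matrix(W), directed=False, indices=origin)
    return np.argsort(dist)
```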

Note.

The origin X(1) can be chosen as a node with minimal eccentricity; the eccentricity of a node equals the maximum of the lengths of the shortest paths between the node under consideration and all the other nodes. However, such a construction requires the shortest paths between all pairs of nodes in the graph Г(X n), i.e., applying Dijkstra’s algorithm n times.

3.2 Step S2–3: Incremental Tangent Manifold Learning and Manifold Embedding

The incremental version computes sequentially the matrices H(X) and features h(X) at the points X(1), X(2), …, X(n), starting from the matrix H(1) and the feature h(1) (initialization). Thus, step S2–3 consists of n substeps {S2–3k, k = 1, 2, …, n}, of which the first is the initialization substep:

Initialization substep S2–31.

Put v(1) = Iq and h(1) = 0; thus, H(X(1)) = QPCA(X(1)).

At the k-th substep S2–3k, k > 1, when the matrices H(j), j < k, have already been computed, the quadratic form ΔH,k, similar to the form (12) but written only for the points Xi, Xj ∈ Гk, is minimized over the single unknown matrix H(k) = QPCA(X(k)) × v(k). This problem, in turn, is reduced to a minimization over v(k) of the form dH,k(H(k)), similar to the form dH,n(H(k)) (15) but written only for the points Xj ∈ Гk−1. Its solution v(k), similar to the solution (16), is written in explicit form.

Let Δh,k be a quadratic form similar to the form Δh,n (17) but written only for the points Xi, Xj ∈ Гk. The value h(k), given the already computed values h(j), j < k, is calculated by minimizing the quadratic form Δh,k over the single vector h(k). This problem, in turn, is reduced to a minimization over h(k) of the form dh,k(h(k)), similar to the form dh,n(h(k)) (19) but written only for the points Xj ∈ Гk−1; its solution, similar to the solution (18), is also written in explicit form.

Thus, the substeps S2–3k, k = 1, 2, …, n, are:

Typical substep S2–3k, 1 < k ≤ n.

Given {(H(j), h(j)), j < k} already obtained, put

$$ H_{(k)} = Q_{\mathrm{PCA}}(X_{(k)}) \times v_{(k)} = Q_{\mathrm{PCA}}(X_{(k)}) \times \frac{\sum\nolimits_{j<k} K(X_{(k)}, X_{(j)}) \times S(X_{(k)}, X_{(j)}) \times v_{(j)}}{\sum\nolimits_{j<k} K(X_{(k)}, X_{(j)})}, $$
(22)
$$ h_{(k)} = \frac{\sum\nolimits_{j<k} K(X_{(k)}, X_{(j)}) \times \{h_{(j)} + v_{(k)}^{-1} \times Q_{\mathrm{PCA}}^{T}(X_{(k)}) \times (X_{(k)} - X_{(j)})\}}{\sum\nolimits_{j<k} K(X_{(k)}, X_{(j)})}. $$
(23)

Given {(H(k), h(k)), k = 1, 2, …, n}, the values H(X) = QPCA(X) × v(X) and h(X) at an arbitrary point X ∈ M are calculated using formulas (16) and (18), respectively.
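A minimal sketch of the typical substep (formulas (22) and (23)) is given below; it assumes the nodes have already been reordered along the shortest paths and that K[k, j] = K(X(k), X(j)), S[k][j] = S(X(k), X(j)) and Qs[k] = QPCA(X(k)) are precomputed, with zero-based indexing and illustrative names.

```python
import numpy as np

def incremental_substep(k, X_ord, K, S, Qs, vs, hs, q):
    """Return v_(k), H_(k) (formula (22)) and h_(k) (formula (23)), given the
    already processed nodes j < k; the first node is initialized with v = I_q, h = 0."""
    num_v, den = np.zeros((q, q)), 0.0
    for j in range(k):
        num_v += K[k, j] * S[k][j] @ vs[j]
        den += K[k, j]
    v_k = num_v / den                                   # assumes den > 0
    H_k = Qs[k] @ v_k                                   # formula (22)
    Gh_k = np.linalg.inv(v_k) @ Qs[k].T                 # v_(k)^{-1} Q_PCA^T(X_(k))
    num_h = np.zeros(q)
    for j in range(k):
        num_h += K[k, j] * (hs[j] + Gh_k @ (X_ord[k] - X_ord[j]))
    h_k = num_h / den                                   # formula (23)
    return v_k, H_k, h_k
```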

3.3 Incremental GSE: Properties

Computational Complexity.

The incremental GSE works mainly with the sample data lying in a neighborhood of some point X, namely in the set Un(X, ε1) of sample points falling into the ε1-ball centered at X. The number n(X) of sample points falling into this ball, under ε1 = ε1,n = O(n−1/(q+2)), with high probability equals n × O(n−q/(q+2)) = O(n2/(q+2)) uniformly in X ∈ M [37].

The sample graph Г(X n) consists of V = n nodes and E edges connecting the graph nodes {Xk}. Each node Xk is connected with no more than n(Xk) other nodes; thus E < 0.5 × n × maxk n(Xk) = O(n(q+4)/(q+2)) and, hence, Г(X n) is a sparse graph.

The running time of Dijkstra’s algorithm (step S1+), which computes the shortest paths in the sparse connected graph Г(X n), is O(E × ln V) = O(n(q+4)/(q+2) × ln n) in the worst case; using a Fibonacci heap improves this rate to O(E + V × ln V) = O(n(q+4)/(q+2)) [38].

The running time of the k-th step S2–3k (formulas (22) and (23)) is proportional to n(Xk); thus, the total running time of step S2–3 is n × O(n2/(q+2)) = O(n(q+4)/(q+2)).

Therefore, the running time of the incremental version of the GSE is O(n(q+4)/(q+2)), in contrast to the running time O(n3) of the original GSE.

Accuracy.

It follows from (18) and (21) that X − rh,g(X) ≈ \( (\pi_{\mathrm{PCA}}^{T}(X) \times e(X)) \times |\delta(X)| \), in which \( \delta(X) = X - \frac{1}{K(X)}\sum\nolimits_{i=1}^{n} K(X, X_i) \times X_i \) and e(X) = δ(X)/|δ(X)|. The first and second multipliers are majorized by the PCA error in (10) and by ε1,n, respectively, each of which has rate O(n−1/(q+2)). Thus, the reconstruction error (X − rh,g(X)) in the incremental GSE has the same asymptotically optimal rate O(n−2/(q+2)) as in the original GSE.

4 Conclusion

An incremental version of the Grassmann&Stiefel Eigenmaps algorithm, which constructs low-dimensional representations of high-dimensional data with asymptotically minimal reconstruction error, is proposed. This version has the same optimal convergence rate O(n−2/(q+2)) of the reconstruction error and a significantly smaller computational complexity in the sample size n: the running time of the incremental version is O(n(q+4)/(q+2)), in contrast to O(n3) for the original algorithm.