Abstract
Kernel principal component analysis (kPCA) has been proposed as a dimensionality-reduction technique that achieves nonlinear, low-dimensional representations of data via the mapping to kernel feature space. Conventionally, kPCA relies on Euclidean statistics in kernel feature space. However, Euclidean analysis can make kPCA inefficient or incorrect for many popular kernels that map input points to a hypersphere in kernel feature space. To address this problem, this paper proposes a novel adaptation of kPCA, namely kernel principal geodesic analysis (kPGA), for hyperspherical statistical analysis in kernel feature space. This paper proposes tools for statistical analyses on the Riemannian manifold of the Hilbert sphere in the reproducing kernel Hilbert space, including algorithms for computing the sample weighted Karcher mean and eigen analysis of the sample weighted Karcher covariance. It then applies these tools to propose novel methods for (i) dimensionality reduction and (ii) clustering using mixture-model fitting. The results, on simulated and real-world data, show that kPGA-based methods perform favorably relative to their kPCA-based analogs.
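The core primitive the abstract describes is the sample weighted Karcher mean on a hypersphere, computed via the Riemannian log and exp maps. The following is a minimal sketch of that computation on a finite-dimensional unit sphere (not the authors' implementation, which operates in the RKHS via the kernel trick); the function names and toy data are illustrative assumptions.

```python
import numpy as np

def log_map(mu, x):
    """Riemannian log map on the unit sphere: tangent vector at mu pointing toward x."""
    cos_t = np.clip(mu @ x, -1.0, 1.0)
    v = x - cos_t * mu                       # component of x orthogonal to mu
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return np.zeros_like(mu)
    return np.arccos(cos_t) * v / norm_v     # direction scaled by geodesic distance

def exp_map(mu, t):
    """Riemannian exp map: follow tangent vector t from mu along a geodesic."""
    norm_t = np.linalg.norm(t)
    if norm_t < 1e-12:
        return mu
    return np.cos(norm_t) * mu + np.sin(norm_t) * t / norm_t

def karcher_mean(points, weights, iters=100, tol=1e-10):
    """Weighted Karcher mean of unit vectors via gradient descent on the sphere."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    mu = points[0] / np.linalg.norm(points[0])   # initialize at a data point
    for _ in range(iters):
        # Gradient of the weighted sum of squared geodesic distances
        grad = sum(wi * log_map(mu, x) for wi, x in zip(w, points))
        mu = exp_map(mu, grad)
        if np.linalg.norm(grad) < tol:
            break
    return mu
```

In the paper's setting the same fixed-point iteration is carried out implicitly in kernel feature space, where each mapped point lies on the Hilbert sphere; the tangent vectors at the mean then feed the eigen analysis of the Karcher covariance.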
Keywords
- Gaussian Mixture Model
- Reproducing Kernel Hilbert Space
- Kernel Principal Component Analysis
- Nonlinear Dimensionality Reduction
- Mercer Kernel
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Awate, S.P., Yu, YY., Whitaker, R.T. (2014). Kernel Principal Geodesic Analysis. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2014. Lecture Notes in Computer Science(), vol 8724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44848-9_6
DOI: https://doi.org/10.1007/978-3-662-44848-9_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44847-2
Online ISBN: 978-3-662-44848-9
eBook Packages: Computer Science (R0)