
Kernel Principal Component Analysis: Applications, Implementation and Comparison

  • Conference paper

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 59)

Abstract

Kernel Principal Component Analysis (KPCA) is a dimension reduction method that is closely related to Principal Component Analysis (PCA). This report gives an overview of kernel PCA and presents an implementation of the method in MATLAB. The implemented method is tested in a transductive setting on two databases. Two methods for labeling data points are considered: the nearest neighbor method and kernel regression, together with some possible improvements to both.
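The implementation described in the paper is written in MATLAB and is not reproduced on this page. As a hedged illustration only, the NumPy sketch below shows the core steps of kernel PCA named in the abstract: building a Gaussian Gram matrix (here in the 2σ² parameterization of note 1; the paper itself uses its definition (17)), centering it in feature space, taking the leading eigenvectors, and projecting. Function names and defaults are assumptions, not the authors' code.

    import numpy as np

    def rbf_kernel(X, Y, sigma=1.0):
        # Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))
        sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
        return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

    def kernel_pca(X, n_components, sigma=1.0):
        # Kernel PCA on the rows of X: center the Gram matrix in feature space,
        # keep the leading eigenvectors, and return the projected coordinates.
        n = X.shape[0]
        K = rbf_kernel(X, X, sigma)
        one = np.full((n, n), 1.0 / n)
        Kc = K - one @ K - K @ one + one @ K @ one
        vals, vecs = np.linalg.eigh(Kc)                   # eigenvalues in ascending order
        idx = np.argsort(vals)[::-1][:n_components]       # keep the largest ones
        vals, vecs = vals[idx], vecs[:, idx]
        alphas = vecs / np.sqrt(np.maximum(vals, 1e-12))  # unit-norm components in feature space
        return Kc @ alphas, alphas                        # projections and expansion coefficients

In the transductive setting of the paper, the Gram matrix would typically be built over training and test points together, with the labeling step (nearest neighbor or kernel regression) applied afterwards in the projected space.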


Notes

  1. An alternative formulation of the radial basis function is \(k(x,y) = \exp(-\frac{\| x - y \|^{2}}{2 \sigma^{2}})\), used, for example, in [11, 12]. In this report, the definition in (17) is used throughout.

  2. Definition: \(\| x \|_{p} \triangleq (\sum^{d}_{i=1} | x_{i} |^{p})^{1/p}\), where p ≥ 1.

  3. Definition: \({\operatorname{argmin}}_{x} f(x) \triangleq \{x \mid \forall y: f(x) \leq f(y) \}\).

  4. Available online: http://archive.ics.uci.edu/ml/.

References

  1. Achlioptas, D.: Database-friendly random projections. In: Proc. ACM Symp. on the Principles of Database Systems, pp. 274–281 (2001)

  2. Aronszajn, N.: Theory of Reproducing Kernels. Defense Technical Information Center, Harvard University, Cambridge (1950)

  3. Bingham, E., Mannila, H.: Random projection in dimensionality reduction: applications to image and text data. In: Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2001), pp. 245–250 (2001)

  4. Cover, T.M., Hart, P.E.: Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 13, 21–27 (1967)

  5. Cucker, F., Smale, S.: On the mathematical foundations of learning. Bull. Am. Math. Soc. 39(1), 1–50 (2002)

  6. Jolliffe, I.T.: Principal Component Analysis. Springer, New York (2002)

  7. Lee, J.-S., Oh, I.-S.: Binary classification trees for multi-class classification problems. In: ICDAR '03: Proceedings of the Seventh International Conference on Document Analysis and Recognition, Washington, DC, USA, p. 770. IEEE Comput. Soc., Los Alamitos (2003)

  8. Min, R., Bonner, A., Zhang, Z.: Modifying kernels using label information improves SVM classification performance (2007)

  9. Narasimhamurthy, A.: Theoretical bounds of majority voting performance for a binary classification problem. IEEE Trans. Pattern Anal. Mach. Intell. 27(12), 1988–1995 (2005)

  10. Pardalos, P.M., Hansen, P. (eds.): Data Mining and Mathematical Programming. Am. Math. Soc., Providence (2008)

  11. Poggio, T., Smale, S.: The mathematics of learning: dealing with data. Not. Am. Math. Soc. 50, 537–544 (2003)

  12. Schölkopf, B., Smola, A., Müller, K.-R.: Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 10, 1299–1319 (1998)

  13. Shawe-Taylor, J., Cristianini, N.: Kernel Methods for Pattern Analysis. Cambridge University Press, Cambridge (2004)


Acknowledgements

The main part of this work was done during a visit of the first-named author to the Center for Applied Optimization, University of Florida, Gainesville, whose hospitality is gratefully acknowledged.

Author information

Correspondence to Pando Georgiev.

Appendix: Detailed Numerical Results

This section gives the detailed numerical results of the experiments that were presented in Sect. 5. All experiments were performed with randomly selected training sets. The sugar database is used in all the tests of this section.

A.1 Comparison of PCA and Kernel PCA

In these experiments, kernel PCA was compared with standard PCA; both methods were applied to the sugar data, and Table 5 shows the results.

Table 5 Prediction accuracies of PCA and KPCA applied to sugar data

The following parameters were used in the experiments (a minimal sketch of this comparison follows the list):

  • PCA20: 20 % training points, δ=0.

  • dPCA20: 20 % training points, δ=−1/3.

  • KPCA20: 20 % training points, σ=0.0006, γ=0.000006, δ=0.

  • dKPCA20: 20 % training points, σ=0.0006, γ=0.000006, δ=−1/3.
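
As a rough, self-contained illustration of how a comparison like the one in Table 5 could be set up: linear PCA and kernel PCA (the kernel_pca sketch given earlier) are applied to the same data, test points are labeled by the nearest neighbor rule in the reduced space, and accuracy is the fraction of correctly labeled test points. The placeholder data, split, and helper names below are assumptions; the parameter values listed above (σ=0.0006, γ=0.000006) were tuned for the sugar data and carry no meaning for the placeholder.

    import numpy as np

    def pca_project(X, n_components):
        # Linear PCA via SVD of the centered data matrix
        Xc = X - X.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        return Xc @ Vt[:n_components].T

    def nn_accuracy(Z, labels, train_idx, test_idx):
        # Label each test point with the class of its nearest training point
        correct = 0
        for i in test_idx:
            d = np.sum((Z[train_idx] - Z[i]) ** 2, axis=1)
            if labels[train_idx[np.argmin(d)]] == labels[i]:
                correct += 1
        return correct / len(test_idx)

    # Hypothetical usage with a 20 % random training split, as in PCA20 / KPCA20
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 10))               # placeholder data
    labels = rng.integers(0, 2, size=100)        # placeholder binary labels
    perm = rng.permutation(len(X))
    train_idx, test_idx = perm[:20], perm[20:]
    Z_pca = pca_project(X, 5)
    Z_kpca, _ = kernel_pca(X, 5, sigma=1.0)      # kernel_pca from the earlier sketch
    print(nn_accuracy(Z_pca, labels, train_idx, test_idx),
          nn_accuracy(Z_kpca, labels, train_idx, test_idx))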

A.2 Comparison of δ=0 and δ=−1/3

These tests measure the impact of the parameter δ on the prediction accuracy of kernel regression. Several other values of δ (both positive and negative) were tried as well, but are not included in this report; δ=−1/3 appeared to be optimal for the sugar data. The results are shown in Table 6. In addition, some tests used kernel regression with a non-centered kernel matrix. The following parameters were used in the experiments (a hedged sketch of the kernel-regression labeling step follows Table 6):

  • KPCA20c: 20 % training points, σ=0.25, γ=0.7, δ=0. In these tests a non-centered kernel matrix is used.

  • KPCA40: 40 % training points, σ=0.0006, γ=0.000006, δ=0.

  • dKPCA40: 40 % training points, σ=0.0006, γ=0.000006, δ=−1/3.

  • KPCA60: 60 % training points, σ=0.0006, γ=0.000006, δ=0.

  • dKPCA60: 60 % training points, σ=0.0006, γ=0.000006, δ=−1/3.

Table 6 Comparison of prediction accuracies using δ=0 and δ=−1/3
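
This excerpt does not spell out the exact form of the kernel regression or the precise role of δ. Since the parameter lists include a regularization constant γ, one plausible reading is regularized least squares in the sense of Poggio and Smale [11]; the sketch below assumes that form, treats δ as the target value assigned to the negative class (so δ=0 gives 0/1 targets and δ=−1/3 shifts them), and exposes a flag for skipping the feature-space centering, as in KPCA20c. All of these choices are assumptions for illustration, not a reconstruction of the authors' implementation.

    import numpy as np

    def kernel_regression_labels(K, labels, train_idx, test_idx,
                                 gamma=1e-6, delta=0.0, center=True):
        # Transductive labeling by kernel regression over a precomputed Gram matrix K.
        # Assumed form: regularized least squares, binary labels in {0, 1} with
        # class 0 mapped to the target value delta and class 1 to 1.0.
        n = K.shape[0]
        if center:
            # Same feature-space centering as in kernel PCA; skip for KPCA20c-style tests
            one = np.full((n, n), 1.0 / n)
            K = K - one @ K - K @ one + one @ K @ one
        m = len(train_idx)
        y = np.where(labels[train_idx] == 1, 1.0, delta)
        K_tt = K[np.ix_(train_idx, train_idx)]
        coef = np.linalg.solve(K_tt + gamma * m * np.eye(m), y)  # (K + gamma*m*I) c = y
        pred = K[np.ix_(test_idx, train_idx)] @ coef
        # Assign each test point the class whose target value is closer to its prediction
        return (np.abs(pred - 1.0) < np.abs(pred - delta)).astype(int)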


Copyright information

© 2013 Springer Science+Business Media New York

About this paper

Cite this paper

Olsson, D., Georgiev, P., Pardalos, P.M. (2013). Kernel Principal Component Analysis: Applications, Implementation and Comparison. In: Goldengorin, B., Kalyagin, V., Pardalos, P. (eds) Models, Algorithms, and Technologies for Network Analysis. Springer Proceedings in Mathematics & Statistics, vol 59. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8588-9_9
