Abstract
Kernel Principal Component Analysis (KPCA) is a dimension reduction method closely related to Principal Component Analysis (PCA). This report gives an overview of kernel PCA and presents an implementation of the method in MATLAB. The implemented method is tested in a transductive setting on two databases. Two methods for labeling data points are considered, the nearest neighbor method and kernel regression, together with some possible improvements of the methods.
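For orientation, the following is a minimal sketch of the kernel PCA pipeline the report describes: build a Gaussian kernel matrix, center it, eigendecompose, and project. It is an illustration only, not the authors' MATLAB implementation; the function name, the kernel parameterization exp(−‖x−x′‖²/(2σ²)), and the variable names are assumptions.

```matlab
% Minimal kernel PCA sketch with a Gaussian (RBF) kernel.
% Assumptions for illustration: rows of X are data points, sigma is the
% kernel width, m is the number of components to keep.
function Z = kpca_sketch(X, sigma, m)
    n  = size(X, 1);
    sq = sum(X.^2, 2);
    D  = sq + sq' - 2*(X*X');               % squared Euclidean distances
    K  = exp(-D / (2*sigma^2));             % Gaussian kernel matrix
    J  = ones(n) / n;
    Kc = K - J*K - K*J + J*K*J;             % center the kernel matrix in feature space
    [V, L] = eig((Kc + Kc')/2);             % symmetrize before eig for numerical safety
    [lambda, idx] = sort(real(diag(L)), 'descend');
    A = real(V(:, idx(1:m)));               % top-m eigenvector coefficients
    Z = A .* sqrt(lambda(1:m))';            % projections of the training points
end
```

The projection of point j onto component k is \(\sqrt{\lambda_k}\,\alpha_{jk}\), since the feature-space component \(v_k = \sum_i \alpha_{ik}\varphi(x_i)\) has norm \(\sqrt{\lambda_k}\) and must be rescaled to unit length.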
Notes
1.
2. Definition: \(\| x \|_{p} \triangleq (\sum^{d}_{i=1} | x_{i} |^{p})^{1/p}\), where p≥1.
3. Definition: \({\operatorname{argmin}}_{x} f(x) \triangleq \{x \mid \forall y: f(x) \leq f(y)\}\).
4. Available online: http://archive.ics.uci.edu/ml/.
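As a quick worked instance of the definitions in Notes 2 and 3 (the vector below is an arbitrary illustration):

\[
x = (3, -4) \in \mathbb{R}^{2}: \quad \| x \|_{1} = |3| + |{-4}| = 7, \qquad \| x \|_{2} = \sqrt{3^{2} + (-4)^{2}} = 5,
\]

and, similarly, \({\operatorname{argmin}}_{x \in \mathbb{R}}\,(x-1)^{2} = \{1\}\).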
Acknowledgements
The main part of this work was done during the visit of the first-named author to the Center for Applied Optimization, University of Florida, Gainesville, whose hospitality is gratefully acknowledged.
Appendix: Detailed Numerical Results
This section gives the detailed numerical results of the experiments presented in Sect. 5. All experiments were performed with randomly selected training sets, and the sugar database is used in all tests of this section.
A.1 Comparison of PCA and Kernel PCA
In these experiments, kernel PCA was compared to standard PCA; both methods were applied to the sugar data, and Table 5 shows the results. The following parameters were used in the experiments (a sketch of one experiment run follows the list):
- PCA20: 20 % training points, δ=0.
- dPCA20: 20 % training points, δ=−1/3.
- KPCA20: 20 % training points, σ=0.0006, γ=0.000006, δ=0.
- dKPCA20: 20 % training points, σ=0.0006, γ=0.000006, δ=−1/3.
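A hedged sketch of how one such run could be organized is given below: a random training split, a transductive KPCA embedding of all points, and nearest-neighbor labeling of the remaining points. The data matrix X, the label vector y (a column of 0/1 labels), the number of components m = 10, and the reuse of kpca_sketch from above are assumptions, not the authors' exact protocol.

```matlab
% Hedged sketch of one experiment run (not the authors' exact protocol):
% random 20 % training split, transductive KPCA embedding of all points,
% then 1-nearest-neighbor labeling of the test points.
frac = 0.20;                                 % e.g. the PCA20/KPCA20 setting
n    = size(X, 1);
perm = randperm(n);
ntr  = round(frac * n);
tr   = perm(1:ntr);                          % training indices
te   = perm(ntr+1:end);                      % test indices
Z    = kpca_sketch(X, 0.0006, 10);           % sigma from the text; m = 10 assumed
pred = zeros(numel(te), 1);
for j = 1:numel(te)
    d2 = sum((Z(tr, :) - Z(te(j), :)).^2, 2);  % squared distances in the embedding
    [~, i] = min(d2);
    pred(j) = y(tr(i));                      % copy the nearest training label
end
```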
A.2 Comparison of δ=0 and δ=−1/3
These tests were performed to measure the impact of the parameter δ on the prediction accuracy of kernel regression. Several other values of δ (both positive and negative) were tried as well, but are not included in this report; δ=−1/3 appeared to be optimal for the sugar data. The results are shown in Table 6. Some tests were also performed in which kernel regression with a noncentered kernel matrix was used. The following parameters were used in the experiments (a hedged sketch of the kernel regression step follows the list):
- KPCA20c: 20 % training points, σ=0.25, γ=0.7, δ=0. In these tests a noncentered kernel matrix is used.
- KPCA40: 40 % training points, σ=0.0006, γ=0.000006, δ=0.
- dKPCA40: 40 % training points, σ=0.0006, γ=0.000006, δ=−1/3.
- KPCA60: 60 % training points, σ=0.0006, γ=0.000006, δ=0.
- dKPCA60: 60 % training points, σ=0.0006, γ=0.000006, δ=−1/3.
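To make the roles of γ and δ concrete, the following sketch shows a Nadaraya–Watson kernel regression labeling step, continuing the variables tr, te, Z, and y from the earlier sketch. This excerpt does not fix the exact role of δ; the reading assumed below (one possibility only) is that the two classes are encoded as δ and 1 and the smoothed estimate is thresholded at their midpoint. The kernel form exp(−‖·‖²/γ) is likewise an assumption.

```matlab
% Hedged sketch of kernel regression labeling (Nadaraya-Watson form).
% Assumption: binary labels y in {0,1} are re-encoded as {delta, 1} and
% the smoothed estimate is thresholded at the midpoint of the two codes.
gamma = 0.000006;  delta = -1/3;             % parameter values from the text
Ztr = Z(tr, :);  Zte = Z(te, :);
D = sum(Zte.^2, 2) + sum(Ztr.^2, 2)' - 2*(Zte*Ztr');  % squared distances
W = exp(-D / gamma);                         % assumed kernel form, width gamma
yc = double(y(tr));
yc(yc == 0) = delta;                         % re-encode the negative class as delta
f = (W * yc) ./ sum(W, 2);                   % weighted average of training labels
pred = f > (1 + delta) / 2;                  % threshold at the class-code midpoint
```

With δ=0 this reduces to plain {0,1} averaging thresholded at 1/2; a negative δ such as −1/3 shifts the decision boundary toward the negative class, which is one way the parameter could affect accuracy.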