Abstract
This paper introduces an evolutionary tuning approach, based on evolution strategies (ES), for a pipeline of preprocessing methods and kernel principal component analysis (PCA). A simple (1+1)-ES selects the imputation method, adapts preprocessing steps such as normalization and standardization, and optimizes the parameters of kernel PCA. A small experimental study on a benchmark data set with missing values demonstrates that the evolutionary kernel PCA pipeline can be tuned with relatively few optimization steps, which makes evolutionary tuning applicable to scenarios with very large data sets.
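To make the idea of a (1+1)-ES over a mixed pipeline configuration concrete, the following is a minimal sketch and not the paper's implementation. The search space (the `IMPUTERS`, `SCALERS` and `KERNELS` lists and the `log_gamma` gene), the mutation rates, the Rechenberg-style step size rule and the `toy_fitness` function are all assumptions made for illustration. In a real setting the fitness would be a quality measure computed from the output of the imputation, preprocessing and kernel PCA pipeline.

```python
import random

# Hypothetical search space; these option names are illustrative assumptions.
IMPUTERS = ["mean", "median", "most_frequent"]
SCALERS = ["none", "normalize", "standardize"]
KERNELS = ["linear", "rbf", "poly"]
CATEGORICAL = (("imputer", IMPUTERS), ("scaler", SCALERS), ("kernel", KERNELS))


def mutate(cfg, sigma, rng):
    """Return a mutated copy of cfg.

    Each categorical gene is resampled with probability 1/3, and the
    continuous gene log_gamma is perturbed with Gaussian noise of scale sigma.
    """
    child = dict(cfg)
    for key, choices in CATEGORICAL:
        if rng.random() < 1.0 / 3.0:
            child[key] = rng.choice(choices)
    child["log_gamma"] = cfg["log_gamma"] + sigma * rng.gauss(0.0, 1.0)
    return child


def one_plus_one_es(fitness, start, steps=200, sigma=1.0, seed=0):
    """Minimize fitness with a (1+1)-ES: one parent, one child per step."""
    rng = random.Random(seed)
    parent, f_parent = dict(start), fitness(start)
    history = [f_parent]
    for _ in range(steps):
        child = mutate(parent, sigma, rng)
        f_child = fitness(child)
        if f_child <= f_parent:
            # Success: the child replaces the parent, and the step size grows.
            parent, f_parent = child, f_child
            sigma *= 1.5
        else:
            # Failure: keep the parent and shrink the step size (assumed rule
            # in the spirit of Rechenberg's 1/5th success rule).
            sigma *= 1.5 ** -0.25
        history.append(f_parent)
    return parent, f_parent, history


def toy_fitness(cfg):
    """Placeholder objective (assumption): stands in for a pipeline score."""
    penalty = (
        (cfg["imputer"] != "median")
        + (cfg["scaler"] != "standardize")
        + (cfg["kernel"] != "rbf")
    )
    return penalty + (cfg["log_gamma"] + 2.0) ** 2


start = {"imputer": "mean", "scaler": "none", "kernel": "linear", "log_gamma": 0.0}
best, f_best, history = one_plus_one_es(toy_fitness, start, steps=100, seed=1)
print(best, f_best)
```

Because the child only replaces the parent when it is at least as good, the best fitness never gets worse over the run. Each step costs exactly one fitness evaluation, which is why a small step budget matters when every evaluation means fitting the pipeline on a large data set.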
Notes
1. type numpy.nan in Python.
A Data Sets
UCI Digits comprises handwritten digits with \(d=64\) dimensions. Friedman is a regression problem generated with scikit-learn with \(d=500\) dimensions. The Wind data set is based on spatio-temporal time series data from the National Renewable Energy Laboratory (NREL) and comprises 11 turbines of 3 MW each, recorded over three years at a 10-minute resolution, resulting in \(d=11\) dimensions. The Image data set contains image segmentation data with \(d=19\) dimensions.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Kramer, O. (2017). Evolving Kernel PCA Pipelines with Evolution Strategies. In: Kern-Isberner, G., Fürnkranz, J., Thimm, M. (eds) KI 2017: Advances in Artificial Intelligence. KI 2017. Lecture Notes in Computer Science, vol 10505. Springer, Cham. https://doi.org/10.1007/978-3-319-67190-1_13
DOI: https://doi.org/10.1007/978-3-319-67190-1_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67189-5
Online ISBN: 978-3-319-67190-1
eBook Packages: Computer Science (R0)