Abstract
Principal component analysis (PCA) is an effective and well-known method for reducing the dimensionality of high-dimensional data sets. Recently, kernel PCA (KPCA), a nonlinear form of PCA, has been introduced into many fields. In this paper, we propose a new gene selection method, Custom Kernel Principal Component Analysis (C-KPCA). The new kernel function for KPCA is created by combining a set of kernel functions. First, singular value decomposition (SVD) is used to reduce the dimension of the microarray data. The input space is then mapped to a higher-dimensional feature space using the proposed custom kernel function. The main objective of our method is to extract nonlinear features for the classification process. To test the accuracy of our method, a number of experiments are carried out on four binary gene datasets: Colon Tumor, Leukemia, Lymphoma, and Prostate. The experimental results show that the proposed method achieves a higher prediction rate than several recently published algorithms.
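The pipeline outlined in the abstract (SVD reduction, a combined kernel, then KPCA feature extraction) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which kernels are combined or how, so the weighted sum of an RBF and a polynomial kernel, all parameter values, the function names, and the synthetic data are assumptions made here for illustration only.

```python
import numpy as np

def svd_reduce(X, k):
    """Center the samples (rows of X) and project them onto the top-k right singular vectors."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def poly_kernel(A, B, degree, coef0):
    """Polynomial kernel matrix between the rows of A and the rows of B."""
    return (A @ B.T + coef0) ** degree

def combined_kernel(A, B, w=0.5, gamma=0.1, degree=2, coef0=1.0):
    """ASSUMPTION: a convex combination of two kernels stands in for the custom kernel.
    A non-negative weighted sum of valid kernels is itself a valid kernel."""
    return w * rbf_kernel(A, B, gamma) + (1.0 - w) * poly_kernel(A, B, degree, coef0)

def kernel_pca(K, n_components):
    """Return the training-sample projections onto the leading kernel principal components."""
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    # Center the kernel matrix in feature space.
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    # Projection of training sample i on component j equals sqrt(lambda_j) * v_j[i].
    return vecs * np.sqrt(np.maximum(vals, 0.0))

# Synthetic stand-in for a microarray matrix: 20 samples x 200 genes (not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 200))

Z = svd_reduce(X, k=10)                    # step 1: SVD dimension reduction
K = combined_kernel(Z, Z)                  # step 2: combined kernel matrix
features = kernel_pca(K, n_components=3)   # step 3: nonlinear features for a classifier
```

The resulting `features` array has one row per sample and could be passed to any standard classifier; the weight `w` and the kernel parameters would normally be tuned by cross-validation.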
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Ha, VS., Nguyen, HN. (2016). C-KPCA: Custom Kernel PCA for Cancer Classification. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2016. Lecture Notes in Computer Science, vol 9729. Springer, Cham. https://doi.org/10.1007/978-3-319-41920-6_36
DOI: https://doi.org/10.1007/978-3-319-41920-6_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41919-0
Online ISBN: 978-3-319-41920-6
eBook Packages: Computer Science, Computer Science (R0)