Abstract
Constructing a good high-dimensional classifier involves three fundamental goals: high accuracy, interpretable feature selection, and efficient computation. Over the past 15 years, several popular high-dimensional classifiers have been developed and studied in the literature. They fall roughly into two categories: sparse penalized margin-based classifiers and sparse discriminant analysis. This chapter gives a comprehensive review of these popular high-dimensional classifiers.
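To make the sparse discriminant idea concrete, here is a minimal NumPy sketch in the spirit of a nearest-shrunken-centroids classifier: class centroids are soft-thresholded toward the overall mean, so uninformative features are zeroed out and the classifier achieves feature selection alongside prediction. The shrinkage level `delta` and the toy data are illustrative assumptions, not part of the chapter.

```python
import numpy as np

def soft_threshold(x, delta):
    """Shrink each entry toward zero by delta (lasso-style soft thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - delta, 0.0)

def fit_shrunken_centroids(X, y, delta):
    """Per-class centroids whose deviations from the overall mean are
    soft-thresholded; features with zero shrunken deviation in every class
    are effectively dropped, giving interpretable feature selection."""
    overall = X.mean(axis=0)
    centroids = {}
    for c in np.unique(y):
        diff = X[y == c].mean(axis=0) - overall
        centroids[c] = overall + soft_threshold(diff, delta)
    return centroids

def predict(X, centroids):
    """Assign each sample to the class with the nearest shrunken centroid."""
    labels = list(centroids)
    dists = np.stack([np.sum((X - centroids[c]) ** 2, axis=1) for c in labels])
    return np.array([labels[i] for i in dists.argmin(axis=0)])

# Toy high-dimensional data: 40 samples, 50 features, only the first 5 informative.
rng = np.random.default_rng(0)
n, p = 40, 50
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, p))
X[y == 1, :5] += 3.0

cents = fit_shrunken_centroids(X, y, delta=0.5)
acc = (predict(X, cents) == y).mean()
```

With the strong signal on the first five features, the shrunken centroids separate the two classes on the training data while zeroing out most of the 45 noise coordinates.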
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Zou, H. (2018). High-Dimensional Classification. In: Härdle, W., Lu, HS., Shen, X. (eds) Handbook of Big Data Analytics. Springer Handbooks of Computational Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-18284-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18283-4
Online ISBN: 978-3-319-18284-1
eBook Packages: Mathematics and Statistics (R0)