High-Dimensional Classification

Chapter in Handbook of Big Data Analytics

Part of the book series: Springer Handbooks of Computational Statistics (SHCS)


Abstract

There are three fundamental goals in constructing a good high-dimensional classifier: high accuracy, interpretable feature selection, and efficient computation. In the past 15 years, several popular high-dimensional classifiers have been developed and studied in the literature. These classifiers fall roughly into two categories: sparse penalized margin-based classifiers and sparse discriminant analysis. In this chapter, we give a comprehensive review of both categories.
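As a toy illustration of the first category (not code from the chapter itself), the sketch below fits an l1-penalized (lasso) logistic regression by proximal gradient descent, which drives many coefficients exactly to zero and thereby performs feature selection. The function names, tuning values, and synthetic data are all invented for this example.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the l1 penalty: shrink toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_logistic(X, y, lam=0.05, lr=0.1, n_iter=500):
    """l1-penalized logistic regression via proximal gradient (ISTA).

    X : (n, p) design matrix, y : (n,) labels in {0, 1},
    lam : l1 penalty weight, lr : step size.
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        p_hat = 1.0 / (1.0 + np.exp(-X @ beta))   # fitted probabilities
        grad = X.T @ (p_hat - y) / n              # gradient of the logistic loss
        beta = soft_threshold(beta - lr * grad, lr * lam)
    return beta

# Synthetic data: only the first 3 of 50 features carry signal.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:3] = [2.0, -2.0, 1.5]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

beta = lasso_logistic(X, y)
print("nonzero coefficients:", int(np.sum(beta != 0)), "of", p)
```

The l1 penalty yields a sparse coefficient vector, so the classifier is interpretable (the selected features are the nonzero coefficients) and the per-iteration cost is a single matrix-vector product, touching all three goals the abstract lists.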



Author information

Corresponding author: Hui Zou


Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this chapter


Cite this chapter

Zou, H. (2018). High-Dimensional Classification. In: Härdle, W., Lu, HS., Shen, X. (eds) Handbook of Big Data Analytics. Springer Handbooks of Computational Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-18284-1_9
