Abstract
We consider the supervised classification problem in the high-dimensional setting. High dimensionality makes most classification methods difficult to apply. We present a novel approach to sparse linear discriminant analysis (LDA) based on its optimal scoring interpretation and the zero-norm. The difficulty in treating the zero-norm is overcome by using an appropriate continuous approximation, so that the resulting problem can be formulated as a DC (Difference of Convex functions) program, for which DCA (DC Algorithm) is investigated. Computational results on both simulated data and real microarray cancer data show the efficiency of the proposed algorithm in feature selection as well as classification.
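As a rough illustration of the DC idea mentioned in the abstract, the sketch below runs DCA on a toy separable problem, not on the optimal scoring model itself. The abstract does not say which continuous approximation of the zero-norm is used, so the capped-ℓ1 function min(1, θ|t|) is an assumption chosen here for simplicity, as are the names `dca_capped_l1`, `lam` and `theta`. The penalty is split as min(1, θ|t|) = θ|t| − max(θ|t| − 1, 0), which gives a convex part g and a convex part h with objective g − h; each DCA step linearizes h and solves the remaining convex problem, which here reduces to soft-thresholding.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to the scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def objective(x, b, lam, theta):
    """0.5 * ||x - b||^2 + lam * sum_i min(1, theta * |x_i|)."""
    fit = 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b))
    penalty = lam * sum(min(1.0, theta * abs(xi)) for xi in x)
    return fit + penalty


def dca_capped_l1(b, lam, theta, max_iter=100, tol=1e-10):
    """Toy DCA for a capped-l1 approximation of the zero-norm (assumed here).

    g(x) = 0.5 * ||x - b||^2 + lam * theta * ||x||_1      (convex)
    h(x) = lam * sum_i max(theta * |x_i| - 1, 0)          (convex)
    The objective is g(x) - h(x).
    """
    x = [0.0] * len(b)
    history = [objective(x, b, lam, theta)]
    for _ in range(max_iter):
        # Step 1: pick a subgradient y of h at the current iterate.
        y = [
            lam * theta * (1.0 if xi > 0 else -1.0)
            if theta * abs(xi) > 1.0 else 0.0
            for xi in x
        ]
        # Step 2: minimize g(x) - <y, x>; for this separable g the
        # minimizer is a soft-thresholding of b + y.
        x_new = [soft_threshold(bi + yi, lam * theta) for bi, yi in zip(b, y)]
        history.append(objective(x_new, b, lam, theta))
        change = max(abs(a - c) for a, c in zip(x_new, x))
        x = x_new
        if change < tol:
            break
    return x, history


b = [3.0, 0.2, -2.5, 0.05]
x_hat, history = dca_capped_l1(b, lam=0.5, theta=2.0)
print(x_hat)
```

On this input the iterates settle at `[3.0, 0.0, -2.5, 0.0]`: the two small entries are set to zero while the two large ones are left unshrunk, and the recorded objective values never increase from one iteration to the next, which is the descent property DCA is known for.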
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Thi, H.A.L., Phan, D.N. (2015). A DC Programming Approach for Sparse Optimal Scoring. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science, vol 9078. Springer, Cham. https://doi.org/10.1007/978-3-319-18032-8_34
DOI: https://doi.org/10.1007/978-3-319-18032-8_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18031-1
Online ISBN: 978-3-319-18032-8
eBook Packages: Computer Science (R0)