A DC Programming Approach for Sparse Optimal Scoring

Thi, Hoai An Le; Phan, Duy Nhat

doi:10.1007/978-3-319-18032-8_34


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9078)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining


Abstract

We consider the supervised classification problem in the high-dimensional setting. High dimensionality makes most classification methods difficult to apply. We present a novel approach to sparse linear discriminant analysis (LDA) based on its optimal scoring interpretation and the zero-norm. The difficulty in treating the zero-norm is overcome by using an appropriate continuous approximation, so that the resulting problem can be formulated as a DC (Difference of Convex functions) program, for which DCA (DC Algorithm) is investigated. Computational results on both simulated data and real microarray cancer data show the efficiency of the proposed algorithm in feature selection as well as classification.
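The abstract does not say which continuous approximation of the zero-norm is used, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the capped-ℓ1 approximation min(1, α|t|), which splits into the difference of two convex functions α|t| − max(α|t| − 1, 0), and it replaces the optimal scoring objective with a plain least-squares loss (for fixed scores, optimal scoring is a least-squares problem in the discriminant vector). The function names, the parameters `lam` and `alpha`, and the inner proximal-gradient solver are all illustrative assumptions.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def objective(X, y, b, lam, alpha):
    """Least-squares loss plus the capped-l1 surrogate of the zero-norm."""
    loss = 0.5 * np.sum((X @ b - y) ** 2)
    penalty = lam * np.sum(np.minimum(1.0, alpha * np.abs(b)))
    return loss + penalty


def dca_capped_l1(X, y, lam=0.5, alpha=5.0, outer=30, inner=200):
    """DCA sketch for f = g - h with
         g(b) = 0.5 * ||X b - y||^2 + lam * alpha * ||b||_1   (convex)
         h(b) = lam * sum_j max(alpha * |b_j| - 1, 0)         (convex)
    Each outer step linearizes h at the current point and approximately
    solves the resulting convex (lasso-like) subproblem by proximal gradient.
    """
    p = X.shape[1]
    lipschitz = np.linalg.norm(X, 2) ** 2  # largest eigenvalue of X^T X
    b = np.zeros(p)
    history = [objective(X, y, b, lam, alpha)]
    for _ in range(outer):
        # A subgradient of h at the current iterate.
        v = lam * alpha * np.sign(b) * (alpha * np.abs(b) > 1.0)
        # Convex subproblem: minimize g(b) - <v, b>, warm-started at b.
        for _ in range(inner):
            grad = X.T @ (X @ b - y) - v
            b = soft_threshold(b - grad / lipschitz, lam * alpha / lipschitz)
        history.append(objective(X, y, b, lam, alpha))
    return b, history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((40, 10))
    beta_true = np.zeros(10)
    beta_true[:3] = [2.0, -3.0, 1.5]
    y = X @ beta_true + 0.01 * rng.standard_normal(40)
    b_hat, hist = dca_capped_l1(X, y)
    print("nonzero coefficients:", np.flatnonzero(b_hat))
    print("objective, first and last:", hist[0], hist[-1])
```

Because each subproblem is a convex majorant of the objective that touches it at the current iterate, and the warm-started proximal-gradient steps never increase that majorant, the recorded objective values should be non-increasing across outer iterations. This holds even though the inner solve is inexact.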



Author information

Authors and Affiliations

Laboratory of Theoretical and Applied Computer Science EA 3097, University of Lorraine, Ile de Saulcy, 57045, Metz, France
Hoai An Le Thi & Duy Nhat Phan


Corresponding author

Correspondence to Hoai An Le Thi.

Editor information

Editors and Affiliations

Ho Chi Minh City University of Technology, Ho Chi Minh City, Vietnam
Tru Cao
Singapore Management University, Singapore, Singapore
Ee-Peng Lim
Nanjing University, Nanjing, China
Zhi-Hua Zhou
Japan Advanced Institute of Science and Technology, Nomi City, Japan
Tu-Bao Ho
The University of Hong Kong, Hong Kong, Hong Kong SAR
David Cheung
Osaka University, Osaka, Japan
Hiroshi Motoda


About this paper

Cite this paper

Thi, H.A.L., Phan, D.N. (2015). A DC Programming Approach for Sparse Optimal Scoring. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science, vol 9078. Springer, Cham. https://doi.org/10.1007/978-3-319-18032-8_34


DOI: https://doi.org/10.1007/978-3-319-18032-8_34
Published: 09 May 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18031-1
Online ISBN: 978-3-319-18032-8
eBook Packages: Computer Science, Computer Science (R0)
