Abstract
In this paper, we deal with the multiclass logistic regression problem, one of the most popular supervised classification method. We aim at developing an efficient method to solve this problem for large-scale datasets, i.e. large number of features and large number of instances. To deal with a large number of features, we consider feature selection method evolving the \(l_{\infty ,0}\) regularization. The resulting optimization problem is non-convex for which we develop a stochastic version of DCA (Difference of Convex functions Algorithm) to solve. This approach is suitable to handle datasets with very large number of instances. Numerical experiments on several benchmark datasets and synthetic datasets illustrate the efficiency of our algorithm and its superiority over well-known methods, with respect to classification accuracy, sparsity of solution as well as running time.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Amaldi, E., Kann, V.: On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems. Theor. Comput. Sci. 209(1), 237–260 (1998)
Cox, D.: The regression analysis of binary sequences (with discussion). J. Roy. Stat. Soc. B 20, 215–242 (1958)
Duchi, J., Shalev-Shwartz, S., Singer, Y., Chandra, T.: Efficient projections onto the l 1-ball for learning in high dimensions. In: Proceedings of the 25th International Conference on Machine Learning, pp. 272–279. ACM (2008)
Geusebroek, J.M., Burghouts, G.J., Smeulders, A.W.: The Amsterdam library of object images. Int. J. Comput. Vis. 61(1), 103–112 (2005)
Kim, J., Kim, Y., Kim, Y.: A gradient-based optimization algorithm for LASSO. J. Comput. Graph. Stat. 17(4), 994–1009 (2008)
King, G., Zeng, L.: Logistic regression in rare events data. Polit. Anal. 9, 137–163 (2001)
Le, H.M., Le Thi, H.A., Nguyen, M.C.: Sparse semi-supervised support vector machines by DC programming and DCA. Neurocomputing 153, 62–76 (2015)
Le Thi, H.A., Le, H.M., Nguyen, V.V., Pham Dinh, T.: A DC programming approach for feature selection in support vector machines learning. Adv. Data Anal. Classif. 2(3), 259–278 (2008)
Le Thi, H.A., Nguyen, M.C.: Efficient algorithms for feature selection in multi-class support vector machine. In: Advanced Computational Methods for Knowledge Engineering, pp. 41–52. Springer, Heidelberg (2013)
Le Thi, H.A., Nguyen, T.B.T., Le, H.M.: Sparse signal recovery by difference of convex functions algorithms. In: Intelligent Information and Database Systems, pp. 387–397. Springer, Heidelberg (2013)
Le Thi, H.A., Pham Dinh, T.: The DC (Difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization Problems. Ann. Oper. Res. 133(1–4), 23–46 (2005)
Le Thi, H.A., Pham Dinh, T., Le, H.M., Vo, X.T.: DC approximation approaches for sparse optimization. Eur. J. Oper. Res. 244(1), 26–46 (2015)
Le Thi, H.A., Phan, D.N.: DC Programming and DCA for Sparse Optimal Scoring Problem. Neurocomput. 186(C), 170–181 (2016)
Pham Dinh, T., Le Thi, H.A.: Convex analysis approach to DC programming: theory, algorithms and applications. Acta Math. Vietnamica 22(1), 289–355 (1997)
Pham Dinh, T., Le Thi, H.A.: A D.C. Optimization algorithm for solving the trust-region subproblem. SIAM J. Optim. 8(2), 476–505 (1998)
Phan, D.N.: Algorithmes basés sur la programmation DC et DCA pour l’apprentissage avec la parcimonie et l’apprentissage stochastique en grande dimension. Université de Lorraine, Thèse de doctorat (2016)
Quattoni, A., Carreras, X., Collins, M., Darrell, T.: An Efficient Projection for L1, \(\infty \) Regularization. In: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 857–864. ICML 2009. ACM, New York (2009)
Turlach, B.A., Venables, W.N., Wright, S.J.: Simultaneous variable selection. Technometrics 47(3), 349–363 (2005)
Verma, J.P.: Logistic regression: developing a model for risk analysis. In: Data Analysis in Management with SPSS Software, pp. 413–442. Springer, India (2013)
Wang, L., Chen, G., Li, H.: Group SCAD regression analysis for microarray time course gene expression data. Bioinformatics 23(12), 1486–1494 (2007)
Wei, F., Zhu, H.: Group coordinate descent algorithms for nonconvex penalized regression. Comput. Stati. Data Anal. 56(2), 316–326 (2012)
Witten, D.M., Tibshirani, R.: Penalized classification using fisher’s linear discriminant. J. Roy. Stat. Soc. B 73(5), 753–772 (2011)
Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. Roy. Stat. Soc. B 68, 49–67 (2006)
Zhang, H.H., Liu, Y., Wu, Y., Zhu, J.: Variable selection for the multicategory SVM via adaptive sup-norm regularization. Electron. J. Stat. 2, 149–167 (2008)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Le Thi, H.A., Le, H.M., Phan, D.N., Tran, B. (2018). Stochastic DCA for Sparse Multiclass Logistic Regression. In: Le, NT., van Do, T., Nguyen, N., Thi, H. (eds) Advanced Computational Methods for Knowledge Engineering. ICCSAMA 2017. Advances in Intelligent Systems and Computing, vol 629. Springer, Cham. https://doi.org/10.1007/978-3-319-61911-8_1
Download citation
DOI: https://doi.org/10.1007/978-3-319-61911-8_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61910-1
Online ISBN: 978-3-319-61911-8
eBook Packages: EngineeringEngineering (R0)