
Lobachevskii Journal of Mathematics, Volume 39, Issue 9, pp. 1179–1187

Quadratic Programming Optimization with Feature Selection for Nonlinear Models

  • R. V. Isachenko
  • V. V. Strijov
Part 1. Special issue “High Performance Data Intensive Computing” Editors: V. V. Voevodin, A. S. Simonov, and A. V. Lapin

Abstract

The paper is devoted to the problem of constructing a predictive model in a high-dimensional feature space. The space is redundant and the columns of the design matrix are multicollinear, so the model is unstable to changes in the data or in the parameter values. To build a stable model, the authors solve the dimensionality reduction problem for the feature space. It is proposed to apply feature selection methods during the parameter optimization process: at each optimization step an active set of model parameters is selected, and only these parameters are updated. Quadratic programming feature selection is used to find the active set; the algorithm maximizes the relevance of the model parameters to the residuals while making them pairwise independent. Nonlinear regression and logistic regression models are investigated. Experiments demonstrate how the proposed method works and compare it with other methods; the proposed algorithm achieves lower error and greater stability than the alternatives.
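The abstract's selection criterion can be illustrated with a minimal sketch of quadratic programming feature selection in the spirit of Rodriguez-Lujan et al.: minimize a trade-off between pairwise similarity (redundancy) and relevance to the target over the probability simplex. The function name `qpfs_weights`, the use of absolute correlations for both terms, and the trade-off parameter `alpha` are illustrative assumptions, not the authors' exact formulation (which applies the criterion to model parameters and residuals inside the optimization loop).

```python
import numpy as np
from scipy.optimize import minimize

def qpfs_weights(X, y, alpha=0.5):
    """Sketch of the QPFS criterion:
        minimize  (1 - alpha) * w^T Q w  -  alpha * b^T w
        s.t.      w >= 0,  sum(w) = 1,
    where Q holds pairwise feature similarities (absolute correlations)
    and b holds feature-target relevances."""
    n, p = X.shape
    Q = np.abs(np.corrcoef(X, rowvar=False))                     # redundancy term
    b = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)])  # relevance term
    obj = lambda w: (1 - alpha) * w @ Q @ w - alpha * b @ w
    cons = {"type": "eq", "fun": lambda w: w.sum() - 1.0}        # simplex constraint
    w0 = np.full(p, 1.0 / p)
    res = minimize(obj, w0, bounds=[(0, 1)] * p,
                   constraints=cons, method="SLSQP")
    return res.x

# Toy data: feature 0 drives the target, feature 1 nearly duplicates it,
# feature 2 is pure noise.
rng = np.random.default_rng(0)
x0 = rng.normal(size=200)
X = np.column_stack([x0,
                     x0 + 0.01 * rng.normal(size=200),
                     rng.normal(size=200)])
y = x0 + 0.1 * rng.normal(size=200)
w = qpfs_weights(X, y, alpha=0.8)  # emphasize relevance over redundancy
```

With relevance weighted heavily, almost all mass lands on the two informative (mutually redundant) features and the noise feature is driven toward zero; lowering `alpha` spreads weight toward uncorrelated features instead.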

Keywords and phrases

quadratic programming, feature selection, nonlinear regression, logistic regression, Newton method



Copyright information

© Pleiades Publishing, Ltd. 2018

Authors and Affiliations

  1. Moscow Institute of Physics and Technology (State University), Dolgoprudnyi, Moscow oblast, Russia
  2. Skolkovo Institute of Science and Technology, Moscow, Russia
  3. A. A. Dorodnicyn Computing Centre, Russian Academy of Sciences, Moscow, Russia
