Over the past few decades, many classification methods have been proposed for credit risk evaluation. As data volumes grow, the multi-criteria optimization classifier (MCOC) and other traditional classifiers often show poor predictive performance because of correlation among the features in the data. A common remedy is to apply a dimensionality reduction technique first to find the important features and then build the classifier on the reduced data set. However, because feature selection and classification are then carried out in different feature spaces, the intended gains in predictive accuracy and interpretability are difficult to achieve. It is therefore important to study classifiers that perform classification and feature selection simultaneously, so as to improve predictive accuracy and obtain interpretable results. In this paper, we propose a novel sparse multi-criteria optimization classifier (SMCOC) based on one-norm regularization, using linear and nonlinear programming, respectively, and construct the corresponding algorithm. Experimental results on credit risk evaluation, together with comparisons against linear and quadratic MCOCs, logistic regression and support vector machines, show that the proposed SMCOC can enhance the separation of different credit applicants, the efficiency of credit scoring, the interpretability of the risk evaluation model and the generalization power of risk prediction for new credit applicants.
Keywords: Feature sparsification; Kernel function; Multi-criteria optimization; Classification; Credit risk
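To make the idea of simultaneous classification and feature selection concrete, the following is a minimal sketch of how a one-norm penalty can be attached to a generic linear multi-criteria classifier. It is not the SMCOC model of the paper, whose exact formulation is not given in the abstract. All symbols are assumptions introduced for illustration: training samples $x_i$ with labels $y_i \in \{+1,-1\}$, weight vector $w$, boundary $b$, overlap $\alpha_i$, distance to the boundary $\beta_i$, trade-off weights $\lambda, c_\alpha, c_\beta > 0$, and a cap $\bar\beta$ on the distances that is added here only to keep the problem bounded.

```latex
\begin{aligned}
\min_{w,\,b,\,\alpha,\,\beta}\quad
  & \lambda \lVert w \rVert_1
    + c_\alpha \sum_{i=1}^{n} \alpha_i
    - c_\beta \sum_{i=1}^{n} \beta_i \\
\text{s.t.}\quad
  & y_i \,\bigl(w^{\top} x_i - b\bigr) = \beta_i - \alpha_i,
    \qquad i = 1,\dots,n, \\
  & \alpha_i \ge 0, \qquad 0 \le \beta_i \le \bar\beta,
    \qquad i = 1,\dots,n .
\end{aligned}
% The one-norm becomes linear after the standard split
%   w = u - v,  u \ge 0,  v \ge 0,
%   \lVert w \rVert_1 = \sum_{j} (u_j + v_j)  at an optimum,
% so the whole problem is a linear program.
```

Under these assumptions, a larger $\lambda$ tends to drive more components of $w$ to exactly zero, and the features with zero weight drop out of the scoring function. This is the sense in which feature selection and classification take place in a single optimization rather than in two separate steps.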
The authors would like to thank the anonymous reviewers for their valuable comments and suggestions. This research has been partially supported by the Science Foundation of Ludong University (LY2010013) and the Natural Science Foundation of Shandong (ZR2012FL13, ZR2016FM15).
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
The article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent was obtained from all individual participants included in the study.