A Metric Framework for Quantifying Data Concentration

  • Peter Mitic
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11872)

Abstract

This paper investigates the poor performance of artificial neural nets on credit-related classification problems and contrasts it with logistic regression classification. We propose that artificial neural nets are less successful because of the inherent structure of credit data rather than any particular aspect of the neural net structure. Three metrics are developed that exploit the distributional properties of such data to rationalise the neural net results. They are used in conjunction with a variant of an established concentration measure that differentiates between class characteristics. The results are contrasted with those obtained using random data, and are compared with results obtained using logistic regression. We find, in general agreement with previous studies, that logistic regression outperforms neural nets in the majority of cases. An approximate decision criterion is developed to explain adverse results.
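
The abstract refers to a variant of an established concentration measure, and the keywords name the Herfindahl-Hirschman index (HHI). The variant itself is not specified here, so the following is only a minimal sketch of the textbook HHI (the sum of squared shares), applied to class shares in a labelled data set. The function name `hhi`, the choice of class labels as the unit of concentration, and the toy data are all assumptions made for illustration, not the author's method.

```python
from collections import Counter


def hhi(labels):
    """Textbook Herfindahl-Hirschman index: sum of squared shares.

    Assumption: each distinct label is treated as one "participant" and
    its share is its relative frequency. This is NOT the paper's variant.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("labels must be non-empty")
    return sum((n / total) ** 2 for n in counts.values())


# Hypothetical toy data: a balanced and a skewed two-class sample.
balanced = ["good"] * 50 + ["bad"] * 50
skewed = ["good"] * 90 + ["bad"] * 10

print("balanced:", hhi(balanced))
print("skewed:  ", hhi(skewed))
```

For k classes the index ranges from 1/k (equal shares) to 1 (a single class), so the skewed sample scores higher than the balanced one.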

Keywords

Copula Hypersphere Cluster Herfindahl-Hirschman HHI Credit Concentration Decision criterion Tensorflow Neural Net 

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Risk, Santander UK, London, UK
  2. Department of Computer Science, UCL, London, UK
  3. Laboratoire d’Excellence sur la Régulation Financière (LabEx ReFi), Paris, France
