Bankruptcy Prediction: A Comparison of Some Statistical and Machine Learning Techniques

Peña, Tonatiuh; Martínez, Serafín; Abudu, Bolanle

doi:10.1007/978-3-642-16943-4_6

Part of the book series: Dynamic Modeling and Econometrics in Economics and Finance (DMEF, volume 13)

Abstract

We are interested in forecasting bankruptcies in a probabilistic way. Specifically, we compare the classification performance of several statistical and machine-learning techniques, namely discriminant analysis (Altman’s Z-score), logistic regression, least-squares support vector machines and different instances of Gaussian processes (GPs), that is, GP classifiers, the Bayesian Fisher discriminant and warped GPs. Our contribution to the field of computational finance is to introduce GPs as a competitive probabilistic framework for bankruptcy prediction. Data from the information repository of the US Federal Deposit Insurance Corporation are used to test the predictions.

Notes

1. The work by Estrella et al. (2000) has a similar scope to ours.
2. Identifying a disease.
3. Estimating the prospect of recovery.
4. Some human remains discovered at a burial site in Egypt had to be sexed, i.e. it had to be determined whether they belonged to female or male specimens (Fisher 1936).
5. We recall that x is a vector of observed features obtained through indirect means, whereas y is a canonical variable representing the class.
6. The response function is the inverse of the link function used in statistics.
7. We have omitted dependencies on x^⋆ to keep the notation uncluttered.
8. As expressed by Rasmussen and Williams (2006), the characteristic length scales can be loosely interpreted as the distance one must move along each axis in input space for the function values to become uncorrelated.
9. We thank the Centre for Computational Finance and Economic Agents (CCFEA).

References

Altman, E. I. (1968). Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. Journal of Finance, 23(4), 589–609.
Atiya, A. F. (2001). Bankruptcy prediction for credit risk using neural networks: a survey and new results. IEEE Transactions on Neural Networks, 12, 929–935.
Back, B., Laitinen, T., Sere, K., & van Wezel, M. (1996). Choosing bankruptcy predictors using discriminant analysis, logit analysis, and genetic algorithms (Technical Report 40). Turku Centre for Computer Science, September 1996.
Beaver, W. H. (1966). Financial ratios as predictors of failures. Journal of Accounting Research, 4, 71–111.
Bishop, C. M. (1995). Neural networks for pattern recognition. Oxford: Oxford University Press.
Bishop, C. M. (2006). Pattern recognition and machine learning. Information science and statistics. New York: Springer.
Box, G. E., & Tiao, G. C. (1973). Bayesian inference in statistical analysis. Wiley classics library, published 1992. New York: Wiley.
Chen, S.-H. (Ed.) (2002). Genetic algorithms and genetic programming in computational finance. Dordrecht: Kluwer Academic.
Cortes, C., & Vapnik, V. V. (1995). Support vector networks. Machine Learning, 20, 273–297.
Efron, B. (1979). Bootstrap methods: another look at the Jackknife. The Annals of Statistics, 7, 1–26.
Estrella, A., Park, S., & Peristiani, S. (2000). Capital ratios as predictors of bank failure. Federal Reserve Bank of New York Economic Policy Review, July 2000, 33–52.
Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, 179.
Grimmett, G., & Stirzaker, D. (2004). Probability and random processes (3rd ed.). Oxford: Oxford University Press.
Joos, P., Vanhoof, K., Ooghe, H., & Sierens, N. (1998). Credit classification: a comparison of logit models and decision trees. In 10th European conference on machine learning. Proceedings notes of the workshop on application of machine learning and data mining in finance (pp. 59–72). 24 April 1998, Chemnitz, Germany.
Credit Metrics—Technical Document (1997). JP Morgan. New York, April 1997.
Kimeldorf, G. S., & Wahba, G. (1970). A correspondence between Bayesian estimation on stochastic processes and smoothing by splines. Annals of Mathematical Statistics, 41(2), 495–502.
Krige, D. G. (1996). Two-dimensional weighting moving average trend surfaces for ore evaluation. Journal of the South African Institute of Mining and Metallurgy.
Mackay, D. J. C. (1995). Probable networks and plausible predictions—a review of practical Bayesian methods for supervised neural networks. Network: Computation in Neural Systems, 6(3), 469–505.
Mackay, D. J. C. (1998). Introduction to Gaussian processes. In C. M. Bishop (Ed.), NATO ASI Series: Vol. 168. Neural networks and machine learning (pp. 133–165). Berlin: Springer.
Mackay, D. J. C. (2003). Information theory, learning and inference algorithms. Cambridge: Cambridge University Press.
MacLachlan, G. J. (1991). Discriminant analysis and pattern recognition. New York: Wiley.
Minka, T. P. (2001). A family of algorithms for approximate Bayesian inference. PhD thesis, Massachusetts Institute of Technology.
Neal, R. M. (1996). Bayesian learning for neural networks. New York: Springer.
O’Hagan, A. (1978). Curve fitting and optimal design for prediction. Journal of the Royal Statistical Society, Series B (Methodological), 40(1), 1–42.
Park, C., & Han, I. (2002). A case-based reasoning with the feature weights derived by analytic hierarchy process for bankruptcy prediction. Expert Systems with Applications, 23, 255–264.
Peña Centeno, T., & Lawrence, N. D. (2006). Optimising kernel parameters and regularisation coefficients for non-linear discriminant analysis. Journal of Machine Learning Research, 7, 455–491.
Quintana, D., Saez, Y., Mochon, A., & Isasi, P. (2007). Early bankruptcy prediction using ENPC. Journal of Applied Intelligence, ISSN 0924-669X.
Rasmussen, C. E. (2004). Gaussian processes in machine learning. In O. Bousquet, U. von Luxburg, & G. Rätsch (Eds.), Lecture notes in computer science/artificial intelligence: Vol. 3176. Advanced lectures on machine learning. Berlin: Springer.
Rasmussen, C. E., & Williams, C. K. (2006). Adaptive computation and machine learning. Gaussian processes for machine learning. Cambridge: MIT Press. http://www.GaussianProcess.org/gpml.
Rätsch, G., Onoda, T., & Müller, K.-R. (1998). Soft margins for AdaBoost (Technical Report NC-TR-98-021). Royal Holloway College, University of London, London, UK.
Seeger, M. (2004). Gaussian processes for machine learning. International Journal of Neural Systems, 14(2), 69–106.
Serrano-Cinca, C., Martin, C. B., & Gallizo, J. (1993). Artificial neural networks in financial statement analysis: ratios versus accounting data. In 16th annual congress of the European accounting association, Turku, Finland, 28–30 Apr.
Shin, K.-S., & Lee, Y.-J. (2002). A genetic algorithm application in bankruptcy prediction modeling. Expert Systems with Applications, 23, 321–328.
Snelson, E., Rasmussen, C. E., & Ghahramani, Z. (2003). Warped Gaussian processes. In S. Thrun, L. K. Saul, & B. Schölkopf (Eds.), Advances in neural information processing systems 16. Cambridge: MIT Press.
Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society, 36, 111–147.
Suykens, J. A., & Vandewalle, J. (1999). Least squares support vector machines. Neural Processing Letters, 9(3), 293–300.
Suykens, J. A., Van Gestel, T., Brabanter, J. D., Moor, B. D., & Vandewalle, J. (2002). Least squares support vector machines. Singapore: World Scientific.
Thiele, T. N. (1931). Theory of observations. London: Layton. Reprinted in Annals of Mathematical Statistics, 2, 165–308.
Tsang, E. P. K., & Martinez-Jaramillo, S. (2004). Computational finance. In IEEE computational intelligence society newsletter (pp. 3–8). New York: IEEE Press.
Varetto, F. (1998). Genetic algorithms applications in the analysis of insolvency risk. Journal of Banking and Finance, 22, 1421–1439.
Wahba, G. (1990). CBMS-NSF regional conference in applied mathematics: Vol. 59. Spline models for observational data. Philadelphia: Society for Industrial and Applied Mathematics.
Williams, C. K. (1999). Prediction with Gaussian processes: from linear regression to linear prediction and beyond. In M. I. Jordan (Ed.), Behavioural and social sciences: Vol. 11. Learning in graphical models. Dordrecht: Kluwer Academic.
Williams, C. K., & Barber, D. (1998). Bayesian classification with Gaussian processes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(12), 1342–1351.
Yip, A. Y. N. (2003). A hybrid case-based reasoning approach to business failure prediction (pp. 371–378). Amsterdam: IOS Press. ISBN 1-58603-394-8.

Author information

Authors and Affiliations

Dirección General de Investigación Económica, Banco de México, Mexico City, Mexico
Tonatiuh Peña
Dirección General de Análisis del Sistema Financiero, Banco de México, Mexico City, Mexico
Serafín Martínez
Centre for Computational Finance and Economic Agents, University of Essex, Colchester, UK
Bolanle Abudu

Corresponding author

Correspondence to Tonatiuh Peña.

Editor information

Editors and Affiliations

FB Wirtschaftswissenschaften, Universität Bielefeld, Universitätsstraße 25, Bielefeld, 33651, Germany
Herbert Dawid
Dept. Economics, New School for Social Research, Fifth Ave. 65, New York, 10003, New York, USA
Willi Semmler

Appendix

A brief description of the financial ratios that compose the FDIC data follows.

Ratio 1. Net interest margin (NIM) is the difference between the interest proceeds received from borrowers and the interest paid to lenders.

Ratio 2. Non-interest income (NII) is the sum of fee-based income, trading income, income from fiduciary activities and other non-interest income.

Ratio 3. Non-interest expense (NIX) comprises three main types of expense: personnel, occupancy and other operating expenses.

Ratio 4. Net operating income (NOI) is the gross income associated with the company’s properties less the operating expenses.

Ratio 5. Return on assets (ROA) is an indicator of how profitable a company is relative to its total assets. ROA is calculated as the ratio of the company’s total earnings over the year to its total assets.

Ratio 6. Return on equity (ROE) is a measure of the rate of return on the shareholders’ equity of the common stock owners. ROE is estimated as the year’s net income (after preferred stock dividends but before common stock dividends) divided by total equity (excluding preferred shares).

Ratio 7. Efficiency ratio (ER) is a ratio used to measure the efficiency of a company, although companies do not all calculate it in the same way.

Ratio 8. Non-current assets (NCA) are those that cannot be easily converted into cash, e.g. real estate, machinery, long-term investments or patents.

Ratio 9. The ratio of cash plus US Treasury and government obligations to total assets.

Ratio 10. Equity capital (EC) is the capital raised from owners.

Ratio 11. The capital ratio (CR), also known as the leverage ratio, is calculated as Tier 1 capital divided by average total consolidated assets.
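
The ratios above that are defined by an explicit formula (Ratios 5, 6, 9 and 11) can be sketched in a few lines of Python. This is only an illustration of the definitions given in this appendix, not the procedure used to build the FDIC data: the class `BankYear`, its field names, the function names and the figures in `example` are all hypothetical, and all amounts are assumed to be in the same currency unit.

```python
from dataclasses import dataclass


@dataclass
class BankYear:
    """One bank-year of figures. Field names are hypothetical, not FDIC field names."""
    net_income: float              # total earnings over the year
    preferred_dividends: float     # dividends paid on preferred stock
    total_assets: float            # total assets
    average_total_assets: float    # average of total consolidated assets
    total_equity: float            # total equity, including preferred shares
    preferred_equity: float        # equity attributable to preferred shares
    cash: float
    government_obligations: float  # US Treasury and government obligations
    tier1_capital: float


def return_on_assets(b: BankYear) -> float:
    # Ratio 5: total earnings over the year divided by total assets.
    return b.net_income / b.total_assets


def return_on_equity(b: BankYear) -> float:
    # Ratio 6: net income after preferred dividends, divided by
    # total equity excluding preferred shares.
    return (b.net_income - b.preferred_dividends) / (b.total_equity - b.preferred_equity)


def liquid_assets_ratio(b: BankYear) -> float:
    # Ratio 9: cash plus US Treasury and government obligations over total assets.
    return (b.cash + b.government_obligations) / b.total_assets


def capital_ratio(b: BankYear) -> float:
    # Ratio 11: Tier 1 capital divided by average total consolidated assets.
    return b.tier1_capital / b.average_total_assets


# Made-up figures, purely for illustration.
example = BankYear(
    net_income=12.0,
    preferred_dividends=2.0,
    total_assets=1000.0,
    average_total_assets=800.0,
    total_equity=120.0,
    preferred_equity=20.0,
    cash=50.0,
    government_obligations=150.0,
    tier1_capital=80.0,
)

if __name__ == "__main__":
    for fn in (return_on_assets, return_on_equity, liquid_assets_ratio, capital_ratio):
        print(f"{fn.__name__}: {fn(example):.4f}")
```

Ratios 1 to 4, 8 and 10 are described above as quantities rather than quotients, and Ratio 7 has no single agreed formula, so they are left out of the sketch.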

About this chapter

Cite this chapter

Peña, T., Martínez, S., Abudu, B. (2011). Bankruptcy Prediction: A Comparison of Some Statistical and Machine Learning Techniques. In: Dawid, H., Semmler, W. (eds) Computational Methods in Economic Dynamics. Dynamic Modeling and Econometrics in Economics and Finance, vol 13. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16943-4_6

DOI: https://doi.org/10.1007/978-3-642-16943-4_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16942-7
Online ISBN: 978-3-642-16943-4
eBook Packages: Business and Economics, Economics and Finance (R0)
