
2.1 Introduction

This chapter focuses on the particular difficulties encountered when developing internal rating models for corporate exposures. The main characteristic of these internal rating models is that they mainly rely on financial ratios. Hence, the aim is to demonstrate how financial ratios can be used for statistical risk assessment. The chapter is organised as follows: Sect. 2.2 describes some of the issues concerning model selection, while Sect. 2.3 presents data from Austrian companies that will illustrate the theoretical concepts. Section 2.4 discusses data processing, which includes the calculation of financial ratios, their transformation to establish linearity, the identification of outliers and the handling of missing values. Section 2.5 describes the actual estimation of the rating model, i.e. univariate and multivariate analyses, multicollinearity issues and performance measurement. Finally, Sect. 2.6 concludes.

2.2 Model Selection

Chapter 1 presents several statistical methods for building and estimating rating models. The most popular of these model types, in the academic literature as well as in practice, is the logit model, mainly for two reasons. Firstly, the output from the logit model can be directly interpreted as a default probability, and secondly, the model allows an easy check as to whether the empirical dependence between the potential explanatory variables and default risk is economically meaningful (see Sect. 2.4). Hence, a logit model is chosen to demonstrate the estimation of internal rating models for corporate exposures.

Next, the default event must be defined. Historically, rating models were developed mostly using bankruptcy as the default criterion, as this information was relatively easy to observe. However, banks also incur losses before the event of bankruptcy, for example when they allow debtors to defer payments without compensation in the hope that the troubled borrowers will later be able to repay their debt. Therefore, the Basel Committee on Banking Supervision (2001) established a reference definition of default that includes all those situations where a bank loses money and declared that banks would have to use this regulatory reference definition of default for estimating internal rating-based models. However, as demonstrated in Hayden (2003), rating models developed exclusively on the basis of bankruptcy as the default criterion can be just as powerful in predicting the broader credit loss events specified in the new Basel capital accord as models estimated directly on those default criteria. In any case, when developing rating models one has to ensure that the default event used to estimate the model is comparable to the event the model shall be capable of predicting.

Finally, a forecast horizon must be chosen. As illustrated by the Basel Committee on Banking Supervision (1999), even before Basel II it was common practice for most banks to use a modelling horizon of one year, as this time horizon is on the one hand long enough to allow banks to take action to avert predicted defaults, and on the other hand short enough to guarantee the timeliness of the data input into the rating model.

2.3 The Data Set

The theoretical concepts discussed in this chapter will be illustrated by application to a data set of Austrian companies, which represents a small sample of the credit portfolio of an Austrian bank. The original data, which was supplied by a major commercial Austrian bank for the research project described in Hayden (2002), consisted of about 5,000 firm-year observations of balance sheets and profit and loss accounts from 1,500 individual companies spanning 1994 to 1999. However, due to obvious mistakes in the data, such as assets differing from liabilities or negative sales, the data set had to be reduced to about 4,500 observations. In addition, certain firm types were excluded, namely all public firms, including large international corporations, which do not represent the typical Austrian company, and very small single-owner firms with a turnover of less than 5 m ATS (about 0.36 m EUR), whose credit quality often depends as much on the finances of a key individual as on the firm itself. After eliminating financial statements covering a period of less than twelve months and checking for observations that were included twice or more in the data set, almost 3,900 firm-years were left. Finally, observations were dropped where the default information (bankruptcy) was missing or dubious.

Table 2.1 shows the total number of observed companies per year and splits the sample into defaulting and non-defaulting firms. However, the data for 1994 is not depicted, as dynamic financial ratios (which compare current to past levels of certain balance sheet items) will be calculated later on, and these ratios cannot be computed for 1994, the first period in the sample.

Table 2.1 Number of observations and defaults per year

2.4 Data Processing

Section 2.4 discusses the major preparatory operations necessary before the model estimation can be conducted. They include the cleaning of the data, the calculation of financial ratios, and their transformation to establish linearity.

2.4.1 Data Cleaning

Some of the important issues with respect to data cleaning were mentioned in Sect. 2.3 when the Austrian data set was presented. As described, it was guaranteed that:

  • The sample data was free of (obvious) mistakes

  • The data set comprised only homogeneous observations, where the relationship between the financial ratios and the default event could be expected to be comparable

  • The default information was available (and reliable) for all borrowers

In addition, missing information with respect to the financial input data must be properly managed. Typically, at least for some borrowers, part of the financial information is missing. If the number of observations concerned is rather low, the easiest way to handle the problem is to eliminate the respective observations completely from the data set (as was done for the Austrian data). If, however, this would result in too many observations being lost, it is preferable to exclude all variables with high numbers of missing values from the analysis. Once the model has been developed and is in use, missing information needed to calculate the model output can be handled by substituting the missing financial ratios with the corresponding mean or median values over all observations for the respective time period (i.e. practically “neutral” values) in order to create as undistorted an assessment as possible using the remaining input factors.
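As a minimal sketch of this imputation step, the following Python/pandas snippet replaces missing ratios with the median of the same ratio in the same period; the data frame and the column names (year, ebit_ta, equity_ta) are purely illustrative and not taken from the Austrian data set.

```python
import pandas as pd

# Illustrative frame of financial ratios with a reporting-year column
df = pd.DataFrame({
    "year":      [1995, 1995, 1996, 1996, 1996],
    "ebit_ta":   [0.08, None, 0.05, 0.12, None],
    "equity_ta": [0.30, 0.25, None, 0.40, 0.35],
})

# Substitute missing ratios with the per-year median, i.e. a practically
# "neutral" value for the respective time period
ratio_cols = ["ebit_ta", "equity_ta"]
df[ratio_cols] = df.groupby("year")[ratio_cols].transform(
    lambda s: s.fillna(s.median())
)
print(df)
```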

2.4.2 Calculation of Financial Ratios

Once the quality of the basic financial data is guaranteed, potential explanatory variables have to be selected. Typically, ratios are formed to standardise the available information. For example, the ratio “Earnings per Total Assets” enables a comparison of the profitability of firms of different size. In addition to considering ratios that reflect different financial aspects of the borrowers, dynamic ratios that compare current to past levels of certain balance sheet items can be very useful for predicting default events. Overall, the selected input ratios should represent the most important credit risk factors, i.e. leverage, liquidity, productivity, turnover, activity, profitability, firm size, growth rates and leverage development.
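For illustration only, the following Python/pandas sketch derives two static ratios and one dynamic ratio from hypothetical statement items; all column names and figures are invented for the example and do not come from the Austrian data.

```python
import pandas as pd

# Hypothetical balance sheet / income statement items per firm and year
stmt = pd.DataFrame({
    "firm_id":      [1, 1, 2, 2],
    "year":         [1995, 1996, 1995, 1996],
    "ebit":         [120.0, 150.0, 80.0, 60.0],
    "total_assets": [1000.0, 1100.0, 900.0, 950.0],
    "net_sales":    [2000.0, 2300.0, 1500.0, 1400.0],
    "liabilities":  [600.0, 640.0, 700.0, 720.0],
})

ratios = pd.DataFrame({
    "firm_id": stmt["firm_id"],
    "year": stmt["year"],
    "ebit_ta": stmt["ebit"] / stmt["total_assets"],         # profitability
    "liab_ta": stmt["liabilities"] / stmt["total_assets"],  # leverage
})
# Dynamic ratio: sales growth compares current to previous-year net sales,
# which is why the first year in the sample cannot be used
ratios["sales_growth"] = stmt.groupby("firm_id")["net_sales"].pct_change()
print(ratios)
```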

After the calculation of the financial input ratios, it is necessary to identify and treat potential outliers, because they can severely distort the estimated model parameters. Outliers in the ratios might exist even if the underlying financial data is absolutely clean, for example when the denominator of a ratio is allowed to take on values close to zero. To avoid having to eliminate the affected observations altogether, a typical procedure is to replace the extreme data points with the 1st and 99th percentiles of the respective ratio.
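A simple way to implement this replacement (winsorisation at the 1st and 99th percentiles) in pandas is sketched below; the artificial ratio with a near-zero denominator only serves to demonstrate the effect.

```python
import numpy as np
import pandas as pd

def winsorize_1_99(s: pd.Series) -> pd.Series:
    """Cap a ratio at its 1st and 99th percentiles instead of dropping observations."""
    lower, upper = s.quantile(0.01), s.quantile(0.99)
    return s.clip(lower=lower, upper=upper)

# Artificial example of a ratio whose denominator can get close to zero
rng = np.random.default_rng(0)
ratio = pd.Series(rng.normal(size=1000)) / pd.Series(rng.normal(size=1000))
print(ratio.describe())                  # extreme minimum/maximum values
print(winsorize_1_99(ratio).describe())  # extremes capped at the percentiles
```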

Table 2.2 portrays the explanatory variables selected for the Austrian data and presents some descriptive statistics. The indicators chosen comprise a small set of typical business ratios. A broader overview of potential input ratios as well as a detailed discussion can be found in Hayden (2002).

Table 2.2 Selected input ratios

The last column in Table 2.2 depicts the expected dependence between the accounting ratio and the default probability, where + symbolises that an increase in the ratio leads to an increase in the default probability and – symbolises a decrease in the default probability given an increase in the explanatory variable. Whenever a certain ratio is selected as a potential input variable for a rating model, it should be ensured that a clear hypothesis can be formulated about this dependence to guarantee that the resulting model is economically plausible. Note, however, that the hypothesis chosen can also be rather complex; for example, for the indicator sales growth, the hypothesis formulated is “−/+”. This takes into account that the relationship between the rate at which companies grow and the rate at which they default is not as simple as that between other ratios and default. While it is generally better for a firm to grow than to shrink, companies that grow very quickly often find themselves unable to meet the management challenges presented by such growth, especially within smaller firms. Furthermore, this quick growth is unlikely to be financed out of profits, resulting in a possible build-up of debt and the associated risks. Therefore, one should expect the relationship between sales growth and default to be non-monotone, which will be examined in detail in the next section.

2.4.3 Test of Linearity Assumption

After having selected the candidate input ratios, the next step is to check whether the underlying assumptions of the logit model apply to the data. As explained in Chap. 1, the logit model can be written as

$$ P_i = P(y_i = 1) = F(\boldsymbol{\beta}' x_i) = \frac{e^{\boldsymbol{\beta}' x_i}}{1 + e^{\boldsymbol{\beta}' x_i}}, $$
(2.1)

which implies a linear relationship between the log odd and the input variables:

$$ \text{Log odd} = \ln \left( \frac{P_i}{1 - P_i} \right) = \boldsymbol{\beta}' x_i $$
(2.2)

This linearity assumption can easily be tested by dividing the indicators into groups that all contain the same number of observations, calculating the historical default rate and the corresponding empirical log odd within each group, and estimating a linear regression of the log odds on the mean values of the ratio intervals.
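The grouping and regression procedure could be sketched as follows in Python (pandas and statsmodels); the equally populated bins via pd.qcut and the clipping of default rates to avoid log(0) are implementation choices of this example.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def empirical_log_odds(ratio: pd.Series, default: pd.Series, n_groups: int = 50):
    """Bin a ratio into equally populated groups and compute the empirical
    log odd ln(d / (1 - d)) from each group's historical default rate d."""
    bins = pd.qcut(ratio, q=n_groups, duplicates="drop")
    grouped = pd.DataFrame({"ratio": ratio, "default": default}).groupby(bins, observed=True)
    mean_ratio = grouped["ratio"].mean()
    default_rate = grouped["default"].mean().clip(1e-6, 1 - 1e-6)  # avoid log(0)
    return mean_ratio, np.log(default_rate / (1 - default_rate))

# Linear regression of the empirical log odds on the group means of the ratio;
# a high R-squared supports the linearity assumption for this indicator.
# mean_ratio, log_odds = empirical_log_odds(df["ebit_ta"], df["default"])
# fit = sm.OLS(log_odds, sm.add_constant(mean_ratio)).fit()
# print(fit.rsquared)
```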

When applied to the Austrian data (by forming 50 groups), this procedure permits the conclusion that for most accounting ratios, the linearity assumption is indeed valid. As an example the relationship between the variable “EBIT/Total Assets” and the empirical log odd as well as the estimated linear regression is depicted in Fig. 2.1. The regression fit is as high as 78.02%.

Fig. 2.1 Relationship between “EBIT/Total Assets” and log odd

However, one explanatory variable, namely sales growth, shows a non-linear and even non-monotone behaviour, just as was expected. Hence, as portrayed in Fig. 2.2, due to the linearity assumption inherent in the logit model, the relationship between the original ratio sales growth and the default event cannot be correctly captured by such a model.

Fig. 2.2 Relationship between “Sales Growth” and log odd

Therefore, to enable the inclusion of the indicator sales growth in the rating model, the ratio has to be linearised before logit regressions can be estimated. This can be done in the following way: the points obtained from dividing the variable sales growth into groups and plotting them against the respective empirical log odds are smoothed by a filter, for example the one proposed in Hodrick and Prescott (1997), to reduce noise. Then the original values of sales growth are transformed to log odds according to this smoothed relationship, and in any further analysis the transformed log-odd values replace the original ratio as input variable.
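A possible implementation of this linearisation is sketched below, using the Hodrick-Prescott filter available in statsmodels; the smoothing parameter and the interpolation of the original observations onto the smoothed curve are illustrative choices, not prescriptions from the text.

```python
import numpy as np
from statsmodels.tsa.filters.hp_filter import hpfilter

def linearise_ratio(group_means, group_log_odds, original_values, lamb=100.0):
    """Smooth the empirical (ratio, log odd) points with a Hodrick-Prescott
    filter and map each original ratio value onto the smoothed relationship.
    The smoothing parameter lamb is an illustrative choice."""
    order = np.argsort(group_means)
    _, smoothed = hpfilter(np.asarray(group_log_odds, dtype=float)[order], lamb=lamb)
    # Transformed log-odd values replace the raw ratio in all further analyses
    return np.interp(original_values, np.asarray(group_means, dtype=float)[order], smoothed)
```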

This test for the appropriateness of the linearity assumption also allows for a first check as to whether the univariate dependence between the considered explanatory variables and the default probability is as expected. For the Austrian data the univariate relationships between the investigated indicators and the default event coincide with the hypotheses postulated in Table 2.2, i.e. all ratios behave in an economically meaningful way.

2.5 Model Building

2.5.1 Pre-selection of Input Ratios

After verifying that the underlying assumptions of a logistic regression are valid, the model building process can be started. However, although typically a huge number of potential input ratios is available when developing a rating model, from a statistical point of view it is not advisable to enter all of these variables into the logit regression. If, for example, some highly correlated indicators are included in the model, the estimated coefficients will be significantly and systematically biased. Hence, it is preferable to pre-select the most promising explanatory variables by means of the univariate power of, and the correlation between, the individual input ratios.

To do so, provided the data set at hand is large enough, the available data should be divided into a development and a validation sample by randomly splitting the whole data into two sub-samples. The first one, which typically contains the bulk of all observations, is used to estimate rating models, while the remaining data is left for an out-of-sample evaluation. When splitting the data, it should be ensured that all observations of one firm belong exclusively to one of the two sub-samples and that the ratio of defaulting to non-defaulting firms is similar in both data sets. For the Austrian data, about 70% of all observations were chosen for the training sample, as depicted in Table 2.3.
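One way to implement such a firm-level split is sketched below; the column names firm_id and default, as well as the 70/30 proportion, are assumptions of the example.

```python
import numpy as np
import pandas as pd

def split_by_firm(df: pd.DataFrame, firm_col: str = "firm_id",
                  default_col: str = "default", train_frac: float = 0.7,
                  seed: int = 42):
    """Randomly assign whole firms to the development or validation sample,
    keeping the share of defaulting firms roughly equal in both sub-samples."""
    rng = np.random.default_rng(seed)
    # A firm counts as a defaulter if it defaults in any observed year
    firm_status = df.groupby(firm_col)[default_col].max()
    train_firms = []
    for _, firms in firm_status.groupby(firm_status):
        ids = firms.index.to_numpy()
        rng.shuffle(ids)
        train_firms.extend(ids[: int(round(train_frac * len(ids)))])
    in_train = df[firm_col].isin(train_firms)
    return df[in_train], df[~in_train]
```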

Table 2.3 Division of the data into in- and out-of-sample subsets

The concrete pre-selection process looks as follows: at first, univariate logit models are estimated in-sample for all potential input ratios, and their power to identify defaults in the development sample is evaluated via the accuracy ratio (AR), a concept discussed in detail in Chap. 13. Afterwards, the pairwise correlation between all explanatory variables is computed to identify sub-groups of highly correlated indicators, where, as a rule of thumb, ratios with absolute correlation values above 50% are pooled into one group. Finally, from each correlation sub-group (which usually contains only ratios from one specific credit risk category), the explanatory variable with the highest and hence best accuracy ratio in the univariate analysis is selected for the multivariate model building process.
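The two building blocks of this pre-selection, univariate accuracy ratios and correlation grouping, might be sketched as follows; the example uses the standard relation AR = 2 · AUC − 1 (see Chap. 13) and a simple greedy rule for forming the correlation sub-groups.

```python
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

def univariate_accuracy_ratio(x, y):
    """Fit a univariate logit model and return its in-sample accuracy ratio,
    computed from the area under the ROC curve as AR = 2 * AUC - 1."""
    X = sm.add_constant(x)
    model = sm.Logit(y, X).fit(disp=0)
    return 2.0 * roc_auc_score(y, model.predict(X)) - 1.0

def correlation_groups(ratios_df, threshold=0.5):
    """Pool ratios whose pairwise absolute correlation exceeds the threshold."""
    corr = ratios_df.corr().abs()
    groups, assigned = [], set()
    for col in corr.columns:
        if col in assigned:
            continue
        members = [c for c in corr.columns
                   if c not in assigned and corr.loc[col, c] > threshold]
        groups.append(members)
        assigned.update(members)
    return groups  # keep only the best-AR ratio from each group
```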

Table 2.4 displays the accuracy ratios of and the correlation between the financial ratios calculated for the Austrian data set. As can be seen, explanatory variable 1 is highly correlated with indicator 2 (both measuring leverage) and ratio 10 with variable 11 (both reflecting profitability). Moreover, input ratios 2 and 11 have higher and hence better accuracy ratios than indicators 1 and 10, respectively; the latter are therefore dropped from the list of explanatory variables for the multivariate analysis.

Table 2.4 Pairwise correlation of all potential input ratios

2.5.2 Derivation of the Final Default Prediction Model

Those ratios pre-selected in the previous step are now used to derive the final multivariate logit model. Usually, however, the number of potential explanatory variables is still too high to specify a logit model that contains all of them, because the optimal model should contain only a few, highly significant input ratios to avoid overfitting. Thus, even in our small example with only 12 indicators being left, we would have to construct and compare 2¹² = 4,096 models in order to determine the “best” econometric model and to entirely resolve model uncertainty. This is, of course, a tough task, which becomes infeasible for typical short lists of about 30 to 60 pre-selected input ratios. Therefore, the standard procedure is to use forward/backward selection to identify the final model (see Hosmer and Lemeshow 2000).

For the Austrian data set, backward elimination, one of these statistical stepwise variable selection procedures that is implemented in most statistical software packages, was applied to derive the final logit model. This method starts by estimating the full model (with all potential input ratios) and continues by eliminating the worst covariates one by one until all remaining explanatory variables are significant at the chosen critical level, usually set at 90% or 95%.
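A bare-bones version of backward elimination on p-values might look as follows; most statistical packages offer this directly, so the sketch only serves to make the stopping rule explicit (alpha = 0.10 corresponds to the 90% level mentioned above).

```python
import statsmodels.api as sm

def backward_elimination(X, y, alpha=0.10):
    """Start from the full logit model and repeatedly drop the least significant
    covariate until all remaining variables are significant at the chosen level."""
    cols = list(X.columns)
    while cols:
        model = sm.Logit(y, sm.add_constant(X[cols])).fit(disp=0)
        pvalues = model.pvalues.drop("const")
        worst = pvalues.idxmax()
        if pvalues[worst] <= alpha:
            return model, cols
        cols.remove(worst)   # eliminate the worst covariate and re-estimate
    return None, []
```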

Table 2.5 describes two logit models derived by backward elimination for the Austrian data. It depicts the constants of the logit models and the estimated coefficients for all those financial ratios that enter into the respective model. The stars represent the significance level of the estimated coefficients and indicate that the true parameters are different from zero with a probability of 90% (*), 95% (**) or 99% (***).

Table 2.5 Estimates of multivariate logit models

Model 1 arises if all 12 pre-selected variables are entered into the backward elimination process. Detailed analysis of this model shows that most signs of the estimated coefficients correspond to the postulated hypotheses; however, the model specifies a positive relationship between ratio number 9, “Net Sales/Total Assets”, and the default probability, while most empirical studies find that larger firms default less frequently. What’s more, even for our data sample a negative coefficient was estimated in the univariate analysis. For this reason, a closer inspection of input ratio 9 seems appropriate.

Although the variable “Net Sales/Total Assets” does not exhibit a pairwise correlation of more than 50%, it shows absolute correlation levels of about 30% with several other covariates. This indicates that this particular ratio is too highly correlated (on a multivariate basis) with the other explanatory variables and has to be removed from the list of variables entering the backward elimination process.

Model 2 in Table 2.5 depicts the resulting logit model. Here all coefficients are of comparable magnitude to those of model 1, except that the ratio “Accounts Receivable/Net Sales” becomes highly insignificant and is therefore excluded from the model. As a consequence, all estimated coefficients are now economically plausible, and we accept model 2 as our (preliminary) final model version.

2.5.3 Model Validation

Finally, the derived logit model has to be validated. In a first step, some statistical tests should be conducted in order to verify the model’s robustness and goodness of fit in-sample, and in a second step the estimated model should be applied to the validation sample to produce out-of-sample forecasts, whose quality can be evaluated with the concept of the accuracy ratio and other methods depicted in Chap. 13.

The goodness-of-fit of a logit model can be assessed in two ways: first, on the basis of test statistics that use various approaches to measure the distance between the estimated probabilities and the actual defaults, and second, by analysing individual observations that may have a particularly strong impact on the estimated coefficients (for details see Hosmer and Lemeshow 2000).

One very popular goodness-of-fit test is the Hosmer-Lemeshow test, which measures how well a logit model represents the actual probability of default for groups of firms of differently perceived riskiness. Here, the observations are grouped based on percentiles of the estimated default probabilities. For the Austrian data 10% intervals were used, i.e. ten groups were formed. For every group the average estimated default probability is calculated and used to derive the expected number of defaults per group. Next, this number is compared with the number of realised defaults in the respective group. The Hosmer-Lemeshow test statistic then summarises this information over all groups. In our case of ten groups the test statistic for the estimation sample is chi-square distributed with 8 degrees of freedom, and the corresponding p-value for the rating model is 79.91%, which indicates that the model fits quite well.
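The mechanics of the test can be sketched as follows; the grouping into deciles of the estimated default probability and the chi-square approximation with (number of groups − 2) degrees of freedom follow the standard formulation of the Hosmer-Lemeshow test.

```python
import pandas as pd
from scipy.stats import chi2

def hosmer_lemeshow(p_hat, y, n_groups=10):
    """Compare expected and observed defaults across groups formed from
    percentiles of the estimated default probabilities."""
    data = pd.DataFrame({"p": p_hat, "y": y})
    data["group"] = pd.qcut(data["p"], q=n_groups, labels=False, duplicates="drop")
    stat = 0.0
    for _, g in data.groupby("group"):
        n, expected, observed = len(g), g["p"].sum(), g["y"].sum()
        # contributions from defaults and non-defaults
        stat += (observed - expected) ** 2 / expected
        stat += ((n - observed) - (n - expected)) ** 2 / (n - expected)
    dof = data["group"].nunique() - 2
    return stat, chi2.sf(stat, dof)  # test statistic and p-value
```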

However, the Hosmer-Lemeshow goodness-of-fit test can also be regarded from another point of view for the application at hand. Until now we have only dealt with the development of a model that assigns each corporation a certain default probability or credit score, which yields a ranking of the firms under consideration. In practice, however, banks usually want to use this ranking to map the companies to an internal rating scheme that is typically divided into about ten to twenty rating grades. The easiest way to do so would be to use the percentiles of the predicted default probabilities to build groups. If, for example, ten rating classes were to be formed, the 10% of observations with the smallest default probabilities would be assigned the best rating grade, the next 10% the second grade, and so on, until the last 10% with the highest estimated default probabilities enter the worst rating class. The Hosmer-Lemeshow test now tells us that, if one applied this concept to form rating categories, the average expected default probability per rating grade would overall be consistent with the observed default experience per rating class.
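Mapping predicted default probabilities to such percentile-based rating grades is straightforward with pandas; the simulated probabilities below are purely illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
pd_hat = pd.Series(rng.uniform(0.001, 0.20, size=1000))  # illustrative predicted PDs

# Ten rating grades from the percentiles of the predicted default probabilities:
# grade 1 holds the 10% of firms with the lowest PDs, grade 10 the 10% with the highest
grades = pd.qcut(pd_hat, q=10, labels=list(range(1, 11)))
print(grades.value_counts().sort_index())
```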

What’s more, as depicted in Table 2.6, the in-sample accuracy ratio is about 44%, which is a reasonable number. Rating models for corporate exposures presented in the literature usually have an accuracy ratio between 40% and 70%. As discussed in detail in Chap. 13, ARs can only be compared reliably for models that are applied to the same data set, because differences between data sets, such as varying proportions of defaulters or unequal data reliability, drive this measure heavily; hence, an AR of about 44% seems satisfactory.

Table 2.6 Validation results of the final logit model

Finally, the out-of-sample accuracy ratio amounts to about 41%, which is almost as high as the in-sample AR. This implies that the derived rating model is stable and powerful also in the sense that it produces accurate default predictions for new data that was not used to develop the model. Therefore, we can now accept the derived logit model as our final rating tool.

2.6 Conclusions

This chapter focused on the special difficulties that are encountered when developing internal rating models for corporate exposures. Although the whole process of data collection and processing, model building and validation usually takes considerable time and effort, the job is not completed with the implementation of the derived rating model. The predictive power of all statistical models depends heavily on the assumption that the historical relationship between the model’s covariates and the default event will remain unchanged in the future. Given the wide range of possible events such as changes in firms’ accounting policies or structural disruptions in certain industries, this assumption is not guaranteed over longer periods of time. Hence, it is necessary to revalidate and, if necessary, recalibrate the model regularly in order to ensure that its predictive power does not diminish.