Neural Networks in Bankruptcy Prediction

A Structured Approach to Neural Networks in Bankruptcy Prediction
Fernando C. Almeida, FEARP, USP -Brazil

Abstract
This paper explores the difficulties of developing neural networks in management and proposes a structured approach to doing so, specifically for bankruptcy prediction. It identifies the potential strategic interest for a firm in using neural networks as a decision support tool. Neural networks are developed and tested on the French merchandise transportation industry. Through the construction of an experimental plan, neural networks are tested and compared with logistic regression.
Key words: Neural Networks, Decision Support Systems, Methodology, Logistic Regression
competitive advantages to a firm. The Chase Manhattan Bank, for instance, has developed a neural network system that automatically detects fraud in credit card use 30% more accurately than statistical methods [14]. Chase Manhattan's system has interesting strategic outcomes: its superior performance gives the bank privileged information in relation to its competitors. It will probably reduce the bank's costs from unrecovered credit. The system is also a source of differentiation: because irregular use of cards is blocked more quickly, customers receive a more satisfying service.
Neural Networks in the Organisation

In an increasingly complex organisational context, a firm must capture information from outside its boundaries in a prospective way. New information technology tools like neural networks act in this manner. The following points reveal the importance of information [SI: Information permits uncertainty reduction in the decision process, improving its quality and effectiveness;
Though numerous studies have been devoted to bankruptcy prediction¹, the problem of decision support and the development of information systems for bankruptcy risk and credit evaluation is rarely addressed. Many authors have applied neural nets to bankruptcy prediction ([1]; [3]; [5]; [12]; [19]; etc.), but neither the difficulties in using this tool nor structured methods for exploring it have been examined. This paper proposes a structured method for exploring bankruptcy prediction and analyses the French transportation industry.
It adds value to products or services;

The effectiveness of information flows conditions the quality of the liaisons and relationships engaged in by the firm. From our point of view, a neural network is a tool that permits a firm to scan its environment in a privileged way and assure its endurance.
1. Information Systems in Context: Neural Networks and Bankruptcy Prediction
A neural network decision support system for bankruptcy or credit evaluation has many strategic and organisational implications for a firm.
Knowledge Base Systems and Decision Support Systems
The Strategic Information Technology

Bankruptcy risk evaluation is usually supported by statistical tools like the ZETA model [2], developed using discriminant analysis and employed by several financial institutions [16]. The use of new information technologies, scarcely explored in the credit domain, can offer considerable
See DUMONTIER (1990) for a comprehensive review of these studies.
A Decision Support System (DSS) is a man-machine system that, through a dialogue interface, amplifies decision makers' reasoning capabilities in the resolution of complex and ill-structured problems ([4] p.30). Neural networks can enlarge the capabilities of the traditional DSS, which is limited to what-if scenario analysis. Traditional DSS used to suppose that a system's capability to generalise and analyse a greater number of alternatives could improve the effectiveness of the decision process. However, DSS shall evolve through the use of new information technologies from passive systems to active ones having the capability of
0-8186-8070-9/97 $10.00 © 1997 IEEE
influencing and guiding decision process [9]. DSS may evolve through the use of neural networks giving it enlarged capabilities.
The collection of data for bankrupt firms requires a definition of failure. This definition here is purely legalistic: failed firms are those whose failure has been sanctioned by judicial proceedings.
2. Neural Networks
The Backpropagation Learning Method
Networks are constructed in this research using the backpropagation method, following the generalised delta rule of RUMELHART et al. [15]:

$$\Delta w_{ji}(n+1) = \eta\,\delta_j\,o_i + \alpha\,\Delta w_{ji}(n)$$

$\Delta w_{ji}(n+1)$ is the weight adjustment introduced at time $n+1$ on the connection weight between neurones $i$ and $j$. $\eta$ is a constant called the learning rate, which controls the size of the corrections made to the connection weights: the larger the learning rate, the larger the changes introduced in the weights at each iteration. $\alpha$ is a constant called the smoothing factor, which makes the learning process take into account the weight adjustment made at time $n$. $\delta_j$ is the error signal at the output of neurone $j$, and $o_i$ is the output of neurone $i$. The backpropagation model is a feed-forward model frequently used in management and financial applications ([10]; [8]; [5]; [12]; etc.).
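The update rule above can be illustrated with a minimal Python sketch. It is not the implementation used in this study (the networks were built with a commercial package); the function name `weight_adjustment` and the values chosen for the learning rate and the smoothing factor are assumptions made for illustration only.

```python
def weight_adjustment(prev_adjustment, delta_j, o_i, learning_rate=0.5, smoothing=0.9):
    """Generalised delta rule: weight adjustments for step n+1.

    prev_adjustment: adjustments applied at step n, one row per receiving neurone j
    delta_j: error signals of the receiving neurones
    o_i: outputs of the sending neurones
    learning_rate, smoothing: illustrative values, not taken from the paper
    """
    return [
        [learning_rate * d * o + smoothing * prev
         for o, prev in zip(o_i, prev_row)]
        for d, prev_row in zip(delta_j, prev_adjustment)
    ]


# Two consecutive updates on a toy layer (2 receiving, 3 sending neurones)
# with a constant error signal, to show the effect of the smoothing term.
delta_j = [0.5, -0.2]
o_i = [1.0, 0.0, 2.0]
no_history = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

first = weight_adjustment(no_history, delta_j, o_i)
second = weight_adjustment(first, delta_j, o_i)
```

With a constant error signal, the second adjustment is larger than the first because the smoothing term carries over part of the previous correction. This is why a larger smoothing factor speeds up learning along a consistent direction.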
R1 = Net sales/Total assets
R2 = Total debt/Total assets
R3 = Cash flow/Net sales
R4 = Current ratio
R5 = EBIT/Total interest payments
R6 = Total income/Total capital
R7 = Earnings before interest and taxes/Total assets
R8 = Sales/Net plant
R9 = Cash/Total assets
R10 = Inventory/Sales
R11 = Inventory/Receivables
R12 = log(Total assets)
3. Bankruptcy Prediction
Overview of the Process
Most bankruptcy prediction models are built using a paired-sample technique: one part of the sample contains data from failing firms, the other part contains contemporaneous data from non-failing firms. Variables are then selected for their potential relevance in detecting bankruptcy, and a statistical method is used to develop a classification model (i.e., a combination of variables that best discriminates between the two types of firms). Finally, the classification success is evaluated on a holdout sample (i.e., a sample other than the one used to derive the model).
Choices in statistical methods

Identifying the characteristics that distinguish bankrupt from non bankrupt firms is difficult for at least two reasons: i) because of the lack of a comprehensive theory of bankruptcy, some relevant but unknown predictor variables may be omitted; ii) intricate relationships among predictor variables may alter the predictive ability of the classification models. As they can work with noisy and incomplete inputs and produce the correct output by making use of context and generalising from incomplete information, neural nets are expected to perform well in the prediction of failure risk. That is why a neural nets approach is introduced in this study. However, in order to assess the performance of neural nets in bankruptcy prediction, their classification accuracy will be compared with that of LOGIT analysis. The logistic regression approach is preferred here over the more usual multivariate discriminant analysis because it is at least as efficient as a linear classifier, even when all the assumptions of discriminant analysis hold [13]. Since OHLSON [11], LOGIT analysis has frequently been used to estimate the failure risk, conditioned on
4. Quantitative Analysis
Sample Selection and Data Collection
The data sample of this study consists of 2736 French firms belonging to the transport industry, including 114 firms that failed in the period 1955-1990.
financial characteristics (i.e., ratios) of firms. The LOGIT model creates for each firm a score Z that may be used to assess the probability of failure:
$$Z = \alpha + \sum_i \beta_i X_i$$

where $X_i$ is the value of the ith variable (i.e., financial characteristic).

1. Definition of an experimental plan
2. Construction of the neural nets from the experimental plan
3. Graph analysis of neural nets results
4. Identification of a portfolio of networks
Definition of an Experimental Plan

The following elements were explored through the experimental plan:
The score Z is transformed into a value P by the logistic function:

$$P = \frac{1}{1 + e^{-Z}}$$

Since, by construction, P always falls between 0 and 1, it is usually interpreted as the probability of failure.
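As an illustration of how a LOGIT score is turned into a value between 0 and 1, the following sketch computes a linear score from a set of ratios and applies the standard logistic function. The function names, the coefficients and the sign convention (a higher score meaning a higher probability of failure) are assumptions for the example; the fitted coefficients of this study are not reported in the text.

```python
import math


def logit_score(intercept, coefficients, ratios):
    """Linear score Z: intercept plus the weighted sum of the financial ratios."""
    return intercept + sum(b * x for b, x in zip(coefficients, ratios))


def failure_probability(z):
    """Standard logistic transformation of Z into a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients and ratio values, for illustration only.
z = logit_score(1.0, [2.0, -1.0], [0.5, 3.0])
p = failure_probability(z)
```

A score of zero corresponds to a value of 0,5, which is why 0,5 is a natural reference point when choosing a cut-off.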
Incorporating Prior Probabilities and Cost of Misclassification

Prior probabilities of failure and costs of misclassification must be assigned to guarantee a successful application of the predictive model. There are two types of classification error. The first one, called the type I error, consists of identifying a failed firm as non failed. The type II error consists of identifying a non failed firm as failed. The cut-off score is chosen so as to minimise the misclassification of the two groups. With neural nets as well as with logistic regression, this score may vary from 0 to 1, in so far as the probability of failure is equal to 1 for failing firms and to 0 for non failing firms.
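The two error types can be made concrete with a short sketch. It assumes that each firm has a predicted score between 0 and 1 and that a firm is classified as failing when its score is at or above the cut-off; the function name and the toy data are hypothetical.

```python
def classification_errors(scores, failed, cutoff):
    """Return (type I rate, type II rate) for a given cut-off score.

    scores: predicted failure scores in [0, 1]
    failed: True for firms that actually failed, False otherwise
    Type I: a failed firm classified as non failed (score below the cut-off).
    Type II: a non failed firm classified as failed (score at or above the cut-off).
    Both groups are assumed to be non-empty.
    """
    n_failed = sum(failed)
    n_non_failed = len(failed) - n_failed
    type_1 = sum(1 for s, f in zip(scores, failed) if f and s < cutoff)
    type_2 = sum(1 for s, f in zip(scores, failed) if not f and s >= cutoff)
    return type_1 / n_failed, type_2 / n_non_failed


# Toy data: two failed firms followed by two non failed firms.
scores = [0.9, 0.4, 0.2, 0.6]
failed = [True, True, False, False]
errors_at_half = classification_errors(scores, failed, 0.5)
errors_at_high = classification_errors(scores, failed, 0.95)
```

Raising the cut-off lowers the type II error and raises the type I error, which is the trade-off the choice of cut-off has to balance.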
i. The use of historical data
Validation of Results
As a model generally fits the sample from which it was derived, two sub samples were randomly selected from the entire 2414 firms sample. The first one is used to derive the models, the second one is used to test the models' predictive accuracy.
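The random selection of a derivation and a validation sub sample can be sketched as follows. The group sizes used as defaults (45 failing and 135 non failing firms for derivation, the remaining 31 failing firms and 93 non failing firms for validation, repeated 10 times) are those reported later in the paper; the function name, the firm identifiers and the use of seeds are assumptions made for the example.

```python
import random


def split_sample(failing, non_failing, seed,
                 n_failing_train=45, n_non_failing_train=135, n_non_failing_test=93):
    """Randomly split firms into a derivation set and a validation set."""
    rng = random.Random(seed)
    failing = list(failing)
    rng.shuffle(failing)
    # Only part of the non failing population is drawn, to keep a 1:3 proportion.
    drawn = rng.sample(list(non_failing), n_non_failing_train + n_non_failing_test)
    train = failing[:n_failing_train] + drawn[:n_non_failing_train]
    test = failing[n_failing_train:] + drawn[n_non_failing_train:]
    return train, test


# Hypothetical identifiers standing in for the 76 failing and 2338 non failing firms.
failing_firms = [("F", i) for i in range(76)]
non_failing_firms = [("N", i) for i in range(2338)]
sub_samples = [split_sample(failing_firms, non_failing_firms, seed) for seed in range(10)]
```

Fixing one seed per sub sample makes each of the 10 selections reproducible, which matters when the same sub samples must be reused for every network configuration.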
Three ways of introducing historical data were used:
H1: The introduction of the latest ratio value of each variable followed by the preceding year's value: $R_{n-1}$, $R_{n-2}$. Values from one year ($R_{n-1}$) and two years ($R_{n-2}$) before bankruptcy are introduced.
H2: The introduction of the latest ratio value of each variable and the difference between two succeeding years: $R_{n-1}$,
Neural Nets Construction Methodology

This paper proposes the design and execution of an experimental plan to explore the influence of the various parameters (or factors) on the networks' performance. The plan is evaluated through a graph analysis method. The best performing networks are then gathered in a portfolio used to assess the failure risk. Performance is evaluated by considering the percentage of firms correctly classified in each of the two groups of firms (failing and non failing). The proposed methodology includes the following steps:
$\Delta = R_{n-1} - R_{n-2}$. Values one year before bankruptcy ($R_{n-1}$) and the difference between the two values of the same variable one and two years before failure ($\Delta$) are introduced.
H3: The introduction of the latest ratio value of each variable and the ratio between two succeeding values: $R_{n-1}$, $\Delta = R_{n-1}/R_{n-2}$. Values of each variable one year before bankruptcy ($R_{n-1}$) and the ratio between the values of the same variable one and two years before bankruptcy ($\Delta$) are introduced.
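The three ways of introducing historical data amount to three different input vectors built from the same two years of ratios. The sketch below is one possible reading of H1, H2 and H3; the function name is hypothetical, and the mode "I" (latest values only, no historical data) is taken from the network labelling used later in the paper.

```python
def historical_inputs(latest, previous, mode):
    """Build the network input vector from two years of ratio values.

    latest: ratio values one year before failure
    previous: the same ratios two years before failure
    mode: "I" (no historical data), "H1", "H2" or "H3"
    """
    latest = list(latest)
    previous = list(previous)
    if mode == "I":
        return latest
    if mode == "H1":
        # Latest values followed by the preceding year's values.
        return latest + previous
    if mode == "H2":
        # Latest values followed by the year-on-year differences.
        return latest + [a - b for a, b in zip(latest, previous)]
    if mode == "H3":
        # Latest values followed by the year-on-year ratios (previous must be non-zero).
        return latest + [a / b for a, b in zip(latest, previous)]
    raise ValueError(f"unknown mode: {mode}")


# Toy values for two ratios.
latest = [0.5, 2.0]
previous = [0.25, 4.0]
h1 = historical_inputs(latest, previous, "H1")
h2 = historical_inputs(latest, previous, "H2")
h3 = historical_inputs(latest, previous, "H3")
```

H1, H2 and H3 all double the number of inputs compared with "I". H3 is undefined when a ratio was zero two years before failure, which any real data preparation would have to handle.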
ii. Number of Neurones in Hidden Layers
The number of hidden neurones and hidden layers does not have any theoretical limit. The limit is imposed only by the cost, time and computational constraints of creating a network. In this study, the following numbers of hidden neurones are explored: 5, 10, 40 or 80 neurones per hidden layer.
iii. Number of Hidden Layers
representation of non failing firms in the samples. It can be noticed that this proportion is not consistent with the real distribution of failing and non failing firms in the population, but the inclusion of more non failing firms would have increased the learning time and therefore the computational costs. The neural network learning algorithm could have been modified to take the prior probability of failure into account, as suggested by TAM & KIANG [18]. Unfortunately, the package used in this study does not allow any correction of the algorithm.
Number of Iterations
As time necessary to train a network increases with the number of layers, only two hidden layers will be used in this study and the number of neurones is limited to 40 in networks with two hidden layers. Three configurations are tested: i) 5 neurones in the first hidden layer and 5 in the second one, ii) 10 in the first and 10 in the other one, iii) 40 neurones in each hidden layer.
An iteration is a complete reading of the data set during the learning process. The learning process of certain neural nets converged (i.e., the network learned all facts within the fixed precision of 0,1). Some networks, however, did not converge after a certain number of iterations. Based on the mean error² observed, the learning process in these cases was interrupted after 1300 iterations.
Graph Analysis of Neural Nets Results

A graph analysis was carried out to evaluate the performance of the networks constructed. As mentioned earlier, when a cut-off score is used to assign a firm to one of the two groups (failing or non failing), there are two types of classification error: the type I error and the type II error. Through a graph analysis, different cut-off values may be considered and network performance can be observed at different error levels. Varying the cut-off score yields different classification errors for each network (i.e., different percentages of firms correctly classified in each of the two groups). The following cut-off values were used: 0,95; 0,9; 0,7; 0,5; 0,3; 0,1; 0,05; 0,01. As 10 sub samples were randomly selected to create each network configuration 10 times (56 configurations in the experimental plan), the mean ($\mu$) and standard deviation ($\sigma$) of the percentage of firms correctly classified in the failing and non failing groups were calculated for each of the 56 configurations. The trade-off between the percentages of correct classification of failing and non failing firms generates 8 points, one for each cut-off value, which are plotted in a graph. Curves from different networks may then be compared.
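The construction of one trade-off curve can be sketched as follows. For each of the 8 cut-off values, the percentages of correctly classified failing and non failing firms are computed for every repetition of a configuration and then summarised by their mean and standard deviation. The function names and the toy scores are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev

CUTOFFS = [0.95, 0.9, 0.7, 0.5, 0.3, 0.1, 0.05, 0.01]


def correct_rates(scores, failed, cutoff):
    """Percentages of failing and of non failing firms correctly classified."""
    n_failed = sum(failed)
    n_non_failed = len(failed) - n_failed
    hit_failed = sum(1 for s, f in zip(scores, failed) if f and s >= cutoff)
    hit_non_failed = sum(1 for s, f in zip(scores, failed) if not f and s < cutoff)
    return 100.0 * hit_failed / n_failed, 100.0 * hit_non_failed / n_non_failed


def tradeoff_curve(runs, cutoffs=CUTOFFS):
    """One point per cut-off, summarising all repetitions of a configuration.

    runs: list of (scores, failed) pairs, one pair per trained network.
    """
    curve = []
    for cutoff in cutoffs:
        rates = [correct_rates(scores, failed, cutoff) for scores, failed in runs]
        failing = [r[0] for r in rates]
        non_failing = [r[1] for r in rates]
        curve.append({
            "cutoff": cutoff,
            "mean_failing": mean(failing),
            "sd_failing": pstdev(failing),
            "mean_non_failing": mean(non_failing),
            "sd_non_failing": pstdev(non_failing),
        })
    return curve


# Two toy repetitions of the same configuration (the study uses 10).
runs = [
    ([0.9, 0.4, 0.2, 0.6], [True, True, False, False]),
    ([0.8, 0.7, 0.1, 0.2], [True, True, False, False]),
]
curve = tradeoff_curve(runs)
```

Plotting `mean_failing` against `mean_non_failing` for the 8 points gives the curve of one configuration; overlaying the curves of several configurations allows them to be compared visually.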
iv. Predicting Ratios

As the most relevant ratios to predict failure are not fully known, two batteries of ratios are used.
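The four factors described above can be combined into a full factorial plan. The sketch below is an assumption about how the plan is composed, since the table describing it is not legible in this text: four treatments of historical data (none, labelled I, plus H1, H2 and H3), two sets of ratios, and seven architectures (one hidden layer with 5, 10, 40 or 80 neurones; two hidden layers with 5, 10 or 40 neurones each). This combination yields the 56 configurations mentioned in the paper. The labels follow the layers-ratios-data-neurones pattern used with Chart 1.

```python
from itertools import product

HISTORY = ["I", "H1", "H2", "H3"]
RATIO_SETS = [1, 2]
# (number of hidden layers, neurones per hidden layer)
ARCHITECTURES = [(1, 5), (1, 10), (1, 40), (1, 80), (2, 5), (2, 10), (2, 40)]
REPETITIONS = 10


def experimental_plan():
    """Return one label per cell of the plan, e.g. '1-2-H2-40'."""
    return [
        f"{layers}-{ratio_set}-{history}-{neurones}"
        for (layers, neurones), ratio_set, history
        in product(ARCHITECTURES, RATIO_SETS, HISTORY)
    ]


plan = experimental_plan()
total_networks = len(plan) * REPETITIONS
```

Enumerating the cells programmatically is also a convenient way to drive the training runs when the neural network package offers no automation of its own.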
Construction of the Neural Nets from an Experimental Plan

Number of Trials
The greater the number of experiments per cell, the better the experimental plan. However, because of the time necessary to train a network, the number of experiments must be carefully selected. The number of experiments per cell was limited here to 10. Out of the 76 failing and 2338 non failing firms, 45 failing firms and 135 non failing firms were randomly selected to create each network. The remaining 31 failing firms and 93 non failing firms were used to validate the results. This random selection was made 10 times to obtain 10 sub samples. With these 10 sub samples, 10 networks were created for each of the 56 cells in the experimental plan. According to table 2, 560 neural nets were therefore created to evaluate the influence of each parameter previously described on the networks' predictive accuracy. The proportion between failing and non failing firms is 1:3. This proportion was chosen to increase the
The mean error is the error observed on each neurone divided by the number of neurones and multiplied by the total number of examples (or facts) in the training set.
Instead of using only the mean percentage of correct classification, the difference between the mean and the standard deviation is used ($(\mu-\sigma)_{\text{failing}}$ versus $(\mu-\sigma)_{\text{non failing}}$). In this way, not only the predictive capability of the network structure is considered ($\mu$), but also its robustness ($\sigma$). Due to space constraints, only the best performing networks are presented here (Chart 1). L-R-D-N represents the network configuration and the data used to create it. For example, 1-2-I-5 means 1 hidden layer (1), second set of ratios (2), no historical data (I), 5 neurones in the hidden layer(s) (5).
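The labelling scheme and the robustness-adjusted score can be expressed in a few lines. This is a sketch only: the type and function names are hypothetical, and the parser assumes that the third field of a label is one of I, H1, H2 or H3.

```python
from typing import NamedTuple


class NetworkConfig(NamedTuple):
    hidden_layers: int
    ratio_set: int
    history: str
    neurones: int


def parse_label(label):
    """Split an L-R-D-N label such as '1-2-H2-40' into its four fields."""
    layers, ratio_set, history, neurones = label.split("-")
    return NetworkConfig(int(layers), int(ratio_set), history, int(neurones))


def robust_score(mean_correct, sd_correct):
    """Mean percentage of correct classification penalised by its dispersion."""
    return mean_correct - sd_correct


config = parse_label("1-2-H2-40")
score = robust_score(75.0, 25.0)
```

Subtracting the standard deviation penalises configurations whose results vary strongly from one sub sample to another, so two configurations with the same mean are ranked by their stability.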
Chart 1 - Trade-off between percentages of correct classification of failing and non failing firms - Group I - 6 best performing networks (Mean - Standard Dev.)
[Chart: horizontal axis, % failing firms correctly classified; the legible legend entries are 1-1-I-5, 1-1-H2-40, 1-2-I-40 and 1-2-H2-40.]
Table 3 - The 16 best performing networks (the first and second groups); columns: first set of ratios, second set of ratios.
Result of the Graph Analysis

Observing the different curves, it can be noticed that it is not always possible to identify the best performing configuration. Tables 3 and 4 indicate 4 groups of networks obtained from the graph analysis. Configuration performance varies with the cut-off point, so a portfolio of 6 networks was identified as containing the best performing networks among the 56 structures explored.
First group: the 6 best performing networks;
Second group: 10 networks that perform less well than the first group but better than the other networks;
Third group: 11 networks;
Fourth group: 24 networks whose performance is inferior to that of the preceding networks.
Table 4 - Less performing networks (third and fourth groups)
of network with 5 or 10 neurones are among the best performing ones. This study does not permit a consistent conclusion about using only a few neurones. Moreover, other studies have obtained interesting results with only a few neurones ([3]; [18]), though they do not explore networks with more than 10 neurones.
Number of layers
This study does not distinguish any interest in using two layers in bankruptcy prediction. Sometimes one layer network outperforms two layer networks, sometimes the opposite is observed.
Number of ratios
It cannot be concluded from this study that one set of ratios was better than the other.
The Influence of the Type of Structure in Network Performance

It can be noticed from these results that the systematic use of one type of network structure has not always produced the best performing network. In other words, even if some of the best performing networks are obtained using 40 neurones in the hidden layer (1-1-H2-40; 1-2-I-40; 1-2-H2-40), 40 neurones can sometimes produce poorly performing networks (1-1-I-40; 1-2-H1-40; etc.). The graph analysis and tables 3 and 4 suggest the following conclusions:
Use of Historical Data
Comparison of Neural Networks with Logistic Regression

In order to compare network performance with statistical methods, the same data sample was used. Chart 2 compares the classification performance of both techniques. Network 1-2-H2-40, presented in chart 2, is one of the six best performing networks. It can be noticed that the performance of the two techniques was very similar. Both graphs were constructed using the same interval of cut-off scores (from 0,95 to 0,001). However, it can be observed that LOGIT is more sensitive to changes in the cut-off score than the networks (i.e., the network curve has a smaller variance). It may be inferred that the predictive capability of nets is more stable than that of LOGIT, which changes more abruptly when the cut-off score varies. In a real decision context, when the error risks related to the critical score are unknown, predictions made with nets are more reliable than those made by LOGIT, as slight changes in the chosen cut-off score do not result in a significant change in risk evaluation.
It can be observed that H3 networks are almost always the worst performers. H2 has produced better results (8 networks among the 16 best networks), superior to I (Y16) or H1 (3/16). These results have interesting implications. First, they suggest that introducing historical data as ratios gives bad results (H3) and that networks are able by themselves to find the most interesting relations among different years (H2). The results are consistent with the hypothesis of STANLEY [17] concerning the use of historical data. STANLEY suggests that better nets may be obtained when using the difference between two years' values (H2) than the variable values themselves (H1). Finally, better networks were obtained by using H2 than by using I, which suggests that using historical data is worthwhile.
Number of Neurones
Networks with 5 and 10 neurones have generated bad performing networks, though certain configurations
nets conception. Other parameters than those presented in this study should be explored. This study has explored the French transport industry. The predictive performance of neural nets has not significantly surpassed that of statistical methods. It is possible that in other industries, where variables other than those used here are available, a superior performance of nets may be observed.
This study concerns the problem of neural nets development in bankruptcy prediction. Other studies should explore how neural nets can help in understanding failure processes. In fact, neural nets do not behave in the same way as logistic regression, and they could eventually bring new perspectives to the understanding of the bankruptcy process ([5]). In DE ALMEIDA the problem of interpreting failing processes through neural nets is more extensively discussed. For reasons of space, this paper does not develop this point.
Chart 2 - Comparison between Networks and LOGIT
REFERENCES
5. Conclusions
This study brings some discussion about the use of neural networks for bankruptcy prediction. The absence of a theory of bankruptcy analysis brings the additional difficulty of properly choosing a set of predictive variables. Therefore other sets of variables could eventually bring a better predictive quality to the neural network model. This paper has distinguished not one best trained network, but a portfolio of six networks. Precise distinction among performance of different networks may not be the main concern of a decision maker. The identification of some best networks will probably be sufficient to conceive a portfolio of networks to support the decision process in bankruptcy risk evaluation. In this way a graph analysis is an interesting analysis tool. When choosing a neural network developing package, the developer should consider the capabilities and features of the package in helping the automation of the conception and execution processes. BRAINMAKER is not one of these packages and a database developing language was used to construct the experimental plan. This study introduces a structured manner of exploring neural networks in bankruptcy prediction. Despite the complexity of neural network development, neural nets conception in management is not fully discussed in the literature. Studies in neural nets use in management normally do not discuss the exploring of different parameters in
[1] ADYA, M. and COLLOPY, F. Does AI Research Aid Prediction? A Review and Evaluation. Proceedings of the Sixteenth International Conference on Information Systems, Amsterdam, the Netherlands, December 10-13, p. 123-140, 1995.
[2] ALTMAN, E., R. HALDEMAN et P. NARAYANAN (1977). Zeta analysis. Journal of Banking and Finance, June 1977, p. 29-54.
[3] BELL, B.T., G.R. RIBAR, J.R. VERCHIO (1990). Neural nets vs. logistic regression: a comparison of each model's ability to predict commercial bank failures. Actes du congrès international de comptabilité, Tome 1, Nice, December 1990.
[4] COURBON, J-C. (1983). Les [illegible] concepts et outil, mode d'action. AFCET interface, 9 July 1983, p. 30-36.
[5] DE ALMEIDA, F.C. and LESCA, H. Administração Estratégica da Informação. Revista de Administração, v. 29, n., jul-sept, p. 66-75, 1994.
[6] DE ALMEIDA, F.C. L'Evaluation des risques de défaillance des entreprises à partir des réseaux de neurones insérés dans les systèmes d'aide à la décision. Doctoral thesis in Management. Ecole Supérieure des Affaires, Universidade de Grenoble, 1993.
[7] DUMONTIER, P. (1990). Vices et vertus des modèles de prévision de défaillance. Papier de recherche no 90-12, Université de Grenoble II, CERAG, 1990.
[8] DUTTA S., S. SHEKHAR et W.Y. WONG (1992). Decision support in non-conservative domains: generalization with neural networks. WP no 92-31, INSEAD, 1992.
[9] KEEN, P.G.W. et M.S. SCOTT-MORTON (1978). Decision support systems: an organisational perspective. Addison Wesley, 1978.
[10] MAGNIER, J.P. Utilisation de réseaux de neurones pour le développement de systèmes d'aide à la décision. Montpellier, Centre de recherche en gestion des organisations, 1991.
[11] OHLSON, J.A. Financial Ratios and the Probabilistic Prediction of Bankruptcy. Journal of Accounting Research, Spring, p. 109-131, 1980.
[12] PODDIG, T. Bankruptcy Prediction: A Comparison with Discriminant Analysis. In Neural Networks in Capital Markets. Edited by A.P. REFENES, New York, John Wiley & Sons, 1995.
[13] PRESS D.J. & WILSON S. Choosing Between Logistic Regression and Discriminant Analysis. Journal of American Statistical Economics, 1978, p. 335.
[14] ROCHESTER, J.B. (1990). New business for neurocomputing. IS/Analyser, vol. 28, no 2, 1990, p. 116.
[15] RUMELHART, D.E., J.L. McCLELLAND, PDP Research Group. Parallel Distributed Processing - Explorations in the Microstructure of Cognition. Volume 1. London, The MIT Press, 1986.
[16] SCOTT, J. (1981). The probability of bankruptcy: a comparison of empirical predictions and theoretical models. Journal of Banking and Finance, no 5, September 1981, p. 1-26.
[17] STANLEY J. Introduction to Neural Networks. Sierra Madre, CA, California Scientific Software, 3rd edition, 1990.
[18] TAM, K.Y. & KIANG, M.Y. Managerial Applications of Neural Networks: The Case of Bank Failure Predictions. Management Science, vol. 38, p. 926-947, 1992.
[19] WILSON, R.L. & SHARDA, R. Bankruptcy Prediction Using Neural Networks. Decision Support Systems, vol. 11, n. 5, p. 545-557, June 1994.
