CLASSIFICATION OF TRADE SECTOR ENTITIES IN CREDIBILITY ASSESSMENT USING NEURAL NETWORKS

One of the most important tasks in credit risk evaluation is the proper classification of potentially good and bad customers. Reducing the number of loans granted to companies of questionable credibility can significantly influence banks’ performance. An important element of credit risk assessment is the prior identification of factors which affect companies’ standing, since that standing has an impact on the credibility and solvency of entities. The research presented in the paper has two main goals. The first is to identify the most important factors (chosen financial ratios) which determine a company’s performance and consequently influence its credit risk level when it is granted financial resources. The question also arises whether the line of business has any impact on the factors that should be included in the analysis as input. The other aim is to compare the results of chosen neural networks with the credit scoring system used in a bank during the credit risk decision-making process.


Introduction
Running a business is a constant process of decision making. Those decisions are always accompanied by risk, which varies in scale. Ultimately that risk takes on a financial dimension. In a market economy risk is a common phenomenon. However, it affects various branches, and the decisions made within them, to a different extent. The branch most exposed to financial and credit risk is the financial sector (banks, financial institutions etc.).
The history of banking systems indicates that the main reasons for decreasing potential profits or a bank's own capital, and for the occurrence of financial difficulties, were an inefficient credit granting policy, faulty credit procedures, credit norms and regulations, as well as insufficient loan collateral. These difficulties often lead to a loss of credibility and liquidity which, in turn, leads to banks' default and bankruptcy. Banks cannot be too reckless in granting financial resources; on the other hand, they cannot be too restrictive either. The first situation leads to granting loans to investors of low creditworthiness and, in turn, to financial losses. The other one (a very conservative bank) leads to a lower profit due to a smaller number of awarded loans (so-called lost opportunities).
Accordingly, the efficient classification of a bank's customers into an appropriate risk group is a fundamental principle for banks and other financial institutions. As a result of the rapid increase in the number of insolvent companies (debtors) and in the credit risk level (in relative and absolute values) over the past 30 years, interest in methods of identifying factors which influence credit risk, classifying bank customers into defined groups and consequently reducing the level of credit risk has grown significantly.
Banks use different methods to assess customers' creditworthiness (i.e. the probability that the customer will pay back the full amount of credit and all other contractual payments in a pre-determined time). Usually those methods are credit scoring methods combined with financial ratio analysis and discriminant analysis models, the objective of which is to assign the potential debtor to one of two groups: "good" or "bad" customer [Wójcicka, Wójtowicz, 2009]. However, despite being actively developed, some of these methods are still not flexible enough for constantly changing economic conditions. Therefore, a growing interest in solutions like artificial neural networks (ANN) and their applications in credit risk assessment is noticeable, and every improvement in accuracy, even the smallest, is a significant accomplishment [West, 2000]. This interest is due to the fact that neural networks have a built-in capacity to adapt their synaptic weights to changes in the surrounding environment. In particular, a neural network trained to operate in a specific environment can be easily retrained to deal with minor changes in the operating environmental conditions [Haykin, 2011, p. 3].

Methods
The study can be classified as an applied study and the research strategy is descriptive. The neural network (NN) technique, also called artificial neural network (ANN), is used.
The research focuses on investigating and comparing the results of two different neural network structures: the most common Multi-Layer Perceptron (MLP) and the Radial Basis Function neural network (RBF).
Main similarities and differences between the two NN structures are presented in Table 1.
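As a rough illustration of the structural difference between the two architectures, the sketch below contrasts a single hidden unit of each type under the standard textbook definitions. It is not taken from Table 1 or from SANN; the function names, the tanh activation and the Gaussian width parameter are assumptions chosen for illustration only.

```python
import math

def mlp_hidden_unit(x, weights, bias):
    """MLP-style unit: non-linear activation of a weighted sum of the inputs."""
    weighted_sum = sum(w * xi for w, xi in zip(weights, x)) + bias
    return math.tanh(weighted_sum)  # tanh is an assumed activation

def rbf_hidden_unit(x, centre, width):
    """RBF-style unit: Gaussian of the distance between the input and a centre."""
    squared_distance = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
    return math.exp(-squared_distance / (2.0 * width ** 2))

# Hypothetical two-ratio input vector, for illustration only.
ratios = [1.4, 0.6]
print(mlp_hidden_unit(ratios, weights=[0.5, -1.0], bias=0.1))
print(rbf_hidden_unit(ratios, centre=[1.0, 0.5], width=1.0))
```

The MLP unit responds to the position of the input relative to a hyperplane, while the RBF unit responds most strongly when the input lies close to its centre.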
The data were obtained from a bank operating on the Polish market, the Commercial Court in Poznań, Poland (the data are confidential, therefore the names of the companies cannot be revealed) and from NOTORIA SERWIS. The data cover a period of six years (2009-2014). The sample contains financial statements of companies, which include a balance sheet, an income statement, a cash flow statement and a statement of changes in equity.
Source (Table 1): own elaboration based on [Kowalski, 2011; Gaudart et al., 2004; Nigrin, 1993; Statistica Help SANN].
The other purpose of the research is to determine the endogenous factors which affect the level of a company's credit risk. It is important to identify which factors have the biggest impact and which are unnecessary and can therefore be removed from future analysis [Wójcicka, 2012; Wójciak, Wójcicka, 2008; Wójciak, Wójcicka, 2009].
The implemented tool is STATISTICA Neural Networks (SANN). Variables are divided into dependent and independent ones. The independent variables are financial ratios from various groups which banks find the most significant in credit risk analysis and use in their models. The dependent variable was identified as a "good" or "bad" company. A "good" company was one which was (or would be) granted financial resources and, consequently, a "bad" enterprise was one that was denied funding. However, it must be stressed that the fact that a bank classified a company as a good debtor and was willing to grant the financial means is not equivalent to a loan contract finally being concluded.
The data set was divided into three groups in the following manner: a learning group (80% of the data set), a testing group (10% of the data set) and a validation / holdout group (10% of the data set). For building the models, different variants of hidden layers were used.
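The 80/10/10 division described above can be sketched as follows. This is a minimal illustration and not the SANN procedure: the random shuffling, the fixed seed, the function name `split_dataset` and the company identifiers are all assumptions, since the paper does not state how cases were assigned to the groups.

```python
import random

def split_dataset(records, seed=0, learning_share=0.8, testing_share=0.1):
    """Shuffle the records and split them into learning, testing and validation groups."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # assumed: random assignment of cases
    n_learning = int(round(learning_share * len(shuffled)))
    n_testing = int(round(testing_share * len(shuffled)))
    learning = shuffled[:n_learning]
    testing = shuffled[n_learning:n_learning + n_testing]
    validation = shuffled[n_learning + n_testing:]  # the remainder is held out
    return learning, testing, validation

# Hypothetical identifiers; the real company names are confidential.
companies = [f"company_{i}" for i in range(100)]
learning, testing, validation = split_dataset(companies)
print(len(learning), len(testing), len(validation))
```

Keeping the validation group untouched during model building is what allows it to serve later as an independent check of the trained networks.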

Research and findings
The first goal of the research was to identify the optimal set of financial indices from a global set of the 25 most popular financial indices. The whole set of ratios is presented in Table 2. Source (Table 2): own elaboration on the basis of [Bragg, 2010].
The initial set of 25 indices was used as the input data in the NN learning process. In each step, 20 neural networks of the chosen models (MLP and RBF) were estimated to determine whether the analysed company is "good" or "bad" and to classify it into the appropriate group. Then the process of reducing the set of input data began. It was conducted iteratively. The first step was to calculate the correlation between each pair of ratios. Next, the pair with the highest level of correlation was chosen. From that pair, the ratio with the highest average level of correlation with the remaining indices was rejected. The obtained set of 24 indices was then used as input data for the NN learning system. Then another index was rejected and, once again, a new, reduced set of input data was applied to the NN. This process was continued until the set was reduced to 3 indices, at which point the whole process was stopped.
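The reduction procedure described above can be sketched in a few lines. This is a hedged illustration only: it assumes Pearson correlation taken in absolute value, computes the average correlation against the ratios outside the chosen pair, and leaves out the network training done at each step. The names `pearson` and `reduce_ratios` and the toy data are hypothetical, and a ratio with constant values would cause a division by zero.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def reduce_ratios(data, min_size=3):
    """Drop one ratio per step; return the list of successively smaller ratio sets."""
    remaining = list(data)
    history = [list(remaining)]
    while len(remaining) > min_size:
        # Correlation (absolute value, an assumption) for every pair of ratios.
        corr = {(a, b): abs(pearson(data[a], data[b]))
                for i, a in enumerate(remaining) for b in remaining[i + 1:]}
        first, second = max(corr, key=corr.get)  # most strongly correlated pair
        others = [r for r in remaining if r not in (first, second)]

        def mean_corr(ratio):
            return sum(abs(pearson(data[ratio], data[o])) for o in others) / len(others)

        # Reject the member of the pair that is more correlated with the rest.
        rejected = first if mean_corr(first) >= mean_corr(second) else second
        remaining.remove(rejected)
        history.append(list(remaining))  # each set would be fed to the networks
    return history

# Toy values for five hypothetical ratios observed in six companies.
data = {
    "ratio_a": [1.2, 1.5, 0.9, 2.1, 1.7, 1.1],
    "ratio_b": [1.1, 1.6, 1.0, 2.0, 1.8, 1.0],
    "ratio_c": [0.4, 0.6, 0.5, 0.3, 0.7, 0.2],
    "ratio_d": [3.0, 2.5, 3.5, 1.0, 2.2, 3.1],
    "ratio_e": [0.1, 0.9, 0.4, 0.6, 0.2, 0.8],
}
history = reduce_ratios(data)
for ratio_set in history:
    print(len(ratio_set), ratio_set)
```

In the study itself each intermediate set was used to train the networks, so the preferred set is the one giving the best classification results rather than simply the last one left.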
It appears that the best set of indices for the MLP neural network architecture consists of the following 7 indices: Current ratio, Total debt ratio, Financial leverage, Financial surplus rate, Equity profitability index, Equity debt ratio and Sale profitability index. Rating of the results (MLP) is presented in Table 3. As regards the results in the learning group, it is clearly the MLP 7-11-1 network which performs better than RBF, reaching 96.67% accuracy in comparison to 95.00%. However, the level of accuracy among the 5 best RBF networks is quite steady (the gap reaches just 10.00%), while among the 5 best MLP networks this gap is bigger (15.00%), which indicates a certain instability of possible results. Yet the opposite situation occurs in the testing group, where RBF neural networks do a little better than MLP. Just one of the MLP networks exceeded the level of 90.00% in the testing group (7-12-1), while among the RBF networks both 7-13-1 and 7-12-1 exceeded that level (91.67%). Moreover, the lowest level of testing quality was also reached among the MLP networks (78.33%, MLP 7-9-1), which makes the gap between the best and the worst network relatively wide (13.34%), while in the case of RBF it is merely 10.00%. It is worth stressing that almost all of the 5 best neural networks of both types remain above 80.00% in the learning and testing sets.
Testing the neural networks on a separate set of data (the validation group) showed that both types of NN give good results; however, MLP performs slightly better than RBF. Table 5 presents the results obtained in the validation group for both types of tested NN, in comparison to a real bank credit scoring model (further referred to as BCSM). The comparison of MLP and RBF results with a banking credit scoring method on the same set of data is very encouraging. The MLP architecture slightly exceeds the performance of BCSM. BCSM uses 6 indices (the best MLP and RBF architectures use 7 and 7-9 indices respectively) for making credit-granting decisions, and these indices are invariable irrespective of the line of business. This, in the authors' opinion, can have an adverse impact on the final outcome of the decisions made, as the model is very often too static and does not follow the latest trends in rapidly changing economic and market conditions, or does not distinguish some subtle factors. It is also important to stress that BCSM is usually only a part of the whole process of granting or rejecting loans and is often followed by an expert's or analyst's opinion supported by the results of BCSM.
Although the differences between the implemented methods may seem marginal, they can still be observed and may eventually decide not only a bank's performance but, in extreme cases, its survival.
A comparison of the current and previous research (for the construction sector [Wójcicka, 2016b] and the industrial sector [Wójcicka, 2016a]) showed that the line of business influences the optimum set of input data (financial ratios), as they vary depending on the particular branch of business. The ratios used in the analysed sectors are presented in Table 6.
Only two ratios appear in all three analysed sectors; therefore, they can be tentatively considered universal. This means that they can be used regardless of the line of business of the assessed company. However, it is worth stressing that 8 out of the 25 initial input ratios do not appear at all in any of the neural network cases; therefore, they can be omitted in any analysis. Moreover, BCSM shares four (in the case of the construction sector) and five (in the case of the industrial sector) of the 6 ratios it uses with the neural networks. However, its results are not better than those of the best neural networks. This indicates that too narrow a set of input data negatively influences the outcome. It can be assumed that it would be beneficial to supplement this basic set of 6 BCSM ratios with chosen ratios indicated by the neural networks as useful for a specific business sector. The comparison of the research for the construction and industrial sectors with the current research also confirmed the tendency of MLP to perform slightly better than RBF. However, one fact remains disturbing. Depending on the sector, the accuracy achieved by the neural networks varies significantly. Some sectors reach visibly higher results than others and this should be further investigated. This can be related to the fact that the construction sector is quite homogeneous, while the industrial sector covers a wide variety of activities. However, the trade sector also covers a wide range of businesses, yet its results were comparable with those of the construction sector. Therefore, future research should consider whether it is justified to analyse sectors as a whole or whether it would be more beneficial to single out separate branches.

Conclusions
Credit risk estimation and the correct classification of customers is a valid, up-to-date and significant issue. Therefore, methods are constantly being developed to improve the process of decision-making and new models are being created. The methods cover a wide range of approaches. Their utility is checked daily in banking practice. The objective of the methods used is increased accuracy, which means that more creditworthy applicants are granted a loan, thereby increasing the bank's profits. Consequently, those applicants who are not creditworthy are denied the loan, thus avoiding unnecessary losses.
The paper analysed two types of neural networks: Multi-Layer Perceptron (MLP) and Radial Basis Function (RBF). The choice of those types of neural network architecture was dictated by their popularity.
In the research, they both proved to be highly useful in the credit risk decision-making process. The obtained results show that, irrespective of the model and data set, the accuracy is not less than 80% (among the best ones).
The research on reducing the input data (the set of financial indices) showed that there is no need to excessively increase the number of indices, as the best results were obtained for subsets of approximately 7-9 indices. It also showed that it is justified to use a smaller subset of universal ratios, regardless of the line of business (2 ratios proved to be universal for the trade, construction and industrial sectors). However, they should be combined with other subsets of ratios specific to the particular sector.
In the author's opinion, it would also be essential to implement other methods of including and excluding the variables, preferably independently for each method and branch of the economy.
Moreover, one of the further directions of the currently ongoing research may lead to broadening the set of exogenous factors which, in the author's opinion, significantly influence credit risk.
The alternative direction of research, with respect to this analysis, is based on a comparative analysis of neural networks and other approaches to categorising clients (popular credit scoring methods, Z-score models and other classification methods such as classification trees, regression etc.), as well as of different types of neural networks.

TABLE 3. Best five neural networks of MLP architecture
Financial expenses and Current assets turnover ratio. Rating of the results (MLP) is presented in Table 4.