The primary objective of this master's thesis is to develop and evaluate a model that gives a selected commercial bank an early prediction (e.g. about a month in advance) that a particular customer will default. When a transition to default is predicted, the bank can prepare to intervene, look for solutions together with the customer, and adjust the repayment of the customer's liabilities. Instead of focusing on the individual credit and its risk, which is more common in practice, this thesis takes a holistic view of the customer at the bank, since it is the customer who defaults, not an individual credit.
For the purposes of our analysis, we therefore devote considerable attention to data preparation, the most important steps being the rearrangement of the panel data into a pooled cross-sectional form and the choice of modelling time. For each customer, in addition to information on their credit, we use information on the balances of their bank accounts and selected demographic variables that the commercial bank holds and has permitted us to use.
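The rearrangement of panel data into a pooled cross-sectional form can be illustrated with a minimal sketch. It is not the procedure used in the thesis: the field names (`customer_id`, `month`, `balance`, `in_default`), the helper `pool_panel` and the toy records are all hypothetical, and the rule of dropping customers who are already in default is an assumption made for illustration. Each customer-month becomes one row whose target states whether the customer is in default `horizon` months later.

```python
from typing import NamedTuple


class PanelRow(NamedTuple):
    # Hypothetical panel record: one row per customer per month.
    customer_id: str
    month: int          # month index: 0, 1, 2, ...
    balance: float      # account balance observed in that month
    in_default: bool    # default status observed in that month


def pool_panel(panel, horizon=1):
    """Turn panel rows into pooled cross-sectional rows.

    Each output row holds the features observed in month t and a target
    equal to 1 if the same customer is in default in month t + horizon.
    Rows are skipped when the customer is already in default at t
    (assumption) or when month t + horizon is not observed.
    """
    by_key = {(r.customer_id, r.month): r for r in panel}
    pooled = []
    for r in sorted(panel, key=lambda row: (row.customer_id, row.month)):
        future = by_key.get((r.customer_id, r.month + horizon))
        if r.in_default or future is None:
            continue
        pooled.append({
            "customer_id": r.customer_id,
            "month": r.month,
            "balance": r.balance,
            "target": int(future.in_default),
        })
    return pooled


# Toy data, invented for illustration only.
panel = [
    PanelRow("A", 0, 500.0, False),
    PanelRow("A", 1, 120.0, False),
    PanelRow("A", 2, -40.0, True),
    PanelRow("B", 0, 900.0, False),
    PanelRow("B", 1, 950.0, False),
]
pooled = pool_panel(panel, horizon=1)
```

In this sketch the same customer can contribute several rows, which is what pooling means here; changing `horizon` to 2 or 3 corresponds to predicting further ahead.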
On the basis of the data thus prepared, we developed two classification models, one based on logistic regression and one on a neural network. Each treats the customer as a unit and predicts, for a selected time interval in the future (e.g. one, two or three months), the probability that the customer will default. When the results of the two models are compared, the neural network model performs better but is much less reliable than the logistic regression. The lower reliability arises because the neural network model recognises very different patterns in the data when it is trained repeatedly on the same data, which makes it difficult to trust its results.
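One simple way to quantify this kind of run-to-run disagreement is to train the same model several times and measure how much the predicted default probability of each customer varies across runs. The sketch below is only an illustration under that assumption, not the reliability measure used in the thesis; the function `run_to_run_spread` and the probabilities are hypothetical.

```python
from statistics import mean, pstdev


def run_to_run_spread(runs):
    """Per-customer standard deviation of predictions across training runs.

    runs[k][i] is the predicted default probability for customer i
    produced by the k-th training run on the same data.
    """
    per_customer = zip(*runs)  # regroup predictions by customer
    return [pstdev(predictions) for predictions in per_customer]


# Invented predictions for two customers over three training runs.
stable_runs = [[0.10, 0.80], [0.11, 0.79], [0.09, 0.81]]
unstable_runs = [[0.10, 0.80], [0.60, 0.30], [0.35, 0.95]]

stable = mean(run_to_run_spread(stable_runs))
unstable = mean(run_to_run_spread(unstable_runs))
```

A model whose average spread is small gives nearly the same answer for a customer regardless of the training run, whereas a large spread indicates that individual predictions depend heavily on the particular run.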