Credit risk

Can machine learning be used for credit risk modeling?

Lending is a core activity for banks, so it is important for them to predict whether a particular applicant will be able to repay a loan. Banks limit bad debt by granting loans only to strong borrowers: those who are likely to repay and unlikely to become insolvent.

Beyond traditional financial institutions…

Internet finance, which combines the traditional financial industry with Internet technology, has developed rapidly and produced many network-based financial products, including P2P credit platforms, online finance platforms and payment platforms. However, these unregulated activities are likely to create risks for investment institutions and financial companies, such as late payment by borrowers, credit card fraud and carrier fraud.

To manage these risks, we can set up an effective risk control system, for example by creating credit scorecard templates. Such a model learns from historical information about customer behavior, such as income status, credit history, employment and debt status, and then uses machine learning methods to predict a credit score.

The model groups customers into rating levels that indicate their ability to repay loans.
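As a rough illustration of how a predicted default probability could be turned into a score and then a rating level, here is a minimal sketch using the common "points to double the odds" scorecard scaling. The base score, base odds, PDO value and rating bands are all hypothetical values chosen for the example, not figures from a real scorecard.

```python
import math

# Hypothetical scaling constants (assumptions for illustration only).
BASE_SCORE = 600.0   # score assigned when good:bad odds equal BASE_ODDS
BASE_ODDS = 50.0     # good:bad odds at the base score
PDO = 20.0           # points needed to double the odds

# Hypothetical rating bands as (minimum score, label), highest band first.
RATING_BANDS = [(660, "A"), (620, "B"), (580, "C"), (540, "D")]


def probability_to_score(p_default: float) -> float:
    """Convert a predicted default probability into a scorecard-style score."""
    if not 0.0 < p_default < 1.0:
        raise ValueError("p_default must be strictly between 0 and 1")
    factor = PDO / math.log(2)
    offset = BASE_SCORE - factor * math.log(BASE_ODDS)
    odds_good = (1.0 - p_default) / p_default  # odds of repaying vs defaulting
    return offset + factor * math.log(odds_good)


def score_to_rating(score: float) -> str:
    """Map a score to the first band whose minimum it reaches, else 'E'."""
    for minimum, label in RATING_BANDS:
        if score >= minimum:
            return label
    return "E"


def rate_applicant(p_default: float) -> str:
    """Convenience wrapper: probability -> score -> rating level."""
    return score_to_rating(probability_to_score(p_default))


if __name__ == "__main__":
    for p in (0.001, 0.02, 0.1, 0.5):
        print(p, round(probability_to_score(p), 1), rate_applicant(p))
```

With this scaling, every doubling of the odds of repayment adds PDO points, so a lower predicted default probability always gives a higher score and a better rating level.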

To improve the accuracy of predictions, data mining and feature selection are key. The data set is so large that it is difficult to select the features that matter most for ranking applicants and to decide how much weight to assign to each one. Another problem in forecasting credit risk is an imbalanced data set: in the data we collected, most people repay the loan. We therefore need to work out how to train the models to recognize the negative label, that is, the minority of borrowers who do not repay.
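One common way to handle the imbalance is to weight each class inversely to its frequency, so that mistakes on the rare defaulters cost more during training. The sketch below is an illustration under assumptions, not the method used here: the toy labels (1 = default, 0 = repaid) and the helper names are hypothetical, and the weighting formula is the same heuristic scikit-learn uses for `class_weight="balanced"`.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def weighted_log_loss(labels, probabilities, weights):
    """Weighted average log loss; each sample is scaled by its class weight."""
    total = 0.0
    weight_sum = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # clip to avoid log(0)
        w = weights[y]
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        weight_sum += w
    return total / weight_sum


# Toy data (hypothetical): 95 repaid loans and 5 defaults.
labels = [0] * 95 + [1] * 5
weights = balanced_class_weights(labels)

# A lazy model that predicts a 1% default probability for everyone.
lazy_predictions = [0.01] * len(labels)
plain_loss = weighted_log_loss(labels, lazy_predictions, {0: 1.0, 1: 1.0})
balanced_loss = weighted_log_loss(labels, lazy_predictions, weights)

if __name__ == "__main__":
    print(weights)
    print(plain_loss, balanced_loss)
```

Under the balanced weights, the model that ignores defaulters is penalized much more heavily than under the unweighted loss, which is the effect we want when the minority class is the one that matters.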

Tree-based models such as LightGBM are well suited to financial data, as they are reasonably interpretable and often more accurate. We compare their results with logistic regression to determine which is the better model. The evaluation metrics we use to select an accurate and efficient model are accuracy and AUC. The AUC score is the area under the ROC curve, which plots the true positive rate against the false positive rate at different classification thresholds.
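To make the two metrics concrete, here is a small from-scratch sketch that builds the ROC points by sweeping the threshold over the predicted scores and integrates them with the trapezoidal rule. The labels and the two sets of model scores are made-up toy values, and the function names are hypothetical; in practice a library routine would normally be used instead.

```python
def roc_curve_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs over all thresholds."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def accuracy(labels, scores, threshold=0.5):
    """Share of samples classified correctly at a fixed threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Toy data (hypothetical): 1 = default, 0 = repaid.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# Model A ranks both defaulters highest, but every score is below 0.5.
scores_a = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.30, 0.35, 0.40]
# Model B gives every applicant the same score.
scores_b = [0.10] * 10

if __name__ == "__main__":
    for name, scores in (("A", scores_a), ("B", scores_b)):
        print(name, accuracy(labels, scores), auc(roc_curve_points(labels, scores)))
```

Both toy models reach the same accuracy at a 0.5 threshold because they label everyone as a repayer, yet only model A separates defaulters from repayers. This is why AUC is a useful companion to accuracy on imbalanced data.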
