Loan default prediction in microfinance group lending with machine learning
List of Authors
  • Kadek Dwi Pradnyana, Raden Aswin Rahadi

Keywords
  • credit model, default prediction, machine learning, microfinance, group lending

Abstract
  • Microfinance fintech enables unbanked and underbanked communities to access credit by offering small, collateral-free loans. Microfinance institutions (MFIs) usually use credit scoring to filter out risky borrowers. Credit scoring methods for individual loans have been widely studied. However, none address group lending in which the members are women micro-entrepreneurs in a developing country who are jointly responsible for loan repayment. This research builds a credit default prediction model for microfinance group lending using machine learning techniques. We examine six machine learning methods: XGBoost, logistic regression, linear discriminant analysis (LDA), decision trees, k-nearest neighbour (KNN) and random forest. In the first modeling phase, the XGBoost model performs best, with an accuracy of 0.97 and an AUC score of 0.85. Decision tree and random forest give comparable outcomes, with AUCs of 0.81 and 0.80 and accuracies of 0.81, 0.95, and 0.97. Class balancing is then applied in an effort to improve performance. It raises the XGBoost model's AUC from 0.85 to 0.89, while its accuracy remains 0.97. The false positive and false negative rates for this model are both low (2.05% and 1.38%, respectively). The resulting model can therefore differentiate between bad and good loans.
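
The abstract reports accuracy, AUC, and false positive and false negative rates side by side. The sketch below is not the authors' code; it is a minimal pure-Python illustration, on made-up toy labels and scores, of why these metrics are reported together when defaults are rare. The function names (`confusion_counts`, `auc_score`, `summarize`), the 0.5 threshold, and the 95/5 class split are all assumptions for illustration. The abstract does not state how the paper defines its error rates, so the conventional per-class definitions are used here.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 (default) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def auc_score(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random default outscores a random good loan (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Threshold the scores, then report accuracy, AUC, FPR and FNR."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "auc": auc_score(y_true, scores),
        # Share of good loans wrongly flagged as defaults.
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
        # Share of defaults wrongly passed as good loans.
        "fnr": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Toy imbalanced portfolio (assumed numbers): 95 good loans, 5 defaults.
y_true = [0] * 95 + [1] * 5

# A "model" that gives every loan the same low risk score never flags a default.
baseline = summarize(y_true, [0.1] * 100)
print(baseline)
```

On this toy data the constant scorer looks accurate because most loans are good, yet its AUC shows no ability to rank loans and it misses every default. This is the reason a high accuracy alone says little on imbalanced loan data, and why AUC and the two error rates are worth reading alongside it.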
