Optimizing credit risk assessment with ensemble sampling and hybrid machine learning models

Publisher

Strathmore University

Abstract

Accurate credit risk modeling is essential for minimizing financial losses, but class imbalance, where defaulters make up only a small fraction of the data, remains a challenge. This study addresses the issue using ensemble sampling and hybrid machine learning models on a Kaggle dataset with 32,582 entries. Five resampling schemes were applied before training: SMOTE + Random Undersampling, ADASYN + Random Undersampling, Borderline-SMOTE + Random Undersampling, SVM-SMOTE + Random Undersampling, and SMOTE-Tomek. Random Forest with Borderline-SMOTE + Random Undersampling achieved the highest recall, while Random Forest with SMOTE + Random Undersampling achieved the highest AUC. Hybrid machine learning models improved precision, but at the cost of recall. The findings support the use of ensemble sampling and hybrid approaches in credit risk modeling; future research can focus on dynamic thresholding and advanced ensemble strategies to refine predictions.

Keywords: credit risk modeling, class imbalance, ensemble sampling, hybrid machine learning, Random Forest
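
To illustrate the general idea behind the first resampling scheme named in the abstract (SMOTE followed by random undersampling), here is a minimal standard-library sketch. It is not the thesis's implementation: the function names, the toy Gaussian data, and the ratios `over_ratio=0.5` and `under_ratio=1.0` are all assumptions chosen for illustration, and the simplified `smote` function only captures the core interpolation step of the technique.

```python
import math
import random


def smote(minority, n_new, k=5, rng=None):
    """Simplified SMOTE: build n_new synthetic points, each interpolated
    between a minority sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def random_undersample(majority, n_keep, rng=None):
    """Keep a random subset of n_keep majority samples (no replacement)."""
    rng = rng or random.Random(0)
    return rng.sample(majority, n_keep)


def smote_then_undersample(X, y, over_ratio=0.5, under_ratio=1.0, k=5, seed=0):
    """Oversample class 1 up to over_ratio * |class 0|, then shrink class 0
    until minority / majority is roughly under_ratio (both ratios assumed)."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == 1]
    majority = [x for x, label in zip(X, y) if label == 0]

    target_minority = max(len(minority), int(over_ratio * len(majority)))
    minority = minority + smote(minority, target_minority - len(minority), k, rng)

    target_majority = min(len(majority), int(len(minority) / under_ratio))
    majority = random_undersample(majority, target_majority, rng)

    X_res = majority + minority
    y_res = [0] * len(majority) + [1] * len(minority)
    return X_res, y_res


# Toy imbalanced data (hypothetical): 900 non-defaulters, 100 defaulters.
data_rng = random.Random(42)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(900)]
X += [(data_rng.gauss(2, 1), data_rng.gauss(2, 1)) for _ in range(100)]
y = [0] * 900 + [1] * 100

X_res, y_res = smote_then_undersample(X, y)
print("before:", y.count(0), y.count(1))
print("after: ", y_res.count(0), y_res.count(1))
```

In practice, resampling of this kind is normally applied to the training split only, so that evaluation metrics such as recall and AUC are measured on data with the original class distribution.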

Description

Full-text thesis

Citation

Mucheru, N. (2025). Optimizing credit risk assessment with ensemble sampling and hybrid machine learning models [Strathmore University]. https://hdl.handle.net/11071/16461
