Optimizing credit risk assessment with ensemble sampling and hybrid machine learning models
Publisher
Strathmore University
Abstract
Accurate credit risk modeling is essential for minimizing financial losses, but class imbalance, where defaulters make up only a small fraction of the data, remains a challenge. This study tackles the issue using ensemble sampling and hybrid machine learning models on a Kaggle dataset with 32,582 entries. Five resampling schemes were applied before training: SMOTE + Random Undersampling, ADASYN + Random Undersampling, Borderline-SMOTE + Random Undersampling, SVM-SMOTE + Random Undersampling, and SMOTE-Tomek. Our findings show that Random Forest with Borderline-SMOTE + Random Undersampling achieved the highest recall, while Random Forest with SMOTE + Random Undersampling achieved the highest AUC. Hybrid machine learning models improved precision but at the cost of recall. The study reinforces the value of ensemble sampling and hybrid approaches in credit risk modeling; future research should focus on dynamic thresholding and advanced ensemble strategies to refine predictions.
Keywords: Credit risk modeling, Class imbalance, Ensemble sampling, Hybrid machine learning, Random Forest
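The resampling idea named in the abstract, SMOTE followed by random undersampling, can be illustrated with a minimal sketch. This is not the thesis's implementation: it is a simplified, standard-library-only version of the basic SMOTE interpolation step plus random undersampling, run on made-up two-dimensional toy data. The function names, the `over_ratio` and `under_ratio` parameters, and the toy class sizes are all assumptions chosen for illustration; in practice a dedicated library such as imbalanced-learn would normally be used instead.

```python
import math
import random


def smote_oversample(minority, n_new, k, rng):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (basic SMOTE idea)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def smote_then_undersample(X, y, over_ratio=0.5, under_ratio=1.0, k=5, seed=0):
    """Hypothetical helper: oversample class 1 until minority/majority reaches
    over_ratio, then randomly drop class 0 rows until it reaches under_ratio."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == 1]
    majority = [x for x, label in zip(X, y) if label == 0]

    # Step 1: SMOTE-style oversampling of the minority class
    n_new = max(0, int(over_ratio * len(majority)) - len(minority))
    minority = minority + smote_oversample(minority, n_new, k, rng)

    # Step 2: random undersampling of the majority class
    n_keep = min(len(majority), int(len(minority) / under_ratio))
    majority = rng.sample(majority, n_keep)

    return majority + minority, [0] * len(majority) + [1] * len(minority)


# Toy imbalanced data (assumed for illustration): 90 non-defaulters, 10 defaulters
data_rng = random.Random(42)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(90)]
X += [(data_rng.gauss(2, 0.5), data_rng.gauss(2, 0.5)) for _ in range(10)]
y = [0] * 90 + [1] * 10

X_res, y_res = smote_then_undersample(X, y)
print(y_res.count(0), y_res.count(1))
```

With these toy settings the minority class is first raised from 10 to 45 rows (half the majority size) and the majority class is then cut from 90 to 45, giving a balanced training set. The variants listed in the abstract (ADASYN, Borderline-SMOTE, SVM-SMOTE) differ mainly in which minority points are chosen as the basis for synthetic samples, which this sketch does not attempt to reproduce.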
Description
Full-text thesis
Citation
Mucheru, N. (2025). Optimizing credit risk assessment with ensemble sampling and hybrid machine learning models [Strathmore University]. https://hdl.handle.net/11071/16461