Suspicious transaction prediction in Kenyan digital payments: a machine learning comparative study with imbalanced data
| dc.contributor.author | Swaleh, I. J. A. | |
| dc.date.accessioned | 2026-04-25T13:42:44Z | |
| dc.date.issued | 2025 | |
| dc.description | Full-text thesis | |
| dc.description.abstract | The surge in digital payments in Kenya has heightened financial crime risks, including money laundering and terrorist financing. Despite regulatory mandates, Suspicious Transaction Reports (STRs) from Payment Service Providers (PSPs) remain below expectations. Traditional rule-based systems often fail to detect such activities, driving interest in machine learning (ML) methods such as Random Forest, k-Nearest Neighbours, and Support Vector Machines. However, comparative research on these models, especially in handling severe class imbalance in Kenyan financial datasets, remains limited. This study therefore evaluated four ML algorithms (Random Forest, Support Vector Machine, k-Nearest Neighbours, and Logistic Regression) for detecting suspicious transactions. To address class imbalance, the SMOTE-ENN re-sampling technique was applied. Factor Analysis for Mixed Data (FAMD) was used for dimensionality reduction, and model performance was assessed using F1-score and Matthews Correlation Coefficient (MCC). Random Forest outperformed the other models post-re-sampling (MCC 99.93%, F1-score 99.94%). Logistic Regression showed the greatest sensitivity to class imbalance, with MCC improving from 62.87% to 97.47%. kNN and SVM also recorded significant gains. Key predictors included Business Age, Score Rank, and Product Type. The findings underscored the importance of using MCC and F1-score over accuracy when evaluating models on imbalanced datasets. They also supported the adoption of hybrid re-sampling techniques, specifically SMOTE-ENN, to enhance model performance, and highlighted Random Forest as a particularly effective algorithm for fraud detection. Future research should explore advanced models such as XGBoost and leverage more diverse datasets to better capture evolving fraud patterns. Keywords: suspicious transaction reporting; digital payments; machine learning; class imbalance; SMOTE-ENN; fraud detection; random forest; F1-score; MCC. | |
| dc.identifier.citation | Swaleh, I. J. A. (2025). Suspicious transaction prediction in Kenyan digital payments: A machine learning comparative study with imbalanced data [Strathmore University]. https://hdl.handle.net/11071/16471 | |
| dc.identifier.uri | https://hdl.handle.net/11071/16471 | |
| dc.language.iso | en | |
| dc.publisher | Strathmore University | |
| dc.title | Suspicious transaction prediction in Kenyan digital payments: a machine learning comparative study with imbalanced data | |
| dc.type | Thesis |
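
The abstract's point that MCC and F1-score are more informative than accuracy on imbalanced data can be illustrated with a minimal sketch. The labels below are hypothetical toy data (990 normal and 10 suspicious transactions), not the thesis dataset, and the two "classifiers" are invented for illustration only. The metric formulas are the standard confusion-matrix definitions.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = suspicious."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)

def f1_score(tp, fp, fn, tn):
    denom = 2 * tp + fp + fn
    # Convention: F1 is 0 when there are no positives predicted or present.
    return 2 * tp / denom if denom else 0.0

def mcc(tp, fp, fn, tn):
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: MCC is 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical toy labels: 990 normal (0) followed by 10 suspicious (1).
y_true = [0] * 990 + [1] * 10

# Hypothetical classifier A: never flags anything.
y_pred_naive = [0] * 1000

# Hypothetical classifier B: 5 false alarms, catches 8 of the 10 suspicious cases.
y_pred_model = [1] * 5 + [0] * 985 + [1] * 8 + [0] * 2

for name, y_pred in [("naive", y_pred_naive), ("model", y_pred_model)]:
    counts = confusion_counts(y_true, y_pred)
    print(f"{name}: accuracy={accuracy(*counts):.4f} "
          f"F1={f1_score(*counts):.4f} MCC={mcc(*counts):.4f}")
```

In this toy setting the classifier that never flags a transaction still reaches 99% accuracy, while its F1-score and MCC are both zero; the two classifiers differ by well under one percentage point in accuracy but sharply in F1-score and MCC.
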
Files
Original bundle
- Name:
- Suspicious transaction prediction in Kenyan digital payments - a machine learning comparative study with imbalanced data.pdf
- Size:
- 10.04 MB
- Format:
- Adobe Portable Document Format
License bundle
- Name:
- license.txt
- Size:
- 1.71 KB
- Format:
- Item-specific license agreed upon to submission