MSc. DSA Theses and Dissertations (2024)
Recent Submissions
- Interrupted time series and machine learning with application to effect of Influenza Vaccine (Strathmore University, 2024) Juma, C. O.
Interrupted time series analysis is increasingly employed to assess the effects of large-scale health interventions. Autocorrelation and seasonality need to be captured, but they are not well handled by simple implementations of the time series model such as segmented regression, which is commonly used. An Autoregressive Integrated Moving Average (ARIMA) model presents an alternative approach that addresses these issues. In this study, the fundamental principles of ARIMA and LSTM models are expounded upon, along with their application in evaluating interventions at a population level, such as determining the effect of influenza vaccine administration. Considerations such as determining the impact shape, the model selection process, transfer functions, loss functions, selection of batch sizes and epochs for training the neural networks, evaluation metrics, and interpreting results are discussed. Additionally, detailed R and Python code is provided for result replication. The application of ARIMA and LSTM predictive modeling is demonstrated through an analysis of an influenza vaccination intervention to reduce the number of medically attended respiratory illnesses among children under five years. Specifically, the analysis concerns an influenza vaccination demonstration project covering November 2019 to November 2021. In conclusion, ARIMA modeling and LSTM serve as valuable tools for assessing the effects of large-scale interventions when traditional methods are not applicable, given their ability to account for underlying patterns, autocorrelation, and seasonality, and their flexibility in modeling various impacts. Comparing the MAE and RMSE error results, LSTM outperformed the ARIMA model.
Key terms: Interrupted time series analysis, Autoregressive integrated moving average models, LSTM, Intervention analysis
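The abstract above mentions choosing an impact shape and comparing models by MAE and RMSE. The following is a minimal sketch of those two ideas only, not the thesis's code: the function names and all numbers are hypothetical, chosen for illustration. A step regressor encodes an abrupt level shift at the intervention, and a ramp regressor encodes a gradual change in slope.

```python
import math


def step_and_ramp(n_periods, start):
    """Build two common intervention regressors for a series of n_periods.

    step: 0 before `start`, 1 from `start` onward (abrupt level shift).
    ramp: 0 before `start`, then 1, 2, 3, ... (gradual slope change).
    """
    step = [1 if t >= start else 0 for t in range(n_periods)]
    ramp = [t - start + 1 if t >= start else 0 for t in range(n_periods)]
    return step, ramp


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical toy values, not data from the thesis.
step, ramp = step_and_ramp(n_periods=6, start=3)
actual = [10.0, 12.0, 11.0, 8.0]
predicted = [11.0, 12.0, 9.0, 8.0]

print("step:", step)
print("ramp:", ramp)
print("MAE:", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
```

RMSE penalizes large errors more heavily than MAE, so reporting both, as the abstract does, gives a fuller picture of forecast quality.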
- Predicting risky taxpayers in Kenya using machine learning (Strathmore University, 2024) Cheboi, C. J.
Taxation is a fundamental tool for governments to raise revenue and fulfill their responsibilities to society. It is an essential component of modern governance, facilitating economic development, social welfare, and the provision of goods and services for the benefit of the public. Conversely, tax evasion poses a pervasive challenge impacting both advanced and emerging economies globally. In Kenya, addressing tax evasion is a significant hurdle, with the government estimating substantial annual revenue losses, leaving it to seek debt financing for its programs. This study used machine learning models to classify taxpayers according to certain attributes and predict those who are most likely to evade. The study explored 24 such attributes. The target output variable was the payment time. Six supervised machine learning algorithms were trained on the dataset: Decision Tree, Logistic Regression, Random Forest, XGBoost, Support Vector Machines, and Stacking. Among the trained models, the Random Forest classifier performed best, with a precision score of 90% and a recall score of 86%. This suggests that the model can effectively predict the risky taxpayers to be subjected to a tax audit with a likelihood of high returns. The study identified the top five features influencing tax evasion prediction as installment tax paid, total liabilities, credit brought forward, withholding value added tax credit, and total expenses. Accordingly, adjusting these parameters within specified ranges is anticipated to increase the accuracy of predicting taxpayer classes. These results offer valuable insights for understanding the determinants of tax compliance and for enhancing the accuracy of predicting risky taxpayers, towards optimizing resource allocation for better tax revenue mobilization outcomes.
Keywords: Taxation, Tax evasion, Risky taxpayers, Tax Audit, Machine Learning
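The precision and recall figures quoted in the abstract above follow from counts of true positives, false positives, and false negatives. A minimal sketch of that calculation is given below; the function name and the toy labels are hypothetical and are not taken from the thesis dataset.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall for the `positive` class.

    precision = TP / (TP + FP): of those flagged, how many were truly positive.
    recall    = TP / (TP + FN): of the truly positive, how many were flagged.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels: 1 = risky taxpayer, 0 = not risky.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print("precision:", precision)
print("recall:", recall)
```

In an audit-selection setting, high precision means few audits are spent on compliant taxpayers, while high recall means few risky taxpayers are missed.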
- Correlated stock identification in pairs trading using extreme gradient boosting algorithm (Strathmore University, 2024) Muhia, C. N.
Pairs trading is a well-known market-neutral trading strategy that aims to exploit market inefficiencies by identifying and trading pairs of highly correlated stocks. This research addresses the problem of accurately identifying correlated stock pairs for pairs trading strategies, recognizing the potential for reducing risk and generating profits in financial markets. While traditional statistical and deep learning methods have provided valuable insights, there is a notable research gap in assessing the effectiveness of advanced machine learning algorithms such as Extreme Gradient Boosting (XGBoost) in this context. To bridge this gap, the study compares the performance of the XGBoost algorithm with conventional techniques through quantitative analysis. Using historical stock price data and machine learning methodologies, the research examines stock pairing accuracy and profitability. The findings reveal that the tuned XGBoost model demonstrates superior accuracy, precision, and recall in identifying profitable stock pairs, outperforming traditional statistical methods and other machine learning algorithms. Specifically, the XGBoost model achieved an accuracy of 95.50% and a precision of 95.34% in identifying profitable stock pairs. These results underscore the potential of XGBoost to enhance pairs trading strategies and optimize trading decisions in dynamic financial environments. However, the XGBoost model is not without limitations: susceptibility to overfitting and reliance on the quality and quantity of input features present challenges that need to be addressed. Nonetheless, the study provides valuable insights for investors and traders, suggesting avenues for optimizing trading strategies and maximizing profitability. Recommendations include further exploration of XGBoost's capabilities in diverse market conditions and the integration of additional data sources to enhance predictive accuracy. Moreover, the research highlights the need for continued investigation into other advanced machine learning algorithms and ensemble techniques to further improve stock pairing accuracy. Ultimately, this study contributes to advancing pairs trading strategies by providing empirical evidence of XGBoost's effectiveness, while also identifying avenues for future research and development in the field.
Key Words: Pairs Trading, Correlated Stocks, Autoencoders, Self-Organizing Maps, Random Forest, Support Vector Machine, Trading Strategy, Sharpe Ratio, Maximum Drawdown, Cointegration, Backtesting, Machine Learning, XGBoost
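Pairs trading, as described above, starts from stocks whose prices move together. The sketch below shows one simple way to quantify that: the Pearson correlation of simple returns. It illustrates a basic screening measure only and is not the thesis's XGBoost pipeline; the function names and price series are hypothetical.

```python
import math


def simple_returns(prices):
    """Period-over-period simple returns: (p[t] - p[t-1]) / p[t-1]."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical prices: stock_b is exactly half of stock_a at every step,
# so the two return series are identical.
stock_a = [100.0, 102.0, 101.0, 105.0, 107.0]
stock_b = [50.0, 51.0, 50.5, 52.5, 53.5]

correlation = pearson(simple_returns(stock_a), simple_returns(stock_b))
print("return correlation:", round(correlation, 6))
```

Correlation of returns is only a first filter. The keyword list above also names cointegration, which tests whether the price spread itself is stable over time.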
- Predicting financial inclusion and access to credit in Kenya (Strathmore University, 2024) Tanui, C.
Financial inclusion, particularly access to credit, is a crucial aspect of economic development in Kenya. This study investigates the determinants of financial inclusion and access to credit in Kenya, employing logistic regression modeling to predict financial inclusion patterns and to construct a forecast model that can support policymakers and financial organizations in boosting financial inclusion. The study analyzed several factors, including demographics, technology adoption, financial services usage, and barriers, to assess their impact on financial inclusion and access to credit. The results revealed that the use of mobile phones and the internet, as technological indicators of financial inclusion, were the most effective predictors. Contrary to previous studies, gender was not found to significantly affect financial inclusion in this context. The machine learning model developed achieved an overall prediction accuracy of 90.9%. An interactive user dashboard was also developed using flexdashboard in R and hosted on the web, with visualizations and regression models to provide insights into the key factors driving financial access in Kenya. The results showed that demographics, technology adoption, financial services usage, and barriers to financial inclusion were the most significant factors that impacted financial inclusion; however, there were no significant correlations between these factors and financial inclusion as a whole. This study offers insights into the causes of financial exclusion in the country and how to overcome them.
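The logistic regression model mentioned above maps a weighted sum of predictors to a probability through the logistic (sigmoid) function. A minimal sketch follows; the function name, the two binary features, and every coefficient value are hypothetical illustrations and are not estimates from the thesis.

```python
import math


def predict_probability(features, coefficients, intercept):
    """Logistic regression prediction: sigmoid(intercept + sum(coef * x))."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for two binary predictors:
# [uses_mobile_phone, uses_internet]. Values are made up for illustration.
coefficients = [1.2, 0.8]
intercept = -1.0

p_both = predict_probability([1, 1], coefficients, intercept)
p_none = predict_probability([0, 0], coefficients, intercept)

print("P(included | phone and internet):", round(p_both, 3))
print("P(included | neither):", round(p_none, 3))
```

A positive coefficient raises the predicted probability when the corresponding feature is present, which is how a fitted model would express the kind of technology effect the abstract reports.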
- Developing an early warning system for Banana Xanthomonas Wilt (BXW) in Rwanda (Strathmore University, 2024) Owuor, C. A.
Bananas are crucial for the agricultural economy of the African Great Lakes region, including countries like Kenya, Uganda, Tanzania, Burundi, Rwanda, and parts of the Democratic Republic of Congo, with an annual production exceeding 22 million tonnes. However, banana productivity faces significant threats from pests and diseases such as Banana Xanthomonas Wilt (BXW), caused by the bacterium Xanthomonas campestris pv. musacearum. In this study, machine learning techniques were employed to develop an early warning system for BXW. Various classification models, including Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Random Forest (RF), and Gradient Boosting Machine (GBM), were trained and evaluated for predicting BXW occurrence. RF outperformed the other models with an accuracy of 94%, followed by GBM (89%), KNN (87%), and SVM (83%). In terms of the area under the curve (AUC), RF also ranked first with a score of 96%, followed by GBM (95%), KNN (94%), and SVM (90%). This highlights RF's effectiveness in creating habitat suitability maps and establishing an early warning system for BXW. The RF model was used to develop a BXW habitat suitability map for Rwanda, aiding agricultural stakeholders in identifying high-risk areas. Furthermore, a Short Message Service (SMS)-based early warning system was implemented to provide timely alerts to farmers, thereby enhancing BXW mitigation efforts. Additionally, a web portal for real-time BXW risk prediction and analysis was developed, providing accessible information to stakeholders for proactive management strategies.
Keywords: BXW, Early Warning System, Rwanda, Remote Sensing, Machine Learning.
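The AUC scores reported above can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes AUC directly from that pairwise definition; the function name and the toy scores are hypothetical, and this is not the thesis's evaluation code.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half a win.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = disease present) and model scores.
y_true = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.8]

auc = roc_auc(y_true, scores)
print("AUC:", round(auc, 4))
```

Unlike accuracy, AUC does not depend on a single decision threshold, which is one reason the two metrics can rank models by different margins.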