MSc. DSA Theses and Dissertations (2024)

Recent Submissions

  • Item
    Developing an early warning system for Banana Xanthomonas Wilt (BXW) in Rwanda
    (Strathmore University, 2024) Owuor, C. A.
    Bananas are crucial to the agricultural economy of the African Great Lakes region, which includes Kenya, Uganda, Tanzania, Burundi, Rwanda, and parts of the Democratic Republic of Congo, with annual production exceeding 22 million tonnes. However, banana productivity faces significant threats from pests and diseases such as Banana Xanthomonas Wilt (BXW), caused by the bacterium Xanthomonas campestris pv. musacearum. In this study, machine learning techniques were employed to develop an early warning system for BXW. Various classification models, including Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Random Forest (RF), and Gradient Boosting Machine (GBM), were trained and evaluated for predicting BXW occurrence. RF outperformed the other models with an accuracy of 94%, followed by GBM (89%), KNN (87%), and SVM (83%). RF also achieved the highest area under the curve (AUC) at 96%, followed by GBM (95%), KNN (94%), and SVM (90%). These results highlight RF’s effectiveness in creating habitat suitability maps and establishing an early warning system for BXW. The RF model was used to develop a BXW habitat suitability map for Rwanda, helping agricultural stakeholders identify high-risk areas. Furthermore, a Short Message Service (SMS)-based early warning system was implemented to provide timely alerts to farmers, thereby enhancing BXW mitigation efforts. Additionally, a web portal for real-time BXW risk prediction and analysis was developed, giving stakeholders accessible information for proactive management strategies. Keywords: BXW, Early Warning System, Rwanda, Remote Sensing, Machine Learning.
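The abstract above compares models by accuracy and AUC. As a rough illustration of how these two metrics differ, the sketch below computes both in plain Python. The labels and scores are made-up toy values, not data from the thesis, and the function names are hypothetical; the thesis itself may well have used a library implementation.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predicted labels that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = BXW present, 0 = BXW absent.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]  # made-up model scores

# Accuracy needs hard labels, so an assumed 0.5 threshold is applied first.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"AUC      = {roc_auc(y_true, scores):.4f}")
```

Accuracy depends on the chosen threshold, while AUC summarises how well the scores rank positives above negatives across all thresholds, which is why the two rankings of models need not agree exactly.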
  • Item
    Music recommendation system using natural language processing
    (Strathmore University, 2024) Chege, C. N.
    Music recommendation systems have become increasingly popular in recent years, facilitating personalized music discovery for users worldwide. This dissertation explores the application of natural language processing (NLP) and machine learning techniques in developing a music recommendation system. The study involves building a collection of music lyrics databases, analyzing the lyrics using NLP methods (such as TF-IDF and similarity/distance metrics), and integrating these findings into a recommendation model. The cosine similarity model was evaluated and recorded an accuracy of 96%, precision of 95%, recall of 96% and F1-score of 95%. Therefore, incorporating lyrics-based features in music recommendation systems can improve user experience in consuming recommendations of similar and relevant music.
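The abstract above names TF-IDF and cosine similarity as the core of the lyrics-based approach. A minimal sketch of that idea in plain Python follows. The lyrics are invented placeholders, the whitespace tokenizer and the smoothed IDF formula are assumptions, and `recommend` is a hypothetical helper rather than the author's implementation.

```python
import math
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (an assumption; real lyrics need more cleaning)."""
    return text.lower().split()


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF, one common variant; other formulas are equally valid.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append({t: (f / total) * idf[t] for t, f in tf.items()})
    return vectors


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query: int, vectors: List[Dict[str, float]], top_k: int = 2) -> List[int]:
    """Return indices of the top_k songs most similar to the query song."""
    sims = [(cosine_similarity(vectors[query], v), i)
            for i, v in enumerate(vectors) if i != query]
    sims.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in sims[:top_k]]


# Invented placeholder lyrics, for illustration only.
lyrics = [
    "love you love me dance tonight",
    "dance all night love the music",
    "rain falls cold lonely road",
    "lonely road home in the rain",
]
vectors = tfidf_vectors(lyrics)
print(recommend(0, vectors, top_k=1))
```

Here song 0 shares vocabulary only with song 1, so song 1 is its nearest neighbour; songs with no shared terms score zero.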
  • Item
    Use of machine learning (text recognition, natural language processing, and large language models) for hand-written answer sheet evaluation
    (Strathmore University, 2024) Mutugi, B.
    The realm of machine learning, encompassing text recognition, natural language processing and large language models, presents a transformative potential for the education sector, particularly in the evaluation of hand-written tests. This dissertation explored the use of these technologies in hand-written tests, acknowledging their prevalence and addressing the inherent challenges encountered when evaluating them. The significant time required for evaluation often leads to delayed results and delayed academic calendars, while the physical and mental strain on the evaluators, coupled with varying levels of skill and knowledge, can lead to inconsistencies and inaccuracies in scoring. To address these challenges, this research explored the development of a machine learning approach capable of automatically extracting questions and student responses from images of exam papers and answer sheets. The approach then assessed student responses against the corresponding exam questions using pre-trained large language models. This research adopted the CRISP-DM (Cross-Industry Standard Process for Data Mining) framework (business understanding, data understanding, data preparation, modelling, evaluation, and deployment) to streamline the development and comparison of machine learning models. The result was a machine learning model designed to process photos of question papers and answer sheets. It extracted the text of questions and answers, facilitating the interaction between users and the technology. The textual content could then be analyzed by a pre-trained large language model, which performed the assessment and provided feedback. By enhancing the efficiency of assessments and improving the accuracy and objectivity of feedback provided to learners, this approach promised to significantly reduce the time and effort involved in the evaluation process, thereby overcoming the limitations of current practices in hand-written test evaluation.
  • Item
    Leveraging learning analytics to optimize virtual learners’ performance
    (Strathmore University, 2024) Ng'eno, B. C.
    Learning analytics has gained traction globally over the years, with many institutions acknowledging its potential to optimize learning and the environments in which learning occurs. The study is structured around three primary objectives focused on optimizing virtual learners’ academic outcomes using learning analytics approaches. Firstly, it aims to identify key indicators that reliably predict students' performance within academic settings. Secondly, it seeks to examine and compare the effectiveness of different algorithms in accurately forecasting students' performance outcomes. Lastly, the research endeavours to develop and deploy a performance prediction and early alert tool using R-Shiny. In this study, the performance of Logistic Regression, Naive Bayes, K-Nearest Neighbors and Support Vector Machine in predicting learners’ performance was evaluated. Using 21,216 records from students at The Open University, UK, the results indicated that Logistic Regression was the best performing model, with a precision of 90%; the key features were student demographic information and academic history. The findings of this study give educational institutions valuable insights on leveraging learning analytics practices for data-driven interventions to optimize and enhance student performance. In conclusion, this study not only provides a tangible solution for optimizing students’ performance but also contributes to the growing body of knowledge on learning analytics practices that can be incorporated in the education sector. Keywords: Learning analytics, machine learning, student performance, R-Shiny.
  • Item
    Voronoi diagrams and how they shape up offense analytics in women’s football
    (Strathmore University, 2024) Mugwe, A. I.
    Vilar et al. (2013) introduce a method for analyzing collective offensive and defensive behavior, finding that maintaining numerical dominance in key areas of the field is crucial for both defensive stability and offensive opportunity. The offensive tactics considered in this study focus on spotting defensive weaknesses, improving expected goals, and exploiting the opposing team’s defense when attacking. The expected goals metric is valuable to a team in two ways: it shows where to improve the offense by creating opportunities with higher expected goals, and it helps the defense by learning the opposing team's expected goals model and positioning the team so that the opponent shoots from regions of low expected goals. The expected goals metric will employ machine learning techniques such as logistic regression, bagging algorithms, and decision trees, and deep learning techniques such as Multilayer Perceptron models, to help deal with the imbalanced goals variable. The expected goals model cannot be a standalone feature and needs to be combined with other metrics to determine which key factors per team lead to the creation of higher goal-scoring opportunities. For this reason, Voronoi diagrams were used to explore how different team shapes at different moments during the game lead to more goals or chances being created, depending on the space that the team occupies.
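To make the Voronoi idea in the last abstract concrete, here is a rough sketch that approximates each team's share of the pitch by assigning every grid cell to its nearest player, which is a discrete stand-in for an exact Voronoi diagram. The pitch dimensions, player positions, and function names are all assumptions for illustration and are not taken from the dissertation.

```python
import math
from typing import Dict, Tuple

PITCH_LENGTH = 105.0  # metres, assumed pitch size
PITCH_WIDTH = 68.0    # metres, assumed pitch size

Player = Tuple[str, str]       # (team, player name)
Position = Tuple[float, float]  # (x, y) in metres


def nearest_player(point: Position, players: Dict[Player, Position]) -> Player:
    """Return the key of the player whose position is closest to the point."""
    return min(players, key=lambda key: math.dist(point, players[key]))


def space_control(players: Dict[Player, Position], step: float = 1.0) -> Dict[str, float]:
    """Approximate each team's share of pitch area.

    Every grid cell of side `step` is credited to the team of the player
    nearest to the cell centre, i.e. the Voronoi cell the centre falls in.
    """
    nx = int(PITCH_LENGTH / step)
    ny = int(PITCH_WIDTH / step)
    counts: Dict[str, int] = {}
    for i in range(nx):
        for j in range(ny):
            centre = ((i + 0.5) * step, (j + 0.5) * step)
            team, _ = nearest_player(centre, players)
            counts[team] = counts.get(team, 0) + 1
    total = nx * ny
    return {team: c / total for team, c in counts.items()}


# Made-up snapshot of a few player positions at one moment of a match.
snapshot = {
    ("home", "forward"): (70.0, 30.0),
    ("home", "midfielder"): (50.0, 40.0),
    ("home", "defender"): (25.0, 34.0),
    ("away", "forward"): (45.0, 20.0),
    ("away", "defender"): (85.0, 34.0),
}

for team, share in sorted(space_control(snapshot).items()):
    print(f"{team}: {share:.1%} of the pitch")
```

Running this over successive tracking frames would give a time series of space occupied per team, which could then be related to chances created. A finer `step` gives a closer approximation at higher cost, and an exact construction would use a computational geometry library instead.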