MSIT Theses and Dissertations (2022)
Browsing MSIT Theses and Dissertations (2022) by Issue Date
Now showing 1 - 11 of 11
- A Machine learning model for support tickets servicing: a case of Strathmore University ICTS client support services (Strathmore University, 2022). Maina, Antony Koimbi.
  Customer service is a vital part of any business: how satisfied customers are can make or break a company. One of the greatest contributors to customer satisfaction is the ability to respond to their issues efficiently and effectively. Many businesses therefore opt to establish a customer service department that handles customer queries, including receiving phone calls and replying to emails. Customers are expected to call with issues such as "How do I reset my password?", "How do I access the Student Information System?", "Are the students' marks out yet?" and the like. Often, the issues reported by customers are similar and tend to get similar resolutions. These requests can be overwhelming at times; for example, when users are accessing an online resource and the system goes down, the number of inquiries can be in the order of thousands depending on the number of system users, which means a human agent may not be able to service all these requests on time. This research aims to develop an intelligent chatbot model for a support ticketing system using machine learning to deliver an exceptional customer experience. Specifically, it proposes a machine learning model that can be used to service customer tickets in the context of a university or learning institution. The Rapid Application Development methodology was used to produce a working chatbot prototype for testing the model. Machine learning and natural language processing were used to extract a user's intent from a message and, by leveraging pre-trained frequently-asked-question models from the DeepPavlov library, the model was trained on 80% of the data and tested on the remaining 20%. All 37 sessions tested on Dialogflow were successful, translating to a 100% response success rate. The prototype was tested by integrating the WhatsApp messaging platform to send messages to the chatbot. The chatbot was able to respond to the user in a fraction of a second, and the average response time was under one minute during testing.
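The intent-extraction and 80/20 split described above can be illustrated with a small, generic sketch. The snippet below is not the DeepPavlov/Dialogflow pipeline the study used; it is a minimal TF-IDF plus logistic-regression intent classifier over a handful of hypothetical support utterances, assuming scikit-learn is available.

```python
# Minimal intent-classification sketch (illustrative only; the study used
# DeepPavlov FAQ models and Dialogflow, not this pipeline).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical (utterance, intent) pairs standing in for the support-ticket corpus.
utterances = [
    "How do I reset my password?", "I forgot my password",
    "How do I access the Student Information System?", "SIS login link please",
    "Are the marks out yet?", "When will exam results be released?",
] * 10  # repeated so the 80/20 split has enough samples per class
intents = [
    "password_reset", "password_reset",
    "sis_access", "sis_access",
    "exam_results", "exam_results",
] * 10

# 80% of the data for training, 20% held out for testing, as in the study.
X_train, X_test, y_train, y_test = train_test_split(
    utterances, intents, test_size=0.2, stratify=intents, random_state=42)

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
print("predicted intent:", model.predict(["i cannot log into the student portal"])[0])
```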
- A HDF5 data compression model for IoT applications (Strathmore University, 2022). Chabari, Risper Nkatha.
  The Internet of Things has become an integral part of the modern digital ecosystem. According to current reports, more than 13.8 billion devices were connected as of 2021, and this massive adoption will surpass 30.9 billion devices by 2025, which means IoT devices will become even more prevalent and significant in our daily lives. Miniaturization of chipsets and modules has contributed to cost-effective and faster computer components. As a result of these technological advancements and mass adoption, the number of devices connected to the internet has been rising, leading to data generated in high volume, velocity, veracity, and variety. The major challenge is the resulting data deluge, which makes it difficult to visualize, store, and analyse data generated in various formats. Relational databases such as MySQL have commonly been used to store IoT data; however, they can only handle structured data, because data is organized in tables with high consistency. NoSQL databases have also been adopted because they can store large volumes of data without relying on a relational schema or consistency requirements, which makes them suitable mainly for unstructured data. This outlines a clear need for an effective way of storing and managing heterogeneous IoT data in a compressed and self-describing format; furthermore, there is no one-size-fits-all approach to managing heterogeneous data in an IoT architecture. It is in this context that this research addressed the challenge by creating a tool that compresses heterogeneous data while saving it in the HDF5 format. The input data consisted of .csv datasets, which were parsed by the tool's storage and data components for compression and conversion. The tool achieved a good compression ratio, an 89.34% decrease from the original file size. The compressed file was opened in an external viewer, HDFView, to validate that the algorithm used was lossless.
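As a rough illustration of the conversion-and-compression step described above, the sketch below reads a CSV file into a NumPy array, writes it to an HDF5 dataset with lossless gzip compression via h5py, and reports the size decrease; the file names are placeholders, not the study's datasets.

```python
# Sketch: convert a CSV of sensor readings to a compressed HDF5 dataset (lossless gzip).
# "sensor_readings.csv" is a placeholder for the IoT datasets used in the study.
import os
import numpy as np
import h5py

csv_path = "sensor_readings.csv"
data = np.genfromtxt(csv_path, delimiter=",", skip_header=1)  # numeric columns only

with h5py.File("sensor_readings.h5", "w") as f:
    # gzip is a lossless filter, so the data can be recovered exactly (verifiable in HDFView).
    f.create_dataset("readings", data=data, compression="gzip", compression_opts=9)

original = os.path.getsize(csv_path)
compressed = os.path.getsize("sensor_readings.h5")
print(f"size decrease: {100 * (original - compressed) / original:.2f}%")
```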
- A Prototype for detecting procurement fraud using data mining techniques: case of banking industry in Kenya (Strathmore University, 2022). Muriithi, Francis W.
  Fraud is a million-dollar business, and it is increasing every year. The numbers are shocking, all the more because over one third of all frauds are detected by 'chance' means. Given that the procurement process is part of the expenditure cycle that culminates with the payment of cash, it is rife with potential for exposing an organization to fraud and embezzlement. Today, whistle-blowing is the most common fraud detection method; however, this method does not proactively search for misconduct. As a result, fraud detected through this means tends to be caught too late, after the organization has already lost millions of dollars. In this study, we propose a data-driven fraud detection prototype to reduce the duration and cost of procurement fraud in Kenya's banking industry. To achieve this, electronic data from the HR and ERP systems was analysed by the prototype using data mining techniques to identify potential fraudulent misconduct. The data mining techniques applied included rule-based, fuzzy string-matching, and z-score outlier analytics to cross-match the data against procurement fraud red-flag indicators. Thereafter, the prototype generated potential-fraud notifications for the organization's audit, risk, or forensic department for further investigation. The outcome of the investigation done by the audit team was also captured by the prototype to increase the accuracy of fraud detection and reduce future false-positive alerts.
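Two of the red-flag checks mentioned above, fuzzy string matching (e.g., supplier names that nearly match employee names) and z-score outlier analysis on payment amounts, can be sketched as follows. The sample records and the thresholds (0.6 similarity, |z| > 2) are illustrative assumptions, not the prototype's actual rules.

```python
# Sketch of two procurement red-flag checks: fuzzy name matching and z-score outliers.
# Records and thresholds are illustrative assumptions.
from difflib import SequenceMatcher
from statistics import mean, pstdev

employees = ["John Mwangi", "Alice Wanjiru"]
suppliers = ["Jon Mwangi Supplies", "Acme Ltd", "Alyce Wanjiru Traders"]
payments = [12000, 11500, 13000, 12500, 250000, 11800]  # hypothetical invoice amounts

# Red flag 1: supplier names suspiciously similar to employee names.
for emp in employees:
    for sup in suppliers:
        score = SequenceMatcher(None, emp.lower(), sup.lower()).ratio()
        if score > 0.6:
            print(f"possible conflict of interest: {emp!r} ~ {sup!r} (similarity {score:.2f})")

# Red flag 2: payment amounts that are z-score outliers.
mu, sigma = mean(payments), pstdev(payments)
for amount in payments:
    z = (amount - mu) / sigma
    if abs(z) > 2:
        print(f"outlier payment flagged: {amount} (z = {z:.2f})")
```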
- An Intelligent chatbot implementation for employee exit auto-clearance using deep learning (Strathmore University, 2022). Kasera, Lawrence Mwakio.
  As part of employee exit in an organization, the clearance process is a mandatory requirement that guarantees that the employee leaves formally, returns all organization property, and receives the final paycheck. It is commonplace for this process to entail filling in and submitting an exit clearance form. For each area of responsibility in the clearance process, completion is ascertained by signatures or clearance approvals from the requisite personnel. The literature review showed that this process often relies on physical forms, which means printing, filing, and tons of record keeping. On the other hand, some organizations use automated means that are still largely human-reliant, leading to delays, inconsistencies, and redundancies. The aim of this study was to develop an intelligent chatbot implementation for employee exit auto-clearance using deep learning. A chatbot is an Artificial Intelligence (AI)-driven software tool that simplifies the interaction between humans and computers. Among many other advantages, a chatbot reduces the overall cost of mundane tasks, enhances the user experience, and offers greater availability. This research employed a qualitative design to explore the different approaches that make up the clearance process, alongside their challenges, in formal organizations within Nairobi, Kenya. The proposed deep learning chatbot model was developed using two hidden layers and trained for 2,000 epochs. The training data dictionary was categorized into tags, patterns, and responses. The model was able to correctly match 99.91% of the input pattern data points to their corresponding response output data points, and where an input pattern was unclear, the model was able to respond accordingly. The model could successfully make API calls to the web service where digital signatures are appended and finalize the exit clearance process with a complete, signed clearance form.
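A rough sketch of the architecture described above (a feed-forward network with two hidden layers trained on a tags/patterns/responses dictionary) is given below. The layer sizes, the tiny intents dictionary, and the bag-of-words encoding are illustrative assumptions, not the study's exact configuration.

```python
# Rough sketch of a two-hidden-layer intent network over bag-of-words patterns.
# Layer sizes and the tiny intents dictionary are illustrative assumptions.
import numpy as np
import tensorflow as tf

intents = {  # tag -> example patterns (responses omitted for brevity)
    "start_clearance": ["I want to begin my exit clearance", "start clearance"],
    "check_status": ["what is my clearance status", "has HR signed my form"],
}
vocab = sorted({w for pats in intents.values() for p in pats for w in p.lower().split()})
tags = sorted(intents)

def bow(sentence):  # bag-of-words vector over the small vocabulary
    words = sentence.lower().split()
    return np.array([1.0 if w in words else 0.0 for w in vocab])

X = np.array([bow(p) for t in tags for p in intents[t]])
y = np.array([tags.index(t) for t in tags for _ in intents[t]])

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(len(vocab),)),
    tf.keras.layers.Dense(64, activation="relu"),   # hidden layer 1
    tf.keras.layers.Dense(32, activation="relu"),   # hidden layer 2
    tf.keras.layers.Dense(len(tags), activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=200, verbose=0)  # the study reports training for 2,000 epochs

pred = model.predict(bow("start my clearance please")[None, :], verbose=0)
print("predicted tag:", tags[int(pred.argmax())])
```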
- A Location-aware nutritional needs prediction tool for type II Diabetic patients: case Kenya (Strathmore University, 2022). Karega, Lulu Amina.
  Diabetes is a chronic disease caused by a lack of insulin production by the pancreas or by poor utilization of the insulin that is produced, insulin being the hormone that helps glucose get to blood cells and produce energy. Urbanization and busy day-to-day schedules mean patients tend to pay little or no attention to their dietary habits, which results in a preference for fast foods and processed food. The prevalence of type II diabetes in the world, Kenya included, has been steadily rising over the years and is projected to keep growing at an alarming rate. Diabetes, if not properly managed, can result in long-standing, costly, and time-consuming complications. Diabetes management and control of blood sugar levels are generally done through medication, namely insulin and oral hypoglycemic agents; however, nutritional therapy can also go a long way toward boosting the general health of a patient and reducing the risk factors leading to further complications. Personalised nutrition has been formally defined as healthy eating advice tailored to suit an individual based on genetic data, or alternatively on personal health status, lifestyle, and nutrient intake. Diabetes management falls under the field of health informatics, which can benefit from data analytics. Predictive analytics is the process of utilizing statistical algorithms, software tools, and services to analyze, interpret, and visualize data with the aim of forecasting trends and predicting data patterns and behavior within or outside the observed data. This study sought to develop a location-aware nutritional needs prediction tool for type II diabetic patients in Kenya. The prediction tool would help both nutritionists and patients by providing accurate and relevant nutritional advice to support dietary changes that combat type II diabetes, with the added benefit of being location aware. The tool uses pathological results from nutritional testing to support nutritional therapy: if any deficiencies are identified from the provided nutritional markers, food items likely to improve those nutrient levels are recommended. The amounts of nutrients available in a given food item are determined by the food composition table for Kenya, as published by the Food and Agriculture Organization (FAO) in conjunction with the Kenyan government. The study used a simple implementation of matrix factorization to provide predictions of locally available food items, down to the county level.
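The simple matrix-factorization approach mentioned above could look roughly like the NumPy sketch below, which factorises a small, hypothetical patient-by-food score matrix into latent factors and fills in the missing entries; all values and dimensions are made up for illustration.

```python
# Minimal matrix-factorisation sketch (SGD on observed entries only).
# R holds hypothetical patient-by-food suitability scores; 0 marks "unknown".
import numpy as np

R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

n_patients, n_foods, k = R.shape[0], R.shape[1], 2
rng = np.random.default_rng(0)
P = rng.random((n_patients, k))   # patient latent factors
Q = rng.random((n_foods, k))      # food latent factors
lr, reg = 0.01, 0.02

for _ in range(5000):
    for u in range(n_patients):
        for i in range(n_foods):
            if R[u, i] > 0:                      # update on observed entries only
                err = R[u, i] - P[u] @ Q[i]
                P[u] += lr * (err * Q[i] - reg * P[u])
                Q[i] += lr * (err * P[u] - reg * Q[i])

predictions = P @ Q.T   # filled-in scores; highest unseen scores become recommendations
print(np.round(predictions, 2))
```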
- A Machine learning model to predict non-revenue water with severely unbalanced classes (Strathmore University, 2022). Muriithi, Patrick Kimani.
  Every household, industry, institution, and organization needs clean water to exist. In Kenya, water is used for human consumption, production, and agriculture; the consumption of water therefore contributes to the overall growth of the economy through water bills. The term non-revenue water (NRW) refers to water that is produced and 'lost' before it reaches the customer; it is also described as the difference between the volume reaching the final consumer for billing and the initial volume released into the distribution network. Based on the assessment of the Public-Private Infrastructure Advisory Facility (PPIF), an organization that fosters inter-agency cooperation in curbing NRW, physical losses are the main causes of NRW, including burst pipes that are often a result of poor maintenance. Besides physical losses, PPIF notes numerous other sources of NRW, especially commercial losses arising from the way billing data is handled throughout the billing process. The main issues related to this cause include under-registration of customers' meter readings, data handling errors, theft, and illegal connections. Other causes of NRW include unbilled authorized consumption, such as water used for firefighting, water used by utilities for operational purposes, and water provided to specific groups for free. Non-revenue water therefore puts the country's revenue collection at risk, which can lead to slow economic growth. This research proposes the development of a machine learning model for water service providers (WSPs) that can assist them in reducing non-revenue water by predicting the water consumption of different customers. To achieve these objectives, we focus on providing tools and methods that guide WSPs in reducing non-revenue water. The model was trained on a two-year consumption dataset for Nairobi County and was able to predict customer monthly consumption with an accuracy of 95%.
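One common way to handle severely unbalanced classes, such as those in the consumption data described above, is to weight the minority class more heavily during training. The sketch below does this with a random forest and balanced class weights on synthetic data; the features, labels, and class ratio are assumptions for illustration and not the study's final model.

```python
# Sketch: classification with severely unbalanced classes using balanced class weights.
# Features, labels, and the class ratio below are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.normal(20, 5, n),      # e.g. average monthly billed volume per connection
    rng.normal(18, 6, n),      # e.g. metered volume in the distribution zone
])
# ~5% "high-loss" connections: a severely unbalanced target.
y = (rng.random(n) < 0.05).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# class_weight="balanced" up-weights the rare class so it is not ignored.
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=42)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test), zero_division=0))
```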
- A Rainfall prediction model using long short-term neural networks for improved crop productivity: a case of maize planting in Machakos County (Strathmore University, 2022). Wangome, Brian Mwathi.
  Climate variability is a factor that affects crop productivity in Kenya. The unpredictable nature of weather patterns during the traditional long and short rain seasons has resulted in the rains starting earlier or later than expected. This unpredictability causes rain-fed agriculture farmers to experience losses on capital, fertilizer, and labor input, and consequently declining agricultural productivity. The decline in food production also poses an existential threat to the nation's food security and farmers' incomes. Weather forecasts are aimed at reducing this uncertainty; however, the sparse distribution of synoptic weather stations in Kenya that collect and monitor surface-level meteorological conditions makes it hard for the Kenya Meteorological Department to guarantee high spatial and temporal resolution. Therefore, the current forecast data disseminated to farmers is 'coarse', at the county and town level, which is of little use to the smallholder farmer since it does not factor in the topographical nuances within locations. The format of the weather forecasts is also too technical for farmers, so they resort to traditional methods when planning for planting. The study proposed the use of deep learning techniques to build a rainfall forecasting model that accepts historical weather data and returns forecast rainfall values in millimeters. The historical weather data was satellite data sourced from NASA's Modern-Era Retrospective Analysis for Research and Applications Version 2 (MERRA-2) and was used to train a Long Short-Term Memory neural network. An experimental approach was used to determine the number of training epochs and the number of timesteps/days into the future for which the most optimal model would forecast. In this study, the model forecasts 30 days into the future by looking at the 60 days previously observed. The 30-day prediction model had a Root Mean Squared Error of 2.45 millimeters. Therefore, given a farmer's Global Positioning System coordinates, the system can fetch the past 60 days of weather data and forecast the rainfall for the coming 30 days, helping farmers determine when to sow.
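A minimal version of the LSTM set-up described above (a 60-day lookback window of daily observations feeding a rainfall forecast) might look like the sketch below; the synthetic series, layer sizes, and single-step output are simplifying assumptions rather than the study's exact 30-day architecture.

```python
# Sketch: LSTM that maps the past 60 daily rainfall values to the next value.
# Synthetic data and layer sizes are illustrative assumptions.
import numpy as np
import tensorflow as tf

LOOKBACK = 60
rng = np.random.default_rng(0)
series = np.abs(np.sin(np.arange(2000) / 20) * 10 + rng.normal(0, 1, 2000))  # fake rainfall (mm)

X = np.array([series[i:i + LOOKBACK] for i in range(len(series) - LOOKBACK)])
y = series[LOOKBACK:]
X = X[..., None]  # shape: (samples, timesteps, features)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(LOOKBACK, 1)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse",
              metrics=[tf.keras.metrics.RootMeanSquaredError()])
model.fit(X, y, epochs=5, batch_size=64, validation_split=0.2, verbose=0)

next_day_mm = model.predict(series[-LOOKBACK:][None, :, None], verbose=0)[0, 0]
print(f"forecast rainfall for the next day: {next_day_mm:.2f} mm")
```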
- A Snake classification model for snakebite envenoming management (Strathmore University, 2022). Mabinda, Mariam.
  Snakebite envenoming is a potentially life-threatening disease caused by the injection of toxins through a bite, or by venom sprayed into the victim's eyes, by certain venomous snake species. The WHO Neglected Tropical Disease (NTD) programme indicated in 2019 that about 5.4 million snake bites occur each year, resulting in 1.8 to 2.7 million cases of envenoming; of these, about 81,000–138,000 deaths occur and approximately 400,000 people are permanently disabled annually. Kenya is estimated to have more than 15,000 bites annually. Correct identification of the snake species in question plays a critical role in the proper administration of first aid and the suitable prescription of anti-venom for the patient. Currently, there is no automated method of identifying snake species from images in Kenya. The usual practice is to kill the snake and carry it along with the patient to the hospital, or to give a visual description of the biting snake's features. A blood test can also be done to look for the presence of toxins associated with the described snake species; the challenge, however, is that the time required for test results can jeopardize the patient's survival depending on the type of venom injected, and the cost associated with the test is also punitive. Correctly classifying a snake species is a challenging task for both humans and machines, mainly because of subtle differences between different snake species and strong variation within the same species. Existing studies used a combination of feature extraction methods and deep neural networks and yielded an accuracy of 90%; these models applied Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) as feature extractors, but the use of the Singular Value Decomposition (SVD) algorithm was not explored despite its apparent advantages. This research study addressed the classification challenge by creating a Kenyan snake species dataset and developing a snake species classification model that predicts a snake species from an image and classifies it according to its venom toxicity. The study carried out feature reduction of the images using the SVD algorithm and passed these features as input to a deep learning model using transfer learning. The model was trained on 4,521 labelled snake images via supervised transfer learning using MobileNetV2. The model was trained, validated, and tested, and achieved an outstanding classification accuracy of 96%, surpassing the accuracy of the existing models.
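The transfer-learning stage of the pipeline above can be sketched as a frozen MobileNetV2 base with a new classification head for the snake classes, as below. The directory name and the number of classes are placeholders, and the SVD feature-reduction step is omitted for brevity.

```python
# Sketch: MobileNetV2 transfer learning for snake-species classification.
# "snake_images/" and the class count are placeholders; SVD preprocessing omitted.
import tensorflow as tf

NUM_CLASSES = 10  # hypothetical number of snake species in the dataset

train_ds = tf.keras.utils.image_dataset_from_directory(
    "snake_images/", image_size=(224, 224), batch_size=32)

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # keep the pre-trained convolutional features frozen

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),   # MobileNetV2 expects [-1, 1]
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=10)
```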
- Transactional data tracking and analytics tool: case of coffee farming in Kenya (Strathmore University, 2022). Kinyua, John Kiria Muguteh.
  Reliable agriculture data has become key to enhancing farmer confidence, particularly in the production supply chain. In the coffee supply chain, the collection and management of transactional data is critical, since farmers are not directly involved in various sections of the supply chain. Challenges associated with data manipulation and corruption of data, resulting in payments to non-deserving farmers, have led to mistrust and hence a decline in coffee farming in the country. This study focused on the development of a data aggregation and tracking tool that can help farmers monitor transactional data in the coffee supply chain. To develop the proposed tool, the study investigated the processes and actors involved in the coffee supply chain; examined the current challenges, expectations, and actions to be implemented in the aggregation and tracking of coffee supply chain data; and identified technologies that could be aligned with the implemented actions to ensure an effective coffee supply chain in Kenya. The proposed tool used an aggregation function implemented with the Oracle APEX analytics function, and the Agile system development methodology was adopted in the development of the Transactional Data and Analytical Tool (TDAT). At the functional level of the proposed tool, the data clerk can only enter data upon confirmation of the farmer's registration, which eliminates "ghost" farmers. The ability to generate aggregated data upon entry of the quantity of cherry delivered, together with continuous notifications to the farmer on the cumulative quantity of cherry delivered, will eliminate the possibility of manipulating the data in the database.
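The aggregation-and-notification idea described above can be illustrated outside Oracle APEX with a small pandas sketch that rejects unregistered farmers and keeps a running total of cherry deliveries per registered farmer; the column names and records are hypothetical.

```python
# Sketch: cumulative aggregation of cherry deliveries per registered farmer.
# Column names and records are hypothetical; the study implemented this in Oracle APEX.
import pandas as pd

registered_farmers = {"F001", "F002"}  # only registered farmers may have deliveries recorded

deliveries = pd.DataFrame({
    "farmer_id": ["F001", "F002", "F001", "F999", "F002"],
    "date": pd.to_datetime(["2022-10-01", "2022-10-01", "2022-10-03",
                            "2022-10-03", "2022-10-05"]),
    "cherry_kg": [35.0, 20.5, 42.0, 15.0, 18.0],
})

# Reject entries for unregistered ("ghost") farmers before aggregating.
valid = deliveries[deliveries["farmer_id"].isin(registered_farmers)].copy().sort_values("date")
valid["cumulative_kg"] = valid.groupby("farmer_id")["cherry_kg"].cumsum()

# Each row of 'valid' could drive a notification to the farmer with the running total.
print(valid)
```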
- A Predictive analytics model for pharmaceutical inventory management (Strathmore University, 2022). Musimbi, Patience Musanga.
  Inefficient inventory management is a factor that affects pharmacies in Kenya. The unpredictable nature of weather patterns during the traditional long and short rain seasons has resulted in seasons starting earlier or later than expected. Seasonal diseases such as flu may spike when temperatures decrease or when the rainy seasons begin, causing an increase in sales of drugs that cure and prevent flu, and vice versa. Due to this unpredictability and to unpreparedness, pharmacies may fail to stock up or down for different seasons, not knowing what to stock and when to stock it. Ineffective drug management has a significant financial impact on pharmacies. Inventory management ensures that needed drugs or medicines are always available, in sufficient quantities, of the right type and quality, and are used rationally. An effective drug management process ensures the availability of drugs of the right type and in the right amounts according to need, thereby avoiding drug shortages and excesses. This research proposed a predictive analysis tool that would predict the required drugs or medicines before they are needed, based on sales and seasonality; another parameter for the predictive analysis was the period of the year when a certain disease is common. The research discussed stocking and inventory management of pharmaceutical products and how predictive analytics with machine learning algorithms could be applied to improve the inventory management process in a pharmacy context. The purpose of the study was to examine the inefficient stocking of medicines in pharmacies and to use predictive analysis to predict future stock. It reviewed various previous methods used for pharmaceutical inventory management and proposed a SARIMAX model with time series analysis for stock prediction. The result was a model that predicted the quantity of drugs to be stocked for the next six weeks, with a Root Mean Squared Error (RMSE) of 5.5.
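The SARIMAX forecasting step proposed above can be sketched with statsmodels as below; the synthetic weekly sales series, the model orders, and the six-week horizon illustrate the general approach only and are not the study's fitted parameters.

```python
# Sketch: SARIMAX forecast of weekly drug sales for the next six weeks.
# The synthetic series and model orders are illustrative assumptions.
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(0)
weeks = pd.date_range("2020-01-05", periods=156, freq="W")
seasonal = 20 * np.sin(2 * np.pi * np.arange(156) / 52)          # yearly flu-like seasonality
sales = pd.Series(100 + seasonal + rng.normal(0, 5, 156), index=weeks)

model = SARIMAX(sales, order=(1, 1, 1), seasonal_order=(1, 1, 1, 52))
fitted = model.fit(disp=False)

forecast = fitted.forecast(steps=6)   # quantity to stock for the next six weeks
print(forecast.round(1))
```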
- A Prototype for securing non-digital assets using non-fungible tokens (Strathmore University, 2022). Kabiru, Brian Mwangi.
  In a capitalistic world, many people acquire and transact assets. While many of them go about it legally, there is also a substantial number who use forgery and other forms of trickery to acquire those assets. Ordinary citizens as well as government and financial institutions deal with non-digital asset documents in their day-to-day operations. The analysis of these documents is not only time consuming but also burdensome for the human resources who carry it out. While many methods exist to analyse the documents, there is also a tendency to doctor documents or bypass the verification process, which has many ripple effects. Due to this and other factors, there is a need to develop a prototype that protects the integrity of non-digital assets in an automated form accessible to both individuals and institutions. Furthermore, to avoid tampering, the prototype must be free from mutable changes, and any changes must be public and open for viewing and verification. This research aims to explore the existing strategies deployed in Kenya and other countries to protect non-digital assets, together with their merits and challenges, after which a prototype based on the Flow Blockchain Network will be developed for the purpose of protecting, tracking, and authenticating non-digital assets.
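One building block of such a prototype, independent of the blockchain used, is an immutable fingerprint of the asset document that can be anchored in a token's metadata. The sketch below computes a SHA-256 fingerprint of a placeholder scanned document and checks a later copy against it; minting the token on the Flow network itself (written in Cadence) is beyond this sketch.

```python
# Sketch: fingerprint a non-digital asset document so the hash can be anchored
# in NFT metadata and later used to verify an untampered copy.
# "title_deed_scan.pdf" and its copy are placeholder file names.
import hashlib

def fingerprint(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

anchored = fingerprint("title_deed_scan.pdf")        # stored on-chain at minting time
presented = fingerprint("title_deed_scan_copy.pdf")  # document presented for verification
print("document authentic:", anchored == presented)
```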