Classification of anaemia types using supervised machine learning techniques

dc.contributor.author: Onwong'a, C. K.
dc.date.accessioned: 2026-04-23T15:27:30Z
dc.date.issued: 2025
dc.description: Full-text thesis
dc.description.abstract: Anaemia reduction is one of the World Health Assembly goals for 2025. Given the complex aetiology of anaemia, traditional methods of classifying nutritional anaemia have limitations. They rely heavily on analysis of complete blood count tests, which requires specialists and trained personnel and is prone to errors in analysis. They are also expensive and time-consuming, given the wait between testing and receiving results. Machine learning algorithms can offer greater accuracy and efficiency in classifying anaemia because of their ability to learn from data and identify patterns. This study aimed to build a model for classifying nutritional types of anaemia using supervised machine learning techniques. The dataset was retrieved from Kaggle, an open-source dataset repository, and used in accordance with the Open Database License. It contained complete blood count test results for patients with proven cases of nutritional anaemia. The data was pre-processed and explored in preparation for model building. All features were used in model development because each variable is distinct and contributes to the classification of anaemia. The models built were Naive Bayes, random forest, XGBoost, decision tree, and multilayer perceptron. These models were evaluated on the testing set and their performance compared to identify the best-performing one. Hyperparameter tuning was applied to some of the poorer-performing models to try to improve their performance. The best-performing model was the XGBoost classifier, which achieved an accuracy of 98.85%. The poorest-performing model was the Gaussian Naive Bayes model, with an accuracy of 78.72%. An SVM model was also attempted, but it was too computationally heavy and could not be built.
For deeper analysis of the model, recall, precision and F1 score were also measured. The XGBoost model was then loaded into an interface for functionality testing. The tool was able to classify nutritional classes of anaemia from complete blood count data entered by a user. This tool could potentially be deployed in hospitals and clinics to aid early detection, diagnosis and treatment by reducing the wait between testing and receiving results. This can be considered one of many steps towards anaemia reduction.
Keywords: Anaemia Classification, Nutritional Anaemia classification, Supervised learning, SMOTE, XGBoost, Random Forest, Naive Bayes, Decision Trees, machine learning, multilayer perceptron.
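The abstract reports accuracy together with recall, precision and F1 score. As a minimal sketch of how these per-class metrics relate to accuracy in a multi-class setting, the Python below computes them from lists of true and predicted labels. It is not the thesis code: the class labels (`iron`, `folate`, `b12`) and the sample predictions are hypothetical placeholders chosen only for illustration, and the helper names are assumptions.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels and predictions, NOT taken from the thesis dataset.
y_true = ["iron", "iron", "iron", "folate", "folate", "b12"]
y_pred = ["iron", "iron", "folate", "folate", "folate", "b12"]

metrics = per_class_metrics(y_true, y_pred)
overall_accuracy = accuracy(y_true, y_pred)

for label, (precision, recall, f1) in metrics.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"accuracy={overall_accuracy:.2f}")
```

In this toy example one `iron` case is misclassified as `folate`, which lowers recall for `iron` and precision for `folate` while leaving `b12` unaffected. Per-class figures make this kind of error visible in a way that a single accuracy value does not.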
dc.identifier.citation: Onwong’a, C. K. (2025). Classification of anaemia types using supervised machine learning techniques [Strathmore University]. https://hdl.handle.net/11071/16451
dc.identifier.uri: https://hdl.handle.net/11071/16451
dc.language.iso: en
dc.publisher: Strathmore University
dc.title: Classification of anaemia types using supervised machine learning techniques
dc.type: Thesis

Files

Original bundle

Name:
Classification of anaemia types using supervised machine learning techniques.pdf
Size:
1.56 MB
Format:
Adobe Portable Document Format

License bundle

Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission
Description: