Machine learning for multi-class identification of Gender Based Violence on social media

dc.contributor.author: Mutahi, E. W.
dc.date.accessioned: 2026-05-07T07:36:16Z
dc.date.issued: 2025
dc.description: Full-text thesis
dc.description.abstract: This study aims to showcase the use of Machine Learning algorithms for classifying forms of Gender Based Violence (GBV) in social media data. Data mining processes were used to fetch 1 million tweets posted between January 2012 and January 2023 from Twitter, using keywords that identify Gender Based Violence. Of these, 160,000 tweets were manually labelled with the form of Gender Based Violence: physical, economic, sexual, or emotional violence. The rest of the data was saved in SQLite as a GBV database. The tweets were filtered and analysed using Natural Language Processing techniques such as Exploratory Data Analysis, Sentiment Analysis and Topic Modelling. Machine learning algorithms such as Naïve Bayes, Random Forest and Support Vector Machines (SVM) were trained on the labelled data to predict the form of Gender Based Violence in the tweets. The models were evaluated using Accuracy, Precision, Recall, F1 score and AUC as the performance metrics. SVM with GloVe features achieved the highest F1 score (61%) and accuracy (62%), followed by Multinomial Logistic Regression with an F1 score of 60% and an accuracy of 61%. A web application was built with Streamlit to host the results of the study and to let users obtain the predicted form of GBV from text inputs or from data selected from the GBV database. Logistic Regression and SVM have previously been found to be superior in detecting cyberbullying on Twitter without the involvement of victims (Muneer, 2020). In this study, the classification of GBV was intended to inform key stakeholders of the extent and form of GBV incidents and to aid in identifying or structuring programs that can offer timely and relevant support to survivors of Gender Based Violence. The insights can be used to build social media-based interventions that support survivors as soon as they are identified.
Keywords: Gender Based Violence (GBV), social media, Machine Learning, Classification
dc.identifier.citation: Mutahi, E. W. (2025). Machine learning for multi-class identification of Gender Based Violence on social media [Strathmore University]. https://hdl.handle.net/11071/16526
dc.identifier.uri: https://hdl.handle.net/11071/16526
dc.language.iso: en_US
dc.publisher: Strathmore University
dc.title: Machine learning for multi-class identification of Gender Based Violence on social media
dc.type: Thesis
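
The abstract reports accuracy, precision, recall and F1 score for a four-class problem but does not say how the per-class scores were averaged. The following is a minimal sketch of how such metrics could be computed, assuming macro averaging (an unweighted mean over the four GBV forms). The function name `macro_metrics` and the toy labels are hypothetical and are not taken from the thesis.

```python
from collections import Counter

# The four GBV forms named in the abstract.
CLASSES = ["physical", "economic", "sexual", "emotional"]


def macro_metrics(y_true, y_pred, classes=CLASSES):
    """Return accuracy and macro-averaged precision, recall and F1.

    Macro averaging is an assumption: the abstract does not state which
    averaging scheme was used in the thesis.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Per-class true positives, false positives and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    precisions, recalls, f1s = [], [], []
    for c in classes:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(classes)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for illustration only; these are not data from the study.
toy_true = ["physical", "physical", "economic", "sexual", "emotional", "emotional"]
toy_pred = ["physical", "economic", "economic", "sexual", "emotional", "physical"]
toy_scores = macro_metrics(toy_true, toy_pred)
print(toy_scores)
```

Macro averaging gives each form of violence equal weight regardless of how many tweets carry that label. If the labelled classes were imbalanced, a weighted average would yield different figures.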

Files

Original bundle

Name: Machine learning for multi-class identification of Gender Based Violence on social media.pdf
Size: 2.03 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission