Use of regular expressions for multilingual detection of Hate speech in Kenya

Maloba, Wilson

Date

2013

Authors

Maloba, Wilson

Publisher

Strathmore University

Abstract

Offensive language in online forums and other text-based communication media such as SMS is nothing new. It is not uncommon to see notices such as ‘this comment has been removed due to a low rating’ or simply ‘comment removed’ on websites. Most sites provide a ‘report abuse’ button so that users can flag comments they deem abusive for one reason or another. So how are site administrators able to detect offensive texts? Several methods are in use, including manual filtering and multi-level classifiers. The focus of this paper, however, is on the use of regular expressions, or regex for short. Regular expressions present a powerful method for detecting string patterns in text. Hate speech has of late become a sensitive issue in Kenya, given that it helped trigger the post-election violence (PEV) of 2007/2008. However, the detection of these hate messages relies mostly on what is captured in the media or on text that an online user happens to flag. This paper presents a method that uses regular expressions, which are tried and tested, to detect hate speech in Kenya while taking into consideration three languages: English, Swahili and Sheng.

Description

Thesis submitted to the Faculty of Information Technology in partial fulfillment of the requirements for the award of a Master of Science in Telecommunications Innovation and Development of Strathmore University

URI

http://hdl.handle.net/11071/2198

Collections

MMTI Theses and Dissertations (2013)
