Strathmore University
SU+ @ Strathmore University Library
Electronic Theses and Dissertations
2018

A Real-time location based algorithm for notification of crime hot-spots using crowd sourcing
Maryline Chepngetich
Faculty of Information Technology (FIT), Strathmore University

Recommended Citation
Chepngetich, M. (2018). A Real-time location based algorithm for notification of crime hot-spots using crowd sourcing (Thesis). Strathmore University. Retrieved from https://su-plus.strathmore.edu/handle/11071/6078

This Thesis - Open Access is brought to you for free and open access by DSpace @Strathmore University. It has been accepted for inclusion in Electronic Theses and Dissertations by an authorized administrator of DSpace @Strathmore University.

A Thesis Submitted to the Faculty of Information Technology in partial fulfillment of the requirements for the award of Masters of Science in Information Technology
Strathmore University
April 2018

Declaration
I declare that this work is my original work and has not been previously submitted and approved for the award of a degree in this or any other university. To the best of my knowledge and belief, the dissertation contains no material previously published or written by another person except where due reference is made in the thesis itself.
Student Name: Maryline Chepngetich
Sign: ________________________ Date: ________________________
Supervisor's Name: Dr. Bernard Shibwabo
Sign: ________________________ Date: ________________________

Abstract
The security of the people has long been the number one objective of many governments in the world.
Governments' endeavours to achieve this objective have faced several challenges, ranging from economic to social and political. Despite heavy investment in security measures by the local and national governments in Kenya, crime remains a serious problem in society, resulting in loss of lives, loss of property and investors shying away. Gathering relevant and up-to-date operational information on crime intelligence across several sources has always been one of the challenging issues faced by national security practitioners and citizens. This makes it difficult to identify crime hotspot areas in a timely manner, and leads to improper allocation of police resources to hotspot areas. The data collection exercise was carried out carefully to ensure ample understanding of the participants' interaction with crowdsourcing platforms and of their experience with, and willingness to use, a crowd-based crime hotspot reporting network. The study found significant justification for the design of a criminal hotspot system that leverages data about crime incidents in the city in order to classify crime hotspots. The system was designed using the Unified Modelling Language, as detailed in the fifth chapter of the thesis. The developed prototype was then tested against parameters to gauge its efficiency and effectiveness. The conclusions of the testing and the recommendations of the study are documented in the sixth and last chapters of the study respectively.

Keywords: Crowdsourcing, Crime, Intelligence, public portal, criminal hotspot.

Table of Contents
Declaration and Approval
Abstract
Table of Contents
List of Tables
List of Figures
Chapter One: Introduction
  1.1 Background
  1.2 Problem Statement
  1.3 Aim
  1.4 Specific Objectives
  1.5 Justification
  1.6 Scope and Limitation
Chapter Two: Literature Review
  2.1 Introduction
  2.2 Crowd Sourcing Technologies
    2.2.1 Crowd Voting
    2.2.2 Crowd Solving
    2.2.3 Crowd Research
    2.2.4 Crowd Funding
  2.3 Crowdsourcing Platforms
    2.3.1 Mobile Crowdsourcing
    2.3.2 Crowdsourcing in Agriculture
    2.3.3 Implicit Crowdsourcing
    2.3.4 Explicit Crowdsourcing
  2.4 Crime Detection Technologies
  2.5 Research Gap
  2.6 Crowdsourcing models and algorithms
  2.7 Conceptual Framework
Chapter Three: Research Methodology
  3.1 Introduction
  3.2 Agile Software Development Methodology
  3.3 Research Design
    3.3.1 System Analysis
    3.3.2 System Design
  3.4 Target Population and Sampling
  3.5 Sample Size
  3.6 Data Analysis
  3.7 Research Quality
Chapter Four: Data Analysis, Presentation and Interpretation
  4.1 Introduction
  4.2 Questionnaire Return Rate
  4.3 Sample Demographics
    4.3.1 Gender
    4.3.2 Age Bracket of the Respondents
    4.3.3 Level of Education
    4.3.4 Exposure to Crowd-sourcing platforms
    4.3.5 Willingness to Support Crime-sourcing Platform
  4.4 Conclusion
Chapter 5: System Architecture and Design
  5.1 Introduction
  5.2 System Architecture
  5.3 System Design
    5.3.1 Use Case Diagram
    5.3.2 System Sequence Diagram
    5.3.3 Class Diagram
    5.3.4 Entity Relation Diagram
  5.4 Database Design
  5.5 Conclusion
Chapter 6: System Testing and Implementation
  6.1 Overview
  6.2 Description of the Test Environment
  6.3 Prototype Development Environment
  6.4 Model Components
    6.4.1 Login into the system
  6.5 System Modules
  6.6 System testing for the Anomaly Based Fraud Detection System
    6.6.1 Test of the crime reporting module
    6.6.2 Test of the crime information module
    6.6.3 Test of the general information module
CHAPTER 7: CONCLUSIONS AND RECOMMENDATIONS
  7.1 General Research conclusions
  7.2 Achievement of Research objectives
    7.2.1 Identifying the factors that influence identification of crime hotspots in Nairobi
    7.2.2 Reviewing existing techniques used in identifying crime hotspots
    7.2.3 Developing a crowd-based solution for crime hotspot reporting
    7.2.4 Validating the proposed solution
  7.3 Conclusions on the design and implementation of criminal hotspot network system for city dwellers
  7.4 Challenges realized in the study
  7.5 Recommendations for further study
References
APPENDICES
  Appendix A: Research Questionnaire
    A.1 Pre-development questionnaire
    A.2 Questionnaire for system testing
  Appendix B: System User Manual
    B.1 Login into the system
    B.2 Reporting a criminal hotspot area
    B.3 Viewing Crime Statistics

List of Tables
Table 4.1: Gender of the Participants
Table 4.2: Age Bracket of the Respondents
Table 4.3: Participants' exposure to crowdsourcing platforms

List of Equations
Equation 1: Infinite Sampling procedure

List of Figures
Figure 2.1: How Crowdfunding works
Figure 2.2: Crowdsourcing in agriculture
Figure 2.3: Microtasks involved in explicit crowdsourcing
Figure 2.4: Conceptual Approach on Criminal Hotspot identification
Figure 3.1: Agile software development
Figure 4.1: Questionnaire Return Rate
Figure 4.2: Gender of the Respondents
Figure 4.3: Age Bracket of the Respondents
Figure 4.4: Participants' Level of Education
Figure 4.5: Willingness to support crime-sourcing platform
Figure 5.1: System Architecture Diagram
Figure 5.2: Use Case Diagram
Figure 5.3: Sequence Diagram
Figure 5.4: Entity Relationship Diagram
Figure 5.5: Class Diagram
Figure 6.1: Login Prompt
Figure 6.2: Main System Switchboard
Figure 6.3: Tests done by testing team
Figure 6.4: Testing validity of the crime information module
Figure B.1: Login to the system
Figure B.2: System Dashboard
Figure B.3: Adding a crime hotspot
Figure B.4: Viewing crime statistics

Chapter One: Introduction
1.1 Background
The security of the people has long been the number one objective of many governments in the world. The Kenyan Government's endeavour to achieve this objective has faced several economic, social and political challenges. This has made it difficult for different governments to manage and implement development agendas, especially where crime has remained rampant. According to Kayrak (2008), many governments in Africa have been unable to handle common civil unrest and criminal activity orchestrated to upset the stability of regions. Organized gangs and other criminal outfits continue to grow because measures to minimize their influence over the public have not been realized (Taylor, Fritsch, & Liederbach, 2014; Verma & Bhatia, 2013).

The recently released Economic Survey 2016 (Nkonya, Mirzabaev, & Braun, 2016) shows that criminal activity in Kenya continued to grow. A total of 72,490 cases were reported, compared to 69,376 in 2014 and 71,832 in 2013. The report shows that the counties with the most reported crime were Nairobi (6,732 incidents), Nakuru (4,525), and Kiambu (4,449).
Evidently, crime has been on the increase in the country despite efforts to discourage criminal activity and to better equip the police to manage and deal with criminal incidents. There is an apparent mismatch between the government's preparedness to fight crime in Kenya and the anticipated decrease in criminal incidents. It is thus striking that crime is on the rise even as Kenya continues to adopt new crime monitoring and detection technologies. The application of crowdsourcing techniques can help with the identification of crime hotspot areas.

The World Wide Web's phenomenal growth has resulted in more users expressing their opinions online. Social media services have become an indispensable part of people's daily lives, where they can share what they have seen and what events have occurred around them, including photos and videos (Byrne-Evans, et al., 2013). As a result, there has been growing interest in using Internet crowdsourcing to solve crimes. According to Twitter Statistics and Facts (Newman, 2017), social media currently serves approximately 313 million users, with 100 million users logging in daily and a combined 340 million messages.

In 2015, Kenya came 4th in Africa in a ranking of the countries whose citizens tweeted the most, with 76 million geo-located tweets. This suggests that Kenyans are willing and ready to use social media platforms for crowdsourcing. Social media is used primarily for the following four reasons: daily chatter (such as status messages on what the user is doing), conversations (such as tweeting to either a user or a group of users within a community), sharing information (such as posting links to web pages) and reporting news (such as status updates on current affairs) (Bollen, Mao & Zheng, 2011). Since social media posts are crisp and brief, public sentiment can be easily explored.
By providing a platform where users can get and post information on crime, it is possible to amalgamate criminal data and determine how to manage crime effectively. Indeed, many information portals depend on regular daily use. To provide relevant and immediate information, registered users would need to log into the system consistently for updates, as well as to post information about crime and other suspicious activity and incidents (DiGrazia, et al., 2013).

The high volume of interaction with Twitter has created 'Big Data', with each social media platform needing to store all of the data its users create. Big Data technology has made it possible to analyse small bits of information collected over a long period of time and eventually formulate intelligence on different aspects of a commonly discussed phenomenon that has a public bearing and is of national importance. Organizations can thus develop 10-year or even century-long findings on an issue of human behavior, which can direct predictive analysis of certain antisocial behavior (Bollier & Firestone, 2010).

1.2 Problem Statement
Crime is a great impediment to the growth of any nation. In Nairobi, businesses incur heavy losses as a result of criminal activities. The crime situation has continued to worsen, with many people losing property and lives, consequently making potential investors shy away (Kupatadze, 2012). In many instances, people find themselves in criminal hotspots without their knowledge. It would suffice to make them more aware and prepared to live in such circumstances.

Citizen involvement and community policing in crime identification have not been fully embraced in Kenya.
This makes it difficult to identify crime hotspot areas in a timely manner, and makes it hard to allocate police resources to the right hotspot areas. Based on these challenges, this research proposes a model that addresses the real-time identification of crime hotspots by collecting information on major crime types from social media data and analysing it. This should assist in identifying crime hotspot areas, thereby helping the police to allocate their resources more effectively and to predict future crime.

1.3 Aim
The aim of this research is to develop a model for a Real-Time Location based Algorithm for Notification of Crime Hotspots Using Crowdsourcing so as to aid in intelligence and in combating criminal activity. By gathering data about different criminal activities and presenting it in an information gathering interface, the solution is intended to be a concise and effective tool to guide crime prevention.

1.4 Specific Objectives
i. To investigate factors relating to crime hotspot areas in cities.
ii. To review the existing techniques and solutions used in the identification of crime hotspot areas.
iii. To develop a crowd-based model to gather crime-related information and analyse it in order to identify crime hotspot areas in Nairobi.
iv. To validate the proposed solution.

1.5 Justification
One of the national crime prevention strategies is Community Policing. Community Policing is an approach to policing that recognizes the independence and shared responsibility of the Police and the Community in ensuring a safe and secure environment for all citizens. It aims at establishing an active and equal partnership between the Police and the public through which crime and community safety issues can jointly be discussed and solutions determined and implemented (Cordner, 2014).
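The hotspot identification named in the aim and in objective iii can be illustrated with a minimal sketch. This part of the thesis does not describe the algorithm itself, so the sketch swaps in a simple grid-counting technique as a stand-in: each report is assigned to a latitude/longitude grid cell, and a cell is flagged once it collects a minimum number of reports. The `Report` structure, the function names, the 0.01-degree cell size (roughly 1.1 km near the equator) and the threshold of three reports are all assumptions made for illustration, and the sample coordinates are made up.

```python
import math
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Tuple


class Report(NamedTuple):
    """One crowd-submitted crime report (hypothetical structure)."""
    latitude: float   # decimal degrees
    longitude: float  # decimal degrees
    crime_type: str


Cell = Tuple[int, int]


def grid_cell(report: Report, cell_size_deg: float) -> Cell:
    """Return the integer (row, column) index of the grid cell containing the report."""
    return (math.floor(report.latitude / cell_size_deg),
            math.floor(report.longitude / cell_size_deg))


def find_hotspots(reports: Iterable[Report],
                  cell_size_deg: float = 0.01,
                  min_reports: int = 3) -> Dict[Cell, int]:
    """Count reports per grid cell and keep the cells at or above the threshold."""
    counts = Counter(grid_cell(r, cell_size_deg) for r in reports)
    return {cell: n for cell, n in counts.items() if n >= min_reports}


# Made-up coordinates for illustration only; they are not data from the study.
sample_reports = [
    Report(-1.2864, 36.8172, "robbery"),
    Report(-1.2851, 36.8178, "mugging"),
    Report(-1.2839, 36.8155, "robbery"),
    Report(-1.3145, 36.7843, "burglary"),
]
hotspots = find_hotspots(sample_reports)
print(hotspots)
```

In this sample, three of the four reports fall into the same cell, so only that cell is returned. A fixed grid is easy to reason about but splits clusters that straddle a cell boundary; density-based clustering would be one alternative.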
This research proposes a crowdsourcing approach: collecting information on major crime types from public data and analysing it. This can assist in identifying crime hotspot areas, thereby helping the police to allocate their resources more effectively.

1.6 Scope and Limitation
The greatest threats to life in Nairobi, Kenya continue to be road safety and crime. Particularly in Nairobi, violent and sometimes fatal criminal attacks, including home invasions, burglaries, armed carjackings, grenade attacks and kidnappings, can occur at any time and in any location. This research focuses on major threats to security in the Nairobi region (LeBas, 2013).

The words used on web platforms do not fully constitute formal language; they involve acronyms, emoticons, slang and Sheng. This at times makes analysis difficult, as the use of such slang on social media is also increasing. The prototype developed will thus fail to identify or categorize information that is unintelligible or in a language other than English.

Chapter Two: Literature Review
2.1 Introduction
Intelligence gathering is the purview of any government that seeks to ensure the safety of its citizens. According to Cordner (2014), every government has the interests of the citizen within the realm of responsibilities that cannot be avoided. Regardless, many governments find it very challenging to promise, guarantee or even control security in the nation.

Criminal activities around the world take two forms: local crime and international crime (Soden & Palen, 2014; Sen & Ghosh, 2018). Local criminal activities intend to cause harm, loss of property and unfair gain in wealth or resources through corrupt or illegal means. International crimes are initiated and executed by organized crime syndicates. They often include prostitution rings, the illegal firearms trade, drug peddling, the ivory and exotic wildlife trade, human trafficking and other commercial crimes such as fraud.
The constitution of Kenya is very clear about the actions that need to be taken once a criminal or suspect is apprehended in connection with any of these crimes. However, the constitution does not prescribe investigation methods. LeBas (2013) asserts that, as criminal activities continue to increase, they become more sophisticated and difficult to detect. Modern technologies such as social media and cloud technologies have motivated cybercrime as well. As technology advances, it is important to take advantage of the same technologies to combat crime.

Most criminal activities are repeat patterns of fraud and con schemes that can be detected if the public is made aware. However, most governments do not have the prior intelligence with which to notify the public and keep them aware of criminals. Data gathering can only be efficient if all citizens take part in the process and alert others who may not be aware of impending crimes, the existing dangers and how to handle them.

According to Aten and Thomas (2016), crowdsourcing technology has existed for almost a decade now. Leveraging crowdsourcing has enabled many advancements in technology across the world. The potential of crowdsourcing can also be explored in crime reporting and awareness projects (Taylor, Fritsch, & Liederbach, 2014; Verma & Bhatia, 2013).

2.2 Crowd Sourcing Technologies
2.2.1 Crowd Voting
Crowd voting is a technique that allows the public to be involved in appraisals of issues concerning them through online surveys. These surveys are designed to elicit definite responses, which are tallied to obtain the majority's view on an issue that would affect the respondents directly. This technique is commonly used in modern online polls during election campaigns (Bailard & Livingstone, 2014; Quercia & Saez, 2014).
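The tallying step in crowd voting can be sketched in a few lines. The example below is only an illustration under stated assumptions: the function name `tally_votes` and the poll responses are hypothetical, and the sketch reports the most common answer (a plurality), which is a true majority only when its share exceeds one half. Ties are not given special treatment here.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tally_votes(responses: Iterable[str]) -> Tuple[str, float, Dict[str, int]]:
    """Return the leading option, its share of all votes, and the full counts."""
    counts = Counter(responses)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no responses to tally")
    # most_common(1) yields the single option with the highest count.
    leader, leader_votes = counts.most_common(1)[0]
    return leader, leader_votes / total, dict(counts)


# Made-up poll responses for illustration.
poll = ["yes", "no", "yes", "undecided", "yes"]
leader, share, counts = tally_votes(poll)
print(leader, share, counts)
```

Returning the share alongside the leading option lets the caller decide whether the result is a genuine majority or only the largest of several minority views.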
2.2.2 Crowd Solving
According to Sen and Ghosh (2018), crowd solving is the creation of forums for the public to give expert opinion on a matter that requires an urgent or all-inclusive solution. On an online platform, crowd solving involves someone posing a question and then inviting all available persons to give their best answers to the particular issue. Eventually, the general view of the majority determines the approach one would take in handling the issue of concern (Singh, et al., 2014; Verma & Bhatia, 2013). Crowd solving is used in engineering, medicine and IT forums to find answers to very technical problems, as provided by experts who are members of these forums.

2.2.3 Crowd Research
Crowd research is the use of online surveys to gather specific data, sometimes qualitative but more often quantitative, about an issue that is known to the public. According to Bamberger, et al. (2015), crowd research often seeks information about the locality, gender, age and education of a population by asking each respondent to indicate a specific answer for each of these major variables. The responses then help the researcher to analyse the information and draw general, probabilistic conclusions about an entire region or place. Such information can then be used to assist the residents of the region through either government policies or donor funding. Crowd research also applies to company research on products and the target markets in which they are likely to succeed. Eventually, this research shapes the modern manufacturing and market penetration strategies of large industries and SMEs (Taylor, Fritsch & Liederbach, 2014; Young & Hermida, 2015).

2.2.4 Crowd Funding
Crowd funding is also referred to as crowd-fundraising. It incorporates online payment platforms to accept donations from different persons for a single purpose or goal.
According to Aten and Thomas (2016), common crowd-funding goals include paying hospital bills, raising money for political campaigns and even raising money for charitable organizations that distribute food, medical aid and blankets to refugees. Often, funds raised online are regulated and monitored by relevant governments and international e-payment service providers. By donating small amounts, many people can aid in the alleviation of disasters (Sanga, et al., 2016; Quercia & Saez, 2014). Crowd funding is very common in American political campaigns, as candidates seeking a public mandate are required to fundraise. Figure 2.1 demonstrates the concept of crowd-funding.

Figure 2.1: How Crowdfunding Works (Adapted from He & Chan, 2016)

2.3 Crowdsourcing Platforms

2.3.1 Mobile Crowdsourcing

Feng, et al. (2014) argue that the advent of mobile computing led to the development and proliferation of mobile crowdsourcing. It is the technology that brings smartphone owners together on a single platform to undertake different communal tasks. Thanks to the Global Positioning System (GPS) and General Packet Radio Service (GPRS) technologies supported by smartphones, it is easy to know the location of all persons contributing to a forum and thus limit contributions to specific localities. Mobile crowdsourcing is popular on social media platforms that can pinpoint the location of the respondents and make it easy for people to meet and chat on different issues online (Phua, et al., 2012; Ntalianis, et al., 2014).

2.3.2 Crowdsourcing in Agriculture

According to Sanga, et al. (2016), crowdsourcing has gained momentum in the field of agriculture due to the ability to gather information about climatic conditions in different regions and have it displayed online. Information on pest infestations and other attacks on livestock also finds its way into online forums where farmers can understand and take action on such grave matters.
Crowdsourcing in agriculture has made it possible for new farmers without significant farming experience to learn the best cultivation methods and practices, as well as where to get good and affordable farm implements (Sanga, et al., 2016; Kwapisz, Weiss, & Moore, 2011). Due to the increased support for farming activities, farmers have been able to use online platforms to sell their produce to large-scale organizations as well as get capital and investment for different farming ventures. These efforts have boosted food security in many parts of the world. Figure 2.2 represents online data on crop yields that has been captured on a crowdsourcing portal.

Figure 2.2: Crowdsourcing in Agriculture (Adapted from Singh, et al., 2014)

2.3.3 Implicit Crowdsourcing

Implicit crowdsourcing is the use of unstructured forums to let users discuss problems while other users present solutions. According to Quercia and Saez (2014), there is no specific issue being targeted but, eventually, many people get help with problems that their colleagues, friends or fellow system users have faced. This way, they are able to get solutions without having to ask specific questions and can even avoid certain mistakes. Most forums where implicit crowdsourcing is done are based on a specific area of study or industry, such that they attract similar system users and contributors to the forum (Larman, 2012; Kupatadze, 2012; Hudson-Smith, et al., 2009). The issues discussed may nonetheless take any approach or form. Eventually, solutions developed from these problems bear a lot of significance to a majority of the members, as they address critical issues that people may have failed to mention before. Ntalianis, et al. (2014) argue that the approach to implicit crowdsourcing is more like group therapy, where people may have specific problems and believe that they suffer individually yet it is a communal affair.
Most implicit solutions assist persons who deem themselves to be introverts and are thus unable to discuss their problems or challenges precisely with the rest of the public.

2.3.4 Explicit Crowdsourcing

Fish (2013) believes that, in explicit crowdsourcing, the tasks, issues and problems discussed are specific and require specific answers or approaches. They are often methodical problems that can only have a predetermined solution or approach. Indeed, many explicit crowdsourcing concerns addressed online require a sense of familiarity among the participants. They should be able to relate to one another in a manner that suggests consistent and strategic help. According to Ntalianis and Tsapatsoulis (2016), explicit crowdsourcing also allows persons to develop unique forums and invite like-minded persons to discuss certain social or professional issues. This is the approach encouraged and practised by most social media group forums. The people in the groups eventually develop a rapport that can be tied to some relevant similarity or factor they have in common. Solutions here are tailor-made and very personal to the person or persons with the problems or needs. Figure 2.3 illustrates micro-tasks applicable in explicit crowdsourcing (DiGrazia, et al., 2013; Byrne Evans, et al., 2013; Bollen, Mao & Zeng, 2011).

Figure 2.3: Microtasks involved in explicit crowdsourcing (Adapted from Soden & Palen, 2014)

2.4 Crime Detection Technologies

Taylor, Fritsch, and Liederbach (2014) argue that crime detection is the survey, review and investigation of criminal activity with the intent to realize the cause and perpetrators for prosecution. Unfortunately, most crime detection technologies available do not rely on crowd-sourced information. According to Tayal, et al. (2015), crime detection technologies depend on available data on criminal activities and other monitoring systems to discover and arrest criminal suspects.
Most of the technologies available in the market are sensing technologies such as lasers, motion and heat sensors, crime detection cameras and biometric system alerts. Most biometric systems restrict entry to locations with valuable data, property or fragile items of value. Phua, et al. (2012) believe that crime detection technologies often present a way to realize and handle criminal activity through silent alarms, siren alarms and other signals. These technologies can only indicate high-risk criminal targets through analysis of past data. Nonetheless, as Nickell and Fischer (2013) argue, there are many criminal activities that cannot be addressed by machines and digital sensors alone and require human input as well. It is necessary to combine human senses and machine technology in crime prevention in order to obtain more accurate feedback on criminal activities and the regions often targeted for crime.

2.5 Research Gap

Many researchers have addressed issues regarding the handling of criminal data and its analysis using algorithms to come up with specified targets for criminal activity. According to Young and Hermida (2015), crime reporting is hampered by the need to report factual crimes taking place and not just suspicious activity. Indeed, the classification of hotspots is dependent on the realization of the different reports about crimes taking place in these areas. However, despite the reality that public participation is important in the fight against crime, many researchers have trivialized the importance of anonymous crime reporting and the use of unverified criminal reports in crime databases. There is a need for research into a system that can appreciate anonymous data from public crime incident reports and use the same data to develop police intelligence databases that can assist in tracking suspicious criminal activity and making arrests.
There is also a need to incorporate crowdsourcing algorithms to accurately classify criminal hotspots and direct policing resources to such regions. This research seeks to address these gaps by introducing a system that can accurately pinpoint criminal hotspots based on crowdsourcing.

2.6 Crowdsourcing Models and Algorithms

Hudson-Smith, et al. (2009) and Soden and Palen (2014) worked on the use of crowdsourcing algorithms to develop portals for information on different landmarks and convenient locations with the use of Google Map technology. The research was able to come up with an automated assistant for travellers seeking to find common stops such as fuel stations, diners and hotels based on the technology developed. According to Colorado, et al. (2015), there has been continuous research on the use of modern geo-mapping systems to predict the shortest routes from a point to a destination. Researchers have committed resources to provide further information about locations within the Google Map directory so as to help travellers and tourists who may be unfamiliar with different regions to spot black spots on roads, distinguish between the quality of restaurants they seek, and find affordable convenient points based on their budgets. All these researchers seek to use available data and process it so that the results eventually presented match what the individual needs (Aten & Thomas, 2016; Bailard & Livingston, 2014). Jens Rasmussen was the first researcher to develop geo-mapping crowdsourcing technologies that led to the development of the Google Pins that pinpoint one's location in coordinates in early 2004 (Verma & Bhatia, 2013). Researchers in companies such as Uber and Didi Chuxing have developed technologies to locate taxi service clients based on the client's GPS position and the GPS unit in the taxi.
The same technology is applied in billing the client based on the destination of their travel and the mileage they seek to be driven. Lars Rasmussen worked on a system to allow specific phone users to locate each other using gyroscopes and accelerometers on their phones (Kwapisz, Weiss, & Moore, 2011). According to He and Chan (2016), Wi-Fi technology is also able to bring users together and create a forum for discussions, although it is not yet secure and may face many hurdles relating to hacking and user trust. Related work on using crowdsourcing technology can also be traced to Hester, Shaw and Biewald (2010), whose solution has helped donors to pinpoint and deliver relief aid to hunger and war victims who needed the help yet could not get to major distribution centres due to their inability to travel.

2.7 Conceptual Framework

The research on the use of existing crowdsourcing technologies to develop a real-time location-based notification of criminal hotspots requires consistent input from the public. As with all the crowdsourcing technologies discussed in the literature, considerable awareness-raising is needed to make the public willing and ready to use the system. However, it is important to ensure that the system presents little tolerance for bias and false reporting. Generally, the metrics used in ensuring accurate information gathering should be similar to those used in explicit crowdsourcing, so that governments, especially the government of Kenya, can trust the proposed system and implement its use in the distribution of policing resources and the securing of the country. The algorithms applied and the methodologies intended are discussed in the later chapters of this research. This conceptual framework illustrates the simplified approach to solving the issues addressed in the research.
It offers the approach to developing the system in a simplified flow chart that demonstrates the system inputs as well as the anticipated outputs based on hypothesized appreciation for the developed system. A value stream map is applied for this conceptual approach as it presents an approach that will validate all the information from the system users. It is shown in Figure 2.4.

Figure 2.4: Conceptual Framework for Criminal Hotspot Identification

Chapter Three: Research Methodology

3.1 Introduction

The research is aimed at finding out how real-time location-based crime reporting can be implemented in Kenya using crowdsourcing technology to identify crime hotspots in the country. The methods used for conducting the research and their viability are described in this chapter. The target population, the sample size used in the research, data collection procedures and analysis of the results obtained are also discussed. Furthermore, this chapter covers the approaches applied in system analysis, system architecture, system design, system development, and implementation and testing.

3.2 Agile Software Development Methodology

Figure 3.1: Agile Software Development

The agile software development method allows for faster iteration and more frequent releases with subsequent user feedback. Agile processes provide release schedules and user feedback opportunities that allow faster and more controlled improvements. This methodology also allows for repeated improvements on the different modules of the prototype based on the success of the research and the discovery of new technologies to improve the functionality of the anticipated system. Above all, it enables the researcher to better define the system requirements as the process is done incrementally (Lu & DeClue, 2011). Figure 3.1 shows the steps followed in the research to achieve the set objectives for this thesis.
The first step was requirements, which involved collecting the intended product specification or features and specifying what the product should do or how it should do it. The second step was architecture and design, which involved defining the architecture and design of the system. The third step was development, which involved implementation of the system. The fourth step was test and feedback, which allows for product improvement (Krasteva & Llieva, 2008). The developed model was tested independently during every development iteration. The data flow between the different components was also tested to ensure complete test coverage. The purpose of testing the model was to make sure that the needed functionalities work as required.

3.3 Research Design

Research design is the conceptual structure within which research is conducted, and it guides the collection, measurement and analysis of data (Lewis, 2015). Research design has been used in planning the methods adopted for collecting relevant data and the techniques used in data analysis, keeping in view the objective of the research and the availability of resources. The research incorporates qualitative and quantitative methods of research. The objective of qualitative research is to gain an enhanced understanding through truthful reporting, first-hand experience, and citations of actual conversations. This was used to understand the current platforms and processes being used in crowdsourcing of data by the different organizations applying the technology. The quantitative research was used to determine the number of people who would like to participate in crowdsourcing for the purpose of reporting criminal hotspots in Kenya.

3.3.1 System Analysis

There are three approaches to information system development: data-oriented, process-oriented, and object-oriented.
The object-oriented method, unlike its two predecessors that lay emphasis either on data or on process, combines processes and data into single entities called objects. It incorporates the concept of encapsulation, which is the main aspect emphasised by most object-oriented languages. Encapsulation is the association of methods and data members (objects) into analogous groups referred to as classes. From these classes, the members can then be called (referenced) by an instance of the class (object) in order to minimize compiling deadlocks (Smith, 2015). Object-oriented Analysis (OOA) is the concept used in this research. OOA deepens the understanding of problem domains because it promotes a smooth transition from the analysis phase to the design phase and offers a more natural way of establishing specifications. This study focuses on use-case modelling and class modelling to explore the various approaches that are conducted in the analysis of the system. In the object-oriented system development life cycle, use-case modelling is established in the analysis phase. Use-case modelling is done in the initial stages of system development to help the developers gain a clear understanding of the functional requirements of the system without worrying about how those requirements will be implemented. A use-case model consists of use cases and actors. An actor is an external entity that interacts with the system, and a use case denotes a sequence of interrelated activities initiated by an actor to achieve a precise objective. In Unified Modelling Language (UML) designs used in object-oriented system design, use case diagrams encapsulate methods. However, the UML schemes used to demonstrate the different aspects of the crowdsourcing system also incorporated deployment diagrams, entity relation diagrams, class diagrams and data flow diagrams.
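The notion of encapsulation described above can be illustrated with a minimal Python sketch. The class name `CrimeReport`, its fields and its validation rules are hypothetical and introduced here only for illustration; they are not taken from the prototype's design. The sketch shows data members and the methods that act on them grouped into one class and accessed through an instance of that class.

```python
from typing import Tuple


class CrimeReport:
    """Hypothetical class: groups report data with the methods that act on it."""

    def __init__(self, category: str, latitude: float, longitude: float) -> None:
        # Validation lives inside the class, so callers cannot create
        # a report with out-of-range coordinates.
        if not -90.0 <= latitude <= 90.0:
            raise ValueError("latitude must be between -90 and 90")
        if not -180.0 <= longitude <= 180.0:
            raise ValueError("longitude must be between -180 and 180")
        self._category = category
        self._latitude = latitude
        self._longitude = longitude

    @property
    def category(self) -> str:
        """Read-only access to the crime category."""
        return self._category

    def location(self) -> Tuple[float, float]:
        """Return the report position as a (latitude, longitude) pair."""
        return (self._latitude, self._longitude)


if __name__ == "__main__":
    # The coordinates below are illustrative values only.
    report = CrimeReport("theft", -1.2921, 36.8219)
    print(report.category, report.location())
```

Callers interact with the object only through `category` and `location()`, which is the sense in which the class members are referenced by an instance of the class.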
It is necessary to model all the components of the system so as to ensure that deployment and development are in sync, an aspect that makes system implementation quite easy and efficient (Larman, 2012).

3.3.2 System Design

Object-oriented design (OOD) techniques were used to refine the object requirements definition identified during system analysis and to define design-specific objects. A design class diagram was used to form a general conceptual model of the software. The design class diagram also entails comprehensive modelling to translate the models into programming code and for data modelling. The research adopted the design class diagram to capture the classes that comprise the main methods, objects and interactions of the system. The relationship between the methods and the objects was represented in the class diagram, the sequence diagram and the component diagrams.

3.4 Target Population and Sampling

Draugalis and Plaza (2009) define population as the total of all subject items under consideration for a study. This study covered different software development companies in Nairobi, as well as the public. The selection of the target population was informed by the strategic location of Nairobi. Furthermore, most if not all software companies in Kenya have their headquarters in Nairobi. The study targeted the public because the data used in the system would be sourced from members of the public participating in the real-time reporting of criminal activities in their region. According to Baruch and Holtom (2008), sampling is the process of obtaining information about a given population by examining a part of it. A sample should be as representative of the population's characteristics as possible, with no bias, so as to result in valid, reliable conclusions.
The study was conducted through random sampling of the respondents from different suburbs within the city, as well as software companies located within the central business district for ease of access. Random sampling has the advantage of obtaining as diverse a picture as possible of the variables being tested, and it supports reliability, validity and objectivity.

3.5 Sample Size

The researcher could not determine the exact number of respondents from the public and the software development institutions in Nairobi, which led to the choice of a sampling method that suits a large, infinite population. To eliminate bias in the sample chosen, it is often important to apply an effective sampling formula that takes into account the sample space as well as the distribution of the research population or potential respondents (Dattalo, 2008). An infinite population, for sampling purposes, is one where the population is greater than 50,000. According to Kjær, et al. (2007), the sample size for such a population can be determined using Equation 3.1.

\[ SS = \frac{Z^{2} \, P \, (1 - P)}{C^{2}} \]

Equation 3.1: Infinite Sampling Procedure

Where SS = sample size, Z = Z-value, P = percentage of the population likely to respond, and C = confidence interval.

Once the sample data are collected and the sample mean \bar{x} is calculated, the sample mean generally differs from the population mean \mu. The difference between the sample mean and the population mean is the error E.

\[ E = z_{\alpha/2} \left( \frac{\sigma}{\sqrt{n}} \right) \]

This can be solved for n (the sample size) to determine the minimum sample size needed to assure a given level of confidence and maximum error allowed.

\[ n = \left( \frac{z_{\alpha/2} \, \sigma}{E} \right)^{2} \]

The following values were taken for this study: a 95% confidence level (Z-value of 1.96) and a confidence interval of 0.05.

3.5 Data Collection Methods

Research is a methodical process full of different procedures and generally accepted principles as well as formulae. Data collection methods are those techniques that are used for gathering research data.
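As a worked illustration of the sample-size formulas given under Section 3.5 Sample Size, the following minimal Python sketch evaluates both expressions. The function names are hypothetical, and the value P = 0.5 is an assumption made here for illustration (it is the most conservative choice, since it maximises P(1 - P)); the text itself states only the Z-value of 1.96 and the confidence interval of 0.05.

```python
import math


def infinite_population_sample_size(z: float, p: float, c: float) -> int:
    """Return SS = Z^2 * P * (1 - P) / C^2, rounded up to a whole respondent."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if c <= 0.0:
        raise ValueError("c must be positive")
    return math.ceil((z ** 2) * p * (1.0 - p) / (c ** 2))


def sample_size_for_mean(z: float, sigma: float, error: float) -> int:
    """Return n = (z * sigma / E)^2, rounded up.

    sigma is the population standard deviation and error is the
    maximum allowed difference E between sample and population mean.
    """
    if error <= 0.0:
        raise ValueError("error must be positive")
    return math.ceil((z * sigma / error) ** 2)


if __name__ == "__main__":
    # Z = 1.96 and C = 0.05 come from the text; P = 0.5 is an assumed value.
    print(infinite_population_sample_size(z=1.96, p=0.5, c=0.05))
```

Under these assumed inputs the first formula gives 385 respondents. This is a property of the formula with P = 0.5 and is not a figure reported in the study.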
These techniques are meant to ensure that the final data gathered is free from bias and is representative of the entire population intended for research. Data collection methods are often informed by the type of research being carried out (Sullivan-Bolyai, Bova, & Singh, 2014). Data was also collected using questionnaires. Questionnaires consist of a set of questions printed or typed in a definite format in a form (Draugalis & Plaza, 2009). The questionnaires were sent to the respondents with a request to answer them and return them in time. Respondents were expected to answer the questions by themselves and give very objective responses. The questionnaires were preferred as they allow the collection of large amounts of data and are manageable in terms of time and costs. Questionnaires are also more convenient for respondents, which encourages more honest answers.

3.6 Data Analysis

Analysis of data is a process of inspecting, cleaning, transforming and modelling data with the goal of highlighting useful information, suggesting conclusions and supporting decision making. The analysis offers an insight into the nature of the data collected and the implications for the rest of the research. The analysis presents the verdict of the responses from the mixed methods research undertaken (Dattalo, 2008). During data analysis, relationships or differences supporting or conflicting with original or new hypotheses should be subjected to statistical tests of significance to determine with what validity the data can be said to indicate any conclusions (Baruch & Holtom, 2008). This research utilized both qualitative and quantitative methods of data analysis. The Statistical Package for the Social Sciences (SPSS) was used in data analysis.

3.7 Research Quality

The issues raised and discussed in this chapter were obtained from reliable and authenticated sources.
The research is based on real-time identification of crime hotspots by the public and how the public feel about the proposal to develop a system that can aid in identifying crime hotspots, especially in the city. City dwellers participated in the research, offered their views on the system and commented on their willingness and commitment to use the system. This ensured that the development of the system is justified, and thus serves as a measure of quality guarantee. The research methods used were objective and in line with the laid-down procedures of the University, for reliability purposes. The research was supervised by experienced and peer-reviewed scholars from the University. This ensured that the outcome and process of the research present a standardized and acceptable methodology, leading up to the development of a prototype that offers solutions to the existing problem of criminal hotspot identification within the city. Ultimately, the researcher was bound by generally accepted research principles and metrics that guaranteed the quality of the work produced. The data collection procedures and methods discussed strictly followed the sampling method as outlined, to avoid biased results and to ensure objectivity and reliability.

3.8 Ethical Consideration

The researcher sought permission before using names and data of individuals during data collection. The research, however, aimed to steer away from using personal information concerning the respondents in any way. This was in line with data collection principles that call for total discretion in the process of research. The researcher explained the purpose of the study to all the participants, and all information that might affect a participant's decision was clearly stated. The participants were made aware of their input and the nature of data expected of them. All participants were of legal age, thus responsible for their own decisions, and were committed to the process only with their consent.
All personal information given by a participant, willingly or unwillingly, was treated as confidential and privately stored to avoid infringing on their privacy rights. Such information was not divulged to a third party nor used in any way without the consent of the participant. The researcher personally conducted all questioning and interviewing necessary, to ensure that the liability and risk of legal exposure lies solely with the researcher. Reference material used in the study was duly referenced in this research. This work contains only original work from the researcher. Where information from outside sources has been included, it is fully referenced and cited in-text to acknowledge the source of the data, statement or image. Plagiarism has thus been avoided in the development of this thesis.

Chapter Four: Data Analysis, Presentation and Interpretation

4.1 Introduction

In this chapter, the findings of the survey covering the different categories of respondents are presented. Indeed, gathering information on a technology phenomenon in Kenya is not easy. Efficient and effective processes had to be employed to guarantee participation. Generally, online tools such as Google Forms and emails were used extensively to gather the data from the respondents. The respondents had been classified into two groups: the public and software development companies in the city. This analysis and presentation of findings thus focuses on the responses from the two groups. The research was guided by specific objectives that were further compounded into the conceptual framework. The data gathered was from questionnaires and interviews. This analysis sums up the responses from the participants, as gathered using Google Forms.

4.2 Questionnaire Return Rate

The research had a sample of 100 participants, of which 90 were successfully reached to fill out questionnaires by means of Google Forms.
These respondents were chosen from both the public and the software companies involved in crowdsourcing applications. The questionnaires were distributed to the respondents at approximately the same time. Out of the 90 questionnaires sent out, a total of 74 forms were filled in by the respondents. This is congruent with the provisions of Mugenda and Mugenda (2003), who argue that as long as the response rate from a questionnaire is more than half of the submitted lot, the response rate is valid for making conclusions about the research. Figure 4.1 represents the response rate and validates its satisfactory nature.

Figure 4.1: Questionnaire Return Rate (returned questionnaires 82%, unreturned questionnaires 18%)

4.3 Sample Demographics

It is important to understand various aspects of the respondents that actually participated in the research. This can only be done by analysing their demographics. Important demographics about any group include the gender, the age, the level of education and experience on the subject matter. With regard to the subject matter, the concern was exposure to, or understanding of, crowdsourcing applications and platforms. Generally, all participants were asked questions that would lead to an understanding of who they were, and thus of their behaviour patterns. It was vital to get the right depiction of demographics even before assessing their feelings on the subject of crowdsourcing in identifying criminal hotspots in the city, so as to measure trends of use of the technology as well as the acceptance of a platform to endorse the technology if one were provided. These demographics would also assist in the use of the research analysis techniques that would be applied on the data.
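As a check on the return rate reported in Section 4.2, the arithmetic can be sketched in a few lines of Python. The function names are hypothetical; the counts (74 returned out of 90 distributed) and the more-than-half adequacy rule attributed to Mugenda and Mugenda (2003) are taken from the text.

```python
def response_rate(returned: int, distributed: int) -> float:
    """Return the questionnaire response rate as a percentage."""
    if distributed <= 0:
        raise ValueError("distributed must be positive")
    if not 0 <= returned <= distributed:
        raise ValueError("returned must be between 0 and distributed")
    return 100.0 * returned / distributed


def is_adequate(rate_percent: float, threshold_percent: float = 50.0) -> bool:
    """Apply the more-than-half rule cited in the text."""
    return rate_percent > threshold_percent


if __name__ == "__main__":
    rate = response_rate(returned=74, distributed=90)
    print(f"{rate:.1f}%", is_adequate(rate))
```

With the counts from the text, the rate is 74/90, or about 82.2%, which agrees with the 82% share shown for returned questionnaires and exceeds the 50% threshold.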
4.3.1 Gender

Although gender balance was not required for the research, it was an important social statistic that would predict platform usage for the prototype that would later be developed as part of the research objectives. As such, all respondents were asked to indicate their gender. Regardless, the respondents were not chosen based on their gender but by their role in the research. The gender split was purely coincidental, emerging from the natural distribution of the respondents. The ratio of male to female participants was generally fair and reasonably unbiased. The results of the gender analysis are represented in Table 4.1.

Table 4.1: Gender of the Participants

Tool Used        Gender    No. of Respondents    Rate (%)
Questionnaire    Male      42                    56.76
Interview        Male      6                     8.1
Questionnaire    Female    22                    29.7
Interview        Female    4                     5.4
Total                      74                    100.0

Respondents of both genders participated. Out of the 74 respondents participating in the research, 6 male respondents were interviewed, while 42 were given questionnaire forms to fill in using the Google Forms platform. 22 female respondents also participated in the Google Forms questionnaire survey, while only 4 female respondents were interviewed. A graphical representation of these statistics is shown in Figure 4.2.

Figure 4.2: Gender of the Respondents (series: male questionnaires, female questionnaires, male interviews, female interviews)

4.3.2 Age Bracket of the Respondents

The age demographic was important because it would represent the age category of the participants and attach the feelings and suggestions of the respondents to their age. Indeed, given that the research could not capture all age brackets, it would detail the category of respondents that gave these particular views concerning the research and the outcome of the study as realized in this chapter. For purposes of this study, three major age brackets were considered: below 30 years, between 31-45 years and above 45 years.
The goal was not to estimate the exact age of the respondents but to get an estimate of the feeling of the respondents from the different age brackets. The responses from the participants concerning their age estimates are indicated in Table 4.2.

Table 4.2: Age Bracket of the Respondents

Age Bracket (Yrs)    Number    Rate (%)
Below 30 years       42        56.8
31-45 years          22        29.7
Above 45 years       10        13.5
Total                74        100.0

The respondents in this research were mostly below 30 years (56.8%). Only about 14 percent (13.5%) of the respondents were above 45 years. The rest of the respondents (29.7%) were aged between 31-45 years. This meant that the inferences drawn from the questionnaires and interviews would be mostly from young respondents. These were the majority of persons either working for leading software companies in the city or actively involved in crowdsourcing platforms, especially on social media channels. This analysis is depicted in Figure 4.3.

Figure 4.3: Age Bracket of the Respondents (below 30 years 57%, 31-45 years 30%, above 45 years 13%)

4.3.3 Level of Education

The importance of level of education to the study was to ascertain the competence levels of the respondents. Education is an indicator of exposure as well as the practicability of the responses gathered. For a sample of highly educated respondents, the responses are equally of high value to future research inferences. It was also important to ask this question before seeking their exposure to crowdsourcing platforms, as it would indicate the quality level of the data they would offer the research. Indeed, it would be important to work with educated or exposed respondents to realise the anticipated platform for criminal hotspot reporting using crowd-sourced data.
This is demonstrated in Figure 4.4.

Figure 4.4: Participants' Level of Education
[Bar chart categories: Technical Training, Tertiary College, Varsity Level, Post-graduate Level]

From the findings of the research, it was determined that 21.4% of the respondents had some technical training that did not reach the level of college or university. University-level education had been attained by 34.5% of the respondents. The research also determined that 7.2% of the respondents had some post-graduate training, while 36.9% had college/tertiary level education. Generally, the education levels of the participants in the research were quite high. This meant that the likelihood that they would offer quality input and support for the crowdsourcing platform was equally high.

4.3.4 Exposure to Crowd-sourcing Platforms

Upon determining the level of the participants’ education, it was necessary to ask them about crowdsourcing platforms to gauge their understanding of the subject matter. The question was phrased as a Likert scale of 1-5 on which respondents indicated how exposed they were to such platforms. The question received mixed responses from the 74 participants. Table 4.3 indicates the responses gathered from the participating groups in the research.

Table 4.3: Participants’ Exposure to Crowdsourcing Platforms

Exposure Level            Number of Respondents    Rate (%)
Very Exposed (5)          25                       33.8
Exposed (4)               24                       32.4
Moderately Exposed (3)    11                       14.9
Less Exposed (2)          9                        12.2
Unexposed (1)             5                        6.7
Total                     74                       100

On the matter of exposure to crowd-sourcing platforms, the responses can be classified as follows: 66% of the respondents were properly aware of and engaged in the use of the platforms. About 15% of the respondents were aware of the presence of these platforms and their use. 12% were only aware that crowdsourcing platforms existed but had never used them before.
Only about 7% of the respondents were totally unaware of crowdsourcing platforms. This meant that about two thirds of the respondents (49 participants) could actually be of help in making the platform a reality. The next set of questions was geared at gauging their willingness to participate in system development and testing, as well as implementation.

4.3.5 Willingness to Support Crime-sourcing Platform

After information was gathered about their awareness of the crowdsourcing platforms in use, those who were unaware were given some background information in the interviews. Emails sent to the respondents also indicated the purpose of the study so as to familiarize the participants with the concept of crowdsourcing. This was necessary for everyone to have full knowledge of the prototype that would be discussed in this question. Respondents were then asked about their willingness to participate in the use and testing of a prototype in any way. The answers sought were either ‘yes’ or ‘no.’ Figure 4.5 indicates the willingness of the participants to participate in the use of the prototype.

Figure 4.5: Willingness to Support Crime-sourcing Platform
[Chart legend: Willing, Unwilling, Unsure]

4.4 Conclusion

The findings of the data collection process paint a very positive picture of the participants with regard to their demographics. The respondents also showed an inherent willingness to participate in the testing and use of the prototype if one were developed. The respondents were mostly educated participants who would be very influential in promoting the use of the platform among other city dwellers. The next level of the research would pertain to the design of the said system and the implementation of a prototype.
It was important to get many participants willing to partake in the use of the system, as the same participants would be used to test the developed prototype and propose areas for improvement in view of the final system that would be developed if the prototype were endorsed by the different stakeholders in the country.

Chapter 5: System Architecture and Design

5.1 Introduction

This chapter examines the logical and physical setup of the system to be developed. The design of the system takes into consideration various factors that are necessary to ensure effective delivery of a crowd-sourcing platform. Generally, the system to be developed was based on effective control of data about certain phenomena and the ability to get the right respondents to test the system. The designs of the system were, however, necessary to model an efficient prototype. Developing a complete system would require a precise understanding of social media application programming interfaces. However, given that the aim was to test and affirm the functionality of the system, such an approach would be too costly for the intended purpose. As such, a web-based design would incorporate common tools to capture data and allow respondents to offer information concerning their location and witnessed crimes. The rest of this chapter models the system using object-oriented tools available in the Unified Modelling Language (UML). This is because the prototype takes the form and design of an object-oriented system, with significant encapsulation and abstraction from the user.

5.2 System Architecture

The architecture of a system presents the interactions between its inputs, processes and anticipated outputs. Figure 5.1 represents the system architecture of the crowd-sourcing prototype that will be used to identify crime hotspots in the city.
The goal of the system architecture diagram is to represent the general flow of information from one tool or component of the system to the next. It models all the steps data has to undergo before being fully processed by the system. The architecture diagram is also the blueprint from which most of the other system diagrams are modelled.

Figure 5.1: System Architecture

For the optimal running of the system, there is need for both local and remote access to the system. The users are expected to interact with the system’s dashboard candidly and only report or publish data that is of relevance to the system. As a validation measure, users shall be restricted to the acceptable input types, and thus most of the information entering the system shall be anticipated in advance. New users shall be met with prompts for different information and dialog boxes with options. Radio buttons and dropdown lists shall be employed in the system as well. The front-end and back-end (database) are connected through a series of servers that implement data-mining algorithms. The effective result of the users’ input is the generation of statistics on criminal hotspots in the city. Each input thus alters the existing database and changes the analytical perspective of the system on criminal hotspots in the city county.

5.3 System Design

The proposed crowdsourcing platform for identifying criminal hotspots in Nairobi City County is designed to support proper and reliable data collection from varied user sources, while forming a formidable criminal database with effective analytical tools. To this end, several diagrams based on the Unified Modelling Language were necessary to ensure proper design of the model for structured development and adherence to the waterfall model of the system development life cycle. Several diagrams are presented in the rest of this section.
They form the collective design approach taken in the development of the prototype for the crowdsourcing system. The first of these diagrams drawn to UML standards is the use case diagram shown in Figure 5.2. The diagram illustrates the different players in the system and the relationships between them.

5.3.1 Use Case Diagram

The use case diagram models all system use cases as anticipated in the actual design and development of the system. All actors and the methods they interact with are represented in Figure 5.2. The diagram shows the actors, the functions and the relationships modelled by the prototype, as well as the different system access cases (use cases).

Figure 5.2: Use Case Diagram

Figure 5.2 shows the design of the use cases the system should have. Users take on different roles as they use the system. They can choose to offer information as anonymous new users, but then have no access to crime databases. They can also choose to be web users who simply comment about crimes and leave notes, or register to be users who can offer credible intelligence on criminal activities in the city. It is also vital that security stakeholders be represented. From the portal view, they can see criminal hotspots that may or may not have attracted police attention. The last user (actor) is the system administrator, who is required for the consistent functioning of the system.

5.3.2 System Sequence Diagram

A sequence is a series of steps from the initial input to the output the system gives out. Alongside the information is a series of methods (functions) that aid in system processes.
The sequence diagram for this particular system is important as it contains the relevant information on the methods required for proper system functioning and necessary for the development of suitable crowdsourcing algorithms to be applied in the system design at the end of the research project. Generally, the sequence diagram in Figure 5.3 depicts the processes involved in using the system to achieve the desired outcome and eventually achieve full classification of criminal hotspots in the city.

Figure 5.3: Sequence Diagram

5.3.3 Class Diagram

The class diagram represents the encapsulated members and member functions necessary for modelling an efficient system. Each class encapsulates its unique set of functions that are critical to the functioning of the system. Generally, all classes shall be collated for the different views that the system presents to the user. The methods in the class design model the various algorithms and functions applied in the functioning of the system. Figure 5.4 depicts the class diagram for the Crowdsourcing System for identifying crime hotspots in Nairobi City County.

Figure 5.4: Class Diagram

5.3.4 Entity Relation Diagram

The entity relation diagram models the different tables in the system as well as the relationships between these tables that support the effective running of the system. The table is the most basic component of the database. As such, it is important to have all relations properly designed in order to efficiently design the database application for the system. Indeed, the development of the database is important for this system as it represents the main data capture tool before the anticipated analysis of the crowd-sourced data. The prototype takes advantage of the relations in the database SQL schema. Figure 5.5 represents the entity relation diagram as modelled into the Crowdsourcing platform for identifying crime hotspots in the city.
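As a rough illustration of how relations of this kind might be expressed in an SQL schema, the sketch below builds a small in-memory database from Python. It is not the prototype's actual schema: the table names (users, crime_reports, comments), the column names and the helper functions are all hypothetical, chosen only to mirror the actors and report data described in this chapter, and SQLite stands in for whatever SQL engine the prototype used.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not taken from the thesis. Roles mirror the actors in the use case diagram.
SCHEMA = """
CREATE TABLE users (
    user_id  INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    role     TEXT NOT NULL CHECK (role IN
             ('anonymous', 'web_user', 'registered', 'security', 'admin'))
);
CREATE TABLE crime_reports (
    report_id   INTEGER PRIMARY KEY,
    user_id     INTEGER REFERENCES users(user_id),
    crime_type  TEXT NOT NULL,
    latitude    REAL NOT NULL,
    longitude   REAL NOT NULL,
    reported_at TEXT NOT NULL
);
CREATE TABLE comments (
    comment_id INTEGER PRIMARY KEY,
    report_id  INTEGER NOT NULL REFERENCES crime_reports(report_id),
    user_id    INTEGER REFERENCES users(user_id),
    body       TEXT NOT NULL
);
"""


def build_database():
    """Create an in-memory SQLite database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the table relationships
    conn.executescript(SCHEMA)
    return conn


def reports_per_crime_type(conn):
    """Count reports per crime type, the kind of query a chart could be built on."""
    rows = conn.execute(
        "SELECT crime_type, COUNT(*) FROM crime_reports "
        "GROUP BY crime_type ORDER BY crime_type"
    )
    return dict(rows.fetchall())
```

A query such as reports_per_crime_type is one plausible shape for the auto-generated queries that feed the system's charts; a real deployment would likely add indexes and stricter constraints.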
Figure 5.5: Entity Relationship Diagram

The entity relationship diagram in Figure 5.5 shows the various entities of the proposed crime hotspot tracking and reporting system. These entities are later transformed into actual classes during the prototype design. The class diagram is modelled in a similar fashion for easier encapsulation. The overall design of the system is based on the ability to model the different entities into actual objects and classes that encapsulate the different functions (methods) and algorithms within the system. This reflects the object-oriented nature of the system.

5.4 Database Design

The database schema is modelled in SQL, based on the entity relation diagram represented in Figure 5.5. The database should be developed from a modelled schema that employs Structured Query Language. The system shall be designed to ensure that the underlying database is able to capture the different aspects reported about crimes, as well as the comments from the different system users, in order to develop charts from auto-generated database queries. In general, the design of the database shall include the underlying algorithms for crowdsourcing and the representation of the specific significant information concerning criminal hotspots in the city. Data shall be validated using input masks and drop-down menus where necessary, to minimize unnecessary information and to guarantee that the information captured suffices for analysis of the different crime hotspots in the city. Eventually, the reports shall consistently show the regions and even streets in the city that bear the greatest criminal risk to dwellers and thus require significant police attention.

5.5 Conclusion

The design of the system should guide the modelled system to ensure congruence in the development. Generally, the development stage shall ensue on completion of the design.
This will require ample input from the research participants, as the testing stage shall be deemed successful if the information or suggestions offered at the pre-development stage are well implemented. The system design, however, is the guiding tool for the development of the final system and represents all the functionalities of the prototype. As such, as the waterfall model suggests, further design may be necessary as new developments are made in the system construction (engineering) phase of the project.

Chapter 6: System Testing and Implementation

6.1. Overview

After development, the system was tested to confirm whether it met the required specifications. Myers and Rosson (2000) outline the need to spend time on user interfaces and system usability, stressing that doing so increases the efficiency with which end users achieve their desired results from the system. This chapter therefore focuses on the tools, requirements and functionalities achieved while developing the prototype.

6.2. Description of the Test Environment

A possible environment for testing the developed system would have been a city estate or suburb where crime prevalence was high enough that a report could be expected on a daily basis. This was not feasible, since there is no common channel through which to seek such information from private citizens in the city. Besides, for purposes of the prototype development activity, it was not necessary to involve the city dwellers, as there was already a sample team of research participants willing to test the developed prototype and present test cases for documentation. An idealized environment was therefore adopted for the test.

6.3 Prototype Development Environment

An integrated development environment was utilized for the development of the prototype.
This environment combined a framework for building PHP websites with other development languages, namely JavaScript, SQL, CSS and HTML. For this reason, the XAMPP package was utilized.

6.4. Model Components

6.4.1. Login into the system

In order to log into the system, the user has to input their username and password in the login prompt and press the return key (Enter) or the [Login] button. This ensures that only authorized users are granted access. Upon entry of the user details, the system internally verifies the credentials and then directs the user to the section of the system they are authorized to access. The login prompt is shown in Figure 6.1.

Figure 6.1: Login Prompt

Once a user successfully logs in to the system, a reporting dashboard page showing the main processes appears. This dashboard gives the user the options to report a crime, to view crime hotspot areas and to view common crimes in various regions. A snapshot of the interface is shown in Figure 6.2.

Figure 6.2: Main System Dashboard

6.5. System Modules

The modules developed with the system include:

- A user module for reporting crime incidents in the city.
- A crime information module that draws on information from user reports and records common statistics about criminal activities in the city.
- A general information module that consistently updates general facts the user needs to know and understand about the city.

6.6. System Testing

In order to test the system, users were asked to use the prototype for a week and give feedback on the three modules that the system offers. The responses anticipated from these users were ‘satisfactory’ or ‘unsatisfactory’ for each module used.
The modules were tested by the research respondents and the results are presented below.

6.6.1 Test of the Crime Reporting Module

The test of this module was done by banking sector respondents. Of the ten users involved in the testing, three thought that the system did not achieve all functional requirements and was thus unsatisfactory. The remaining seven affirmed that the system was satisfactory and that they would use it where necessary. This is represented in Figure 6.3.

Figure 6.3: Tests Done During the System Testing
[Bar chart "Crime Hotspot Network test results"; categories: Approval to use system, Disapproval of system]

6.6.2 Test of the Crime Information Module

The crime information module was tested by members of the public who had participated in the previous data collection. For this particular module, the ten respondents from the previous data collection team were asked to give their views on the performance of the system. Those willing to use the system further were asked to indicate ‘approve’ while those who did not wish to use the system further were asked to indicate ‘disapprove.’ The responses are represented in Figure 6.4.

Figure 6.4: Testing Validity of the Crime Information Module
[Chart values: Approve 90%, Disapprove 10%]

6.6.3 Test of the General Information Module

The general information module, developed to gather crime statistics that were not necessarily given by the users, was also tested. For this test, the users were asked to approve or disapprove the module, as with the other modules in the system. The response was not graphed, as all ten respondents approved of the development and design of this module. This meant that the module would likely feature in the final system, if one were developed.

Chapter 7: Conclusions and Recommendations

7.1 General Research Conclusions

The research done in this project entailed both primary and secondary data.
Primary data was sourced from different security stakeholders in the city who have interacted with or experienced criminal activities before. These included city dwellers (sourced from different parts of the city) and information technology participants who were involved in commercial application development (programmers). From the findings, there was need for an effective crime hotspot monitoring system to monitor and guarantee the security of city dwellers in Nairobi, Kenya. The proposals by stakeholders on the features to include in the crime hotspot monitoring system informed the design and development of the system. However, there was need for more study on how best to implement the system in a holistic manner, covering other towns in Kenya interconnected via a common crime tracking database.

The research revealed a lot of enthusiasm among the stakeholders to implement a new system. Many of these stakeholders were willing to support the implementation of the new system and its use in the security sector of the country. The research from past literature also revealed that although crime monitoring systems were constantly monitored, breaches of security in the city still occurred from time to time. Although some towns in Kenya have installed security systems in their streets, they are not able to effectively monitor issues such as identity theft, common muggings and carjacking. City dwellers are thus often robbed by people who study and manipulate the existing systems, taking advantage of their weaknesses. The research also revealed that while police stations and security systems did not offer all the services demanded by the city dwellers, they provided most of the services required by civilians, which include crime reporting, tracking criminals and offering intelligence information on different security risks in the country, especially within the capital city, Nairobi.
Secondary research also indicated that police collusion with criminals posed the greatest risk to public safety and security.

7.2. Achievement of Research Objectives

7.2.1. Identifying the Factors that Influence Identification of Crime Hotspots in Nairobi

The research determined that crime hotspot determination in Kenya requires the user to input validation details for login and be part of the system. These details are prompted from the user whenever they seek to use the network, which makes the network less susceptible to inaccurate reporting. The user is required to input a PIN that validates his or her login information. The user then gets an interface that allows them to view the dashboard. The research also noted that the identification of crime hotspots was based on common stereotypes about crime in the city in general. Identification of criminal hotspots also required accurate information about common statistics on the criminal activities in question. Factors such as the nature of the crime, the targeted group and the times at which the crimes took place would be a good measure of prevalence for the hotspot areas.

7.2.2. Reviewing Existing Techniques used in Identifying Crime Hotspots

The research revealed that there are currently algorithms used for predicting criminal activities in some parts of the world. These algorithms assist in efforts such as security protocols for Very Important Person (VIP) security, tracking of common criminal networks and the determination of predictable terror targets to look out for in the city. Generally, there was significant data and previous information on how best to develop a crime hotspot system in the city. Techniques such as determining the accurate location of respondents using geo-positioning systems on smartphones and other gadgets are already existing technology.
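To make the idea of measuring prevalence from geo-positioned reports more concrete, the sketch below groups reports into grid cells and flags cells with several reports as candidate hotspots. This is only a minimal illustration and not the algorithm developed in this study: the grid resolution, the report threshold, the field names and the function names are all assumptions made for the example.

```python
import math
from collections import Counter

CELL_SIZE_DEG = 0.01  # assumed grid resolution (roughly 1 km of latitude)
MIN_REPORTS = 3       # assumed number of reports needed to flag a cell


def grid_cell(lat, lon, cell=CELL_SIZE_DEG):
    """Snap a coordinate to the integer index of the grid cell containing it."""
    return (math.floor(lat / cell), math.floor(lon / cell))


def find_hotspots(reports, min_reports=MIN_REPORTS):
    """Return {cell: report count} for cells with at least min_reports reports.

    Each report is a dict with 'lat' and 'lon' keys (hypothetical field names).
    """
    counts = Counter(grid_cell(r["lat"], r["lon"]) for r in reports)
    return {cell: n for cell, n in counts.items() if n >= min_reports}
```

The same counting could be keyed on crime type or hour of day as well as location, which would reflect the other factors listed above; a fixed grid is simply the easiest variant to show.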
As such, the goodwill to implement the crime hotspot system in the country rested significantly on appreciating that such systems do exist in other regions around the world, only that they are not as widely used due to the low crime rates in these regions.

7.2.3. Developing a Crowd-Based Solution for Crime Hotspot Reporting

The crowd-based system for crime hotspot reporting is based on the presumption that users have established patterns in their use of online-based systems. The study of these patterns then sets precedents for the evaluation of crimes that are common and can thus easily be noted to have taken place in a specific area in the city. Generally, it would suffice to have these crime hotspots known beforehand and anticipated, as the system was only a representation of the existing situation in the city and not a way to map an imaginary issue. The crowd-based hotspot detection system was designed, developed and implemented effectively. The work was evaluated and tested among several security stakeholders, on their portal. Indeed, as anticipated, the system was able to capture user precedents and behaviour patterns, utilising these to determine when a crime was noted and reported in different constituencies in the city. This system was thus proposed for use among security experts, especially the police, to monitor criminal hotspots and institute measures to deal with the crimes.

7.2.4. Validating the Proposed Solution

Validation of the crowd-based system for crime hotspot detection was done through various testing exercises, carried out with modular expectations in mind. The system was tested against expectations such as possible anomalies, detected client behaviour modelling systems and possible system errors in input validation. These tests were done before the final presentation of the system.
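A check for input validation errors of the kind mentioned above might look like the following sketch. It assumes, purely for illustration, that a report arrives as a dictionary with crime_type, lat and lon fields and that the crime type must come from a fixed list of drop-down options; neither the field names nor the option list is taken from the prototype.

```python
# Assumed drop-down options; the prototype's actual list is not documented here.
ALLOWED_CRIME_TYPES = {"mugging", "carjacking", "identity theft"}


def validate_report(report):
    """Return a list of validation errors; an empty list means the report is acceptable."""
    errors = []
    if report.get("crime_type") not in ALLOWED_CRIME_TYPES:
        errors.append("crime_type must be one of the listed options")
    lat, lon = report.get("lat"), report.get("lon")
    if not isinstance(lat, (int, float)) or not -90 <= lat <= 90:
        errors.append("lat must be a number between -90 and 90")
    if not isinstance(lon, (int, float)) or not -180 <= lon <= 180:
        errors.append("lon must be a number between -180 and 180")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a form show every issue to the user at once, which suits an interface built around prompts and drop-down lists.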
The development of the system and the validation of the modules were done in a cyclic manner, taking into consideration the different aspects of the development of the system, including the user requirements that needed to be met for the completed prototype to serve the different purposes anticipated by the research participants.

7.3. Conclusions on the Design and Implementation of Criminal Hotspot Network System for City Dwellers

The research showed that a crowd-based system can be used for crime hotspot reporting and criminal activity detection. This system was designed using the parameters used in a police crime incident reporting system. These parameters were then used to develop the system and improve on its functionalities. Since the input parameters on the system were similar to the inputs expected by the system, the design of the system was quite flexible and appealing. It did not require many improvements to the crime hotspot network interface, and neither did it require an overhaul of existing criminal database systems. The use of server-side web development made it possible for the system to efficiently track user data without the user’s knowledge. This meant that common criminals could be trailed gradually and eventually apprehended.

7.4 Challenges Realized in the Study

The study encountered several challenges. First, there were considerations about alternative methodologies to implement the system. Because such a system must be highly secured, it was not easy to have an interface or underlying network that would guarantee security, and neither were the existing crime databases available for use as a design and testing case. The systems used by the Kenyan Police department in the city (the focus of the study) were not designed and implemented in a similar fashion in all other towns in the country. It was thus possible that the system would not be applicable in other towns.
With regard to user training and implementation, the research established that many people in the city and the security sector in general were not aware of the basic logical functionalities of the crime hotspot reporting system. As such, the implementation of the system would also require a lot of input from the contractors involved in installing the different location sensors as well as the tracking systems that would be required for full functioning of the system. Given that some of these contracts did not involve maintenance costs, this may pose a threat in the future, where the system supplier may be unavailable or indisposed. From the literature studied, many city dwellers are also resistant to technological change of this nature, especially in reporting crimes, as they fear victimization. It would thus take time to convince city residents, and the country in general, to adopt the crowd-based hotspot reporting system.

7.5. Recommendations for Further Study

The areas noted for further research include:

- research into and improvement of the design of crime reporting systems in Kenya to incorporate biometric recognition;
- the application of cloud systems to the field of intelligence, to integrate ultra-modern anomaly detection systems from across the world with the crime use cases in Kenya;
- security of city centres using facial recognition patterns, as opposed to the other biometric and cryptic systems used;
- the involvement of criminal behaviour experts in the design of better anti-fraud systems for human-computer interaction between the police and the public.

Such research will revolutionize the use of crime reporting and detection systems in Kenya. In so doing, there will be better designs and proliferation of the criminal hotspot reporting system for the users. City security is likely to improve with research.
Once this happens, security will get more value for money for the installed crime reporting network systems that will incorporate system users spread across the country.

References

Aten, K., & Thomas, G. F. (2016). Crowdsourcing strategizing: communication technology affordances and the communicative constitution of organizational strategy. International Journal of Business Communication, 53(2), 148-180.

Bailard, C. S., & Livingston, S. (2014). Crowdsourcing accountability in a Nigerian election. Journal of Information Technology & Politics, 11(4), 349-367.

Bamberger, J., Geßler, A. L., Heitzelmann, P., Korn, S., Kahlmeyer, R., Lu, X. H., ... & Kretz, T. (2015). Crowd research at school: Crossing flows. In Traffic and Granular Flow '13 (pp. 137-144). Springer, Cham.

Baruch, Y., & Holtom, B. C. (2008). Survey response rate levels and trends in organizational research. Human relations, 61(8), 1139-1160.

Bollier, D., & Firestone, C. M. (2010). The promise and peril of big data (p. 1). Washington, DC: Aspen Institute, Communications and Society Program.

Bollen, J., Mao, H., & Zeng, X. (2011). Twitter mood predicts the stock market. Journal of computational science, 2(1), 1-8.

Byrne-Evans, M., O'Hara, K., Tiropanis, T., & Webber, C. (2013, May). Crime applications and social machines: crowdsourcing sensitive data. In Proceedings of the 22nd International Conference on World Wide Web (pp. 891-896). ACM.

Colorado, J., Mondragon, I., Rodriguez, J., & Castiblanco, C. (2015). Geo-mapping and visual stitching to support landmine detection using a low-cost UAV. International Journal of Advanced Robotic Systems, 12(9), 125.

Cordner, G. (2014). Community policing. The Oxford handbook of police and policing, 148-171.

Dattalo, P. (2008). Determining sample size: Balancing power, precision, and practicality. Oxford University Press.

DiGrazia, J., McKelvey, K., Bollen, J., & Rojas, F. (2013). More tweets, more votes: Social media as a quantitative indicator of political behavior.
PloS one, 8(11), e79449.

Draugalis, J. R., & Plaza, C. M. (2009). Best practices for survey research reports revisited: implications of target population, probability sampling, and response rate. American journal of pharmaceutical education, 73(8), 142.

Feng, Z., Zhu, Y., Zhang, Q., Ni, L. M., & Vasilakos, A. V. (2014, April). TRAC: Truthful auction for location-aware collaborative sensing in mobile crowdsourcing. In INFOCOM, 2014 Proceedings IEEE (pp. 1231-1239). IEEE.

Fish, A. (2013). Participatory television: Convergence, crowdsourcing, and neoliberalism. Communication, Culture & Critique, 6(3), 372-395.

He, S., & Chan, S. H. G. (2016). Wi-Fi fingerprint-based indoor positioning: Recent advances and comparisons. IEEE Communications Surveys & Tutorials, 18(1), 466-490.

Hester, V., Shaw, A., & Biewald, L. (2010, December). Scalable crisis relief: Crowdsourced SMS translation and categorization with Mission 4636. In Proceedings of the first ACM symposium on computing for development (p. 15). ACM.

Hudson-Smith, A., Batty, M., Crooks, A., & Milton, R. (2009). Mapping for the masses: Accessing web 2.0 through crowdsourcing. Social science computer review, 27(4), 524-538.

Kayrak, M. (2008). Evolving challenges for supreme audit institutions in struggling with corruption. Journal of financial crime, 15(1), 60-70.

Kjær, S. K., Trung Nam, T., Sparen, P., Tryggvadottir, L., Munk, C., Dasbach, E., ... & Nygård, M. (2007). The burden of genital warts: a study of nearly 70,000 women from the general female population in the 4 Nordic countries. The Journal of infectious diseases, 196(10), 1447-1454.

Krasteva, I., & Ilieva, S. (2008, May). Adopting an agile methodology: why it did not work. In Proceedings of the 2008 international workshop on Scrutinizing agile practices or shoot-out at the agile corral (pp. 33-36). ACM.

Kupatadze, A. (2012). Explaining Georgia's anti-corruption drive. European security, 21(1), 16-36.

Kwapisz, J. R., Weiss, G. M., & Moore, S. A. (2011).
Activity recognition using cell phone accelerometers. ACM SigKDD Explorations Newsletter, 12(2), 74-82. Larman, C. (2012). Applying UML & Patterns: An Introduction to Object Oriented Analysis & Design & Interative Development. Pearson Education India. LeBas, A. (2013). Violence & urban order in Nairobi, Kenya & Lagos, Nigeria. Studies in Comparative International Development, 48(3), 240-262. Lewis, S. (2015). Qualitative inquiry & research design: Choosing among five approaches. Health promotion practice, 16(4), 473-475. Lu, B., & DeClue, T. (2011). Teaching agile methodology in a software engineering capstone course. Journal of Computing Sciences in Colleges, 26(5), 293-299. Newman, T. P. (2017). Tracking the release of IPCC AR5 on Twitter: Users, comments, & sources following the release of the Working Group I Summary for Policymakers. Public Underst&ing of Science, 26(7), 815-825. Nkonya, E., Mirzabaev, A., & Von Braun, J. (Eds.). (2016). Economics of l& degradation & improvement: a global assessment for sustainable development. Springer Open. Ntalianis, K., & Tsapatsoulis, N. (2016, December). Wall-Content Selection in Social Media: A Revelance Feedback Scheme Based on Explicit Crowdsourcing. In Internet of Things (iThings) & IEEE Green Computing & Communications (GreenCom) & IEEE Cyber, Physical & Social Computing (CPSCom) & IEEE Smart Data (SmartData), 2016 IEEE International Conference on (pp. 534-539). IEEE. Ntalianis, K., Tsapatsoulis, N., Doulamis, A., & Matsatsinis, N. (2014). Automatic annotation of image databases based on implicit crowdsourcing, visual concept modeling & evolution. Multimedia tools & applications, 69(2), 397-421. Phua, C., Smith-Miles, K., Lee, V., & Gayler, R. (2012). Resilient identity crime detection. IEEE Transactions on Knowledge & Data Engineering, 24(3), 533-546. 50 Quercia, D., & Saez, D. (2014). Mining urban deprivation from foursquare: Implicit crowdsourcing of city l& use. IEEE Pervasive Computing, 13(2), 30-36. Sanga, C. 
A., Phillipo, J., Mlozi, M. R., Haug, R., & Tumbo, S. D. (2016). Crowdsourcing platform ‘Ushaurikilimo’enabling questions answering between farmers, extension agents & researchers. International Journal of Instructional Technology & Distance Learning, 10(13), 19-28. Sen, K., & Ghosh, K. (2018). Incorporating Global Medical Knowledge to Solve Healthcare Problems: A Framework for a Crowdsourcing System. International Journal of Healthcare Information Systems & Informatics (IJHISI), 13(1), 1-14. Singh, P., Jagyasi, B., Rai, N., & Gharge, S. (2014, December). Decision tree based mobile crowdsourcing for agriculture advisory system. In India Conference (INDICON), 2014 Annual IEEE (pp. 1-6). IEEE. Soden, R., & Palen, L. (2014). From crowdsourced mapping to community mapping: The post- earthquake work of OpenStreetMap Haiti. In COOP 2014-Proceedings of the 11th International Conference on the Design of Cooperative Systems, 27-30 May 2014, Nice (France) (pp. 311-326). Springer, Cham. Smith, B. (2015). Object-oriented programming. In Advanced ActionScript 3 (pp. 1-23). Apress, Berkeley, CA. Sullivan-Bolyai, S., Bova, C., & Singh, M. D. (2014). Data-collection methods. Nursing Research in Canada-E-Book: Methods, Critical Appraisal, & Utilization, 287. Tayal, D. K., Jain, A., Arora, S., Agarwal, S., Gupta, T., & Tyagi, N. (2015). Crime detection & criminal identification in India using data mining techniques. AI & society, 30(1), 117-127. Taylor, R. W., Fritsch, E. J., & Liederbach, J. (2014). Digital crime & digital terrorism. Prentice Hall Press. Verma, P., & Bhatia, J. S. (2013). Design & development of GPS-GSM based tracking system with Google map based monitoring. International Journal of Computer Science, Engineering & Applications, 3(3), 33. 51 Young, M. L., & Hermida, A. (2015). From Mr. & Mrs. outlier to central tendencies: Computational journalism & crime reporting at the Los Angeles Times. Digital Journalism, 3(3), 381-397. 
APPENDICES

Appendix A: Research Questionnaire

A.1 Pre-development questionnaire

A questionnaire was given to respondents from different parts of the city and from IT companies to gather their input on the research objectives and questions. The questionnaire below was used to gather this information from the respondents surveyed.

The Pre-Development Questionnaire

Hi, my name is Maryline Chepng’etich. I’m a student at Strathmore University. In partial fulfilment of the requirements of the Degree of Master of Science in Computer Based Information Systems at the University, I am doing a thesis on REAL-TIME LOCATION BASED ALGORITHM FOR NOTIFICATION OF CRIME HOTSPOTS USING CROWDSOURCING.

Kindly NOTE, all your answers are completely confidential, and you are free to skip any question or to end the survey at any point. No personally identifying information will be released and the result will be reported only in aggregated percentage form.

1) Please indicate your gender
a) Male
b) Female
c) I’d rather not say

2) Kindly indicate your age bracket as appropriate
a) Below 30 years
b) 31-45 years
c) Above 45 years

3) Kindly indicate your level of training/education
a) Technical
b) Tertiary
c) Varsity
d) Post graduate

4) How exposed are you to crowd-sourcing platforms?
a) Very Exposed
b) Exposed
c) Moderately exposed
d) Less Exposed
e) Unexposed

5) Would you be willing to support a crowd-sourcing platform?
a) Yes
b) No
c) Unsure

A.2 Questionnaire for system testing

Upon development of the system, the respondents were asked to participate in testing the developed prototype to ascertain that the system functions as intended. The questionnaire is shown below.

Prototype testing Questionnaire

Hi, my name is Maryline Chepng’etich. I’m a student at Strathmore University.
In partial fulfilment of the requirements of the Degree of Master of Science in Computer Based Information Systems at the University, I am doing a thesis on REAL-TIME LOCATION BASED ALGORITHM FOR NOTIFICATION OF CRIME HOTSPOTS USING CROWDSOURCING. You have already provided data for the thesis at an earlier stage in the exercise. I now ask that you answer this questionnaire after viewing and interacting with the developed prototype. If you have not yet done that, kindly do so, so that we can proceed.

Kindly NOTE, all your answers are completely confidential, and you are free to skip any question or to end the survey at any point. No personally identifying information will be released and the result will be reported only in aggregated percentage form.

1) Using the above information, do you think a better system can be built?
a) Yes.
b) No.

2) As a percentage value, how do you rate the following system features?
a) Functionality _________
b) Security _________
c) Operational Efficiency _________

3) What new features would you propose for the developed prototype?
a) _____________________________________________________
b) _____________________________________________________
c) _____________________________________________________
d) _____________________________________________________

---------------Thank you for your time------------------

Appendix B: System User Manual

This section details the different use cases as they apply to the interactions between the different actors and the system. It walks through the different dashboards within the crime hotspot reporting network.

B.1 Login into the system

To log into the system, the user first registers and then logs in, as on most social networks.
The login interface to the system is shown in Figure B.1.

Figure B.1: Login to the system

Upon login, the interface shown in Figure B.2 appears, showing the existing crime hotspots, information on crimes in the city, and a form for reporting criminal activity.

Figure B.2: System Dashboard

B.2 Reporting a criminal hotspot area

To add a hotspot area, the criminal hotspot network allows the user to fill in information on the location, the common crime in the area, and the target demographic for the specific crime. The user can then submit this information for integration into the system records and for the development of charts and histograms on crime statistics. This is shown in Figure B.3.

Figure B.3: Adding a crime hotspot

B.3 Viewing Crime Statistics

From the statistics tab, one can view the crimes that are common in the city and thus responsible for the hotspots. This is shown in Figure B.4.

Figure B.4: Viewing crime statistics
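
The reporting and statistics steps described in B.2 and B.3 can be illustrated with a minimal Python sketch. It is not the prototype's actual code: the record fields mirror the three items the reporting form asks for, while the names HotspotReport, crime_counts and text_histogram, and the sample reports, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class HotspotReport:
    """One crowd-submitted report (field names are illustrative assumptions)."""
    location: str
    common_crime: str
    target_demographic: str


def crime_counts(reports):
    """Count reports per crime type, most frequent first."""
    return Counter(r.common_crime for r in reports).most_common()


def text_histogram(counts, width=20):
    """Render (crime, count) pairs as a plain-text bar chart scaled to `width`."""
    if not counts:
        return ""
    peak = max(n for _, n in counts)
    lines = []
    for crime, n in counts:
        bar = "#" * max(1, round(n * width / peak))
        lines.append(f"{crime:<12} {bar} {n}")
    return "\n".join(lines)


# Hypothetical sample data, not taken from the study.
sample_reports = [
    HotspotReport("Area A", "mugging", "pedestrians"),
    HotspotReport("Area B", "mugging", "commuters"),
    HotspotReport("Area A", "burglary", "residents"),
]

if __name__ == "__main__":
    print(text_histogram(crime_counts(sample_reports)))
```

A deployed system would read the submitted reports from its records store and draw the charts in the browser; the sketch only shows how reports could be grouped by crime type before charting.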