Transactional Data Tracking and Analytics Tool: Case of Coffee Farming in Kenya

John Kiria Muguteh Kinyua
056306

Master of Science of Information Technology
2022

Submitted in Partial Fulfilment of the Requirements for the Degree of Master of Science of Information Technology at Strathmore University

School of Computing and Engineering Sciences
Strathmore University
Nairobi, Kenya

October, 2022

This thesis is available for Library use on the understanding that it is copyright material and that no quotation from the thesis may be published without proper acknowledgement.

DECLARATION

I declare that this work has not been previously submitted and approved for the award of a degree by this or any other University. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made in the thesis itself.

© No part of this thesis may be reproduced without the permission of the author and Strathmore University.

Name: John Kiria Muguteh Kinyua
Student No: 056306
Signature: ……………………………
Date: 17th May 2022

Approval

The thesis of John Kiria Muguteh Kinyua was reviewed and approved by the following:

Dr Vincent Omwenga
Senior Lecturer, School of Computing and Engineering Sciences
Strathmore University

Dr Julius Butime
Dean, School of Computing and Engineering Sciences
Strathmore University

Dr Bernard Shibwabo
Director of Graduate Studies
Strathmore University

ABSTRACT

Reliable agricultural data has become key to enhancing farmer confidence, particularly in the production supply chain. In the coffee supply chain, the collection and management of transactional data is critical since the farmers are not directly involved in various sections of the chain.
Challenges associated with the manipulation and corruption of data, which result in payments to non-deserving farmers, have led to mistrust and hence to a decline in coffee farming in the country. This study focused on the development of a Data Aggregation and Tracking Tool that can help farmers monitor the transactional data in the coffee supply chain. To develop the proposed tool, the study investigated the processes and actors involved in the coffee supply chain; examined the current challenges, expectations and actions to be implemented in aggregating and tracking coffee supply chain data; and identified technologies that could be aligned with those actions to ensure an effective coffee supply chain in Kenya. The proposed tool uses an aggregation function implemented with the Oracle APEX analytics function. The Agile system development methodology was adopted in the development of the Transactional Data and Analytical Tool (TDAT). At the functional level of the proposed tool, the data clerk can only enter data once the farmer's registration has been confirmed. This will eliminate “ghost” farmers. The ability to generate aggregated data upon entry of the quantity of cherry delivered, together with continuous notification of the farmer about the cumulative quantity of cherry delivered, will eliminate the possibility of manipulating the data in the database.

TABLE OF CONTENTS

DECLARATION
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
DEFINITION OF TERMS
CHAPTER 1: INTRODUCTION
1.1 Background to the Study
1.1.1 Role of Data in Coffee industry
1.1.2 Agricultural Data Tracking
1.2 Problem Statement
1.3 Objectives
1.3.1 General Objective
1.3.2 Specific objectives
1.3.3 Research Questions
1.4 Justification of the study
1.5 Scope
CHAPTER 2: LITERATURE REVIEW
2.1 Introduction
2.2 Coffee Production and Processes
2.3 Coffee Key Actors in Kenya
2.4 Challenges in Coffee farming in Kenya
2.5 Data in Coffee Supply Chain
2.6 Data Aggregation
2.6.1 Data Aggregation in Closed-loop Automation
2.6.1.1 The Concept of Control Loop (CL)
2.6.1.2 Data Aggregation process in a CL
2.7 Tracking Data Aggregation
2.8 Technologies adoption in Management of Coffee Supply Chain
2.9 Data Aggregation Algorithms
2.10 Conceptual framework
CHAPTER 3: RESEARCH METHODOLOGY
3.1 Introduction
3.2 System Development methodology Agile Development Systems Methodology
3.3 Research Design
3.4 Target Study Area
3.5 Data and Data Collection Methods
3.6 Data Analysis and Tool development
3.7 Research Quality
3.8 Ethical Considerations
CHAPTER 4: SYSTEM ANALYSIS AND DESIGN
4.1 Introduction
4.2 System Requirement Analysis
4.2.1 Functional Requirements
4.2.2 Non-functional Requirements
4.3 System Architecture
4.4 System Designs
4.4.1 Context Diagram
4.4.2 Level 1 DFD
4.4.3 Entity Relationship Diagram
4.5 Wireframes
4.5.1 Login Page for Farmer
4.5.2 Login Page for Admin
4.5.3 Home Page for Admin
CHAPTER 5: SYSTEM IMPLEMENTATION AND TESTING
5.1 Introduction
5.2 Implementation
5.3 Tool Front-end
5.4 Registration Module
5.5 Data Entry and Aggregation modules
5.6 Tracking and Analytics Module
5.7 System testing
CHAPTER 6: DISCUSSION
6.1 Introduction
6.2 A Discussion of the findings of the Research objectives
6.2.1 Processes and actors in the coffee supply chain in Kenya
6.2.2 Challenges and expectations of the coffee supply chain in Kenya
6.2.3 Technologies for effective coffee supply chain in Kenya
6.2.4 Key components of the proposed Data Aggregation and Tracking Tool for a coffee supply chain in Kenya
CHAPTER 7: CONCLUSION AND RECOMMENDATION
7.1 Introduction
7.2 Conclusions
7.3 Recommendations
7.4 Future Work
REFERENCES
APPENDICES
Appendix A: Similarity Report
Appendix B: Ethical Clearance Letter

LIST OF TABLES

Table 2.1: Role of IT in SCM (adopted from Jaana A., 2008)
Table 5.1: Compatibility test on Web browsers/platforms

LIST OF FIGURES

Figure 2.1: General Structure of data aggregation
Figure 2.2: Working Principles of data aggregation model (Adopted from Haneul et al. 2017)
Figure 2.3: Closed-Loop Automation Model based on OODA (Source: Hernández & Silva-Ortigoza, 2019)
Figure 2.4: Food integration reference Model (Source: Georgia Tech Integrated Food Chain Center, 2012)
Figure 2.5: Conceptual Framework of the Data Aggregation and Tracking Tool for coffee supply chain
Figure 3.1: Agile Software Development Methodology (Steljes, 2012)
Figure 4.1: Data Aggregation and Tracking tool Architecture
Figure 4.2: Transaction Data Tracking and Analytics Context Diagram
Figure 4.3: Level 1 DFD for Transactional Data Tracking and Analytics Tool
Figure 4.4: ERD for the proposed Tool
Figure 4.5: Login Page for Farmer
Figure 4.6: Home Page
Figure 4.7: Admin Login
Figure 4.8: Admin Home Page
Figure 5.1: Menu layout structure of the home page of the Data Aggregation and Analytics tool
Figure 5.2: Farmers registration module
Figure 5.3: Checking farmer Registration status
Figure 5.4: Cherry Delivery Capture form
Figure 5.5: Cherry Delivery Report aggregation
Figure 5.6: Sample of the information generated from the tool based on the supplied criteria

LIST OF ABBREVIATIONS

4IR - Fourth Industrial Revolution
AI - Artificial Intelligence
B2B - Business-to-Business
B2C - Business-to-Consumer
BS - Base station
C2B - Consumer-to-Business
C2C - Consumer-to-Consumer
CBK - Coffee Board of Kenya
CH - Cluster Head
CL - Control Loop
CLA - Closed-Loop Automation
E2E - End-to-End
E-commerce - Electronic Commerce
EDI - Electronic data interchange
GDP - Gross Domestic Product
GIS - Geographical Information Systems
IoT - Internet of Things
KPCU - Kenya Planters Cooperative Union
OODA - Observe-Orient-Decide-Act
POS - Point-of-Sale
RFID - Radio Frequency Identification
SDC - State Department for Co-operatives
SSA - Sub-Saharan Africa
SU-IERC - Strathmore Institutional Ethics Review Committee
WWW - World Wide Web

DEFINITION OF TERMS

Transactional data - Information captured from transactions while running business processes
Transaction - A sequence of information exchange and the work related to it

CHAPTER 1: INTRODUCTION

1.1 Background to the Study

The agriculture sector in Kenya continues to be a key driver of the rural economy. It is therefore one of the major drivers of growth for the Kenyan economy and a major source of direct and indirect employment. According to the World Bank report Kenya Economic Update: Transforming Agricultural Productivity to Achieve Food Security for All (2019), agriculture accounted for 21.9% of Kenya's gross domestic product (GDP) in the period 2013-2017 (FAO, 2018). It also contributed 65% of merchandise exports in 2017, with 56% of the total labour force employed in the sector. Most households in Kenya are engaged exclusively in agricultural activities, which contributed 31.4% to poverty reduction for the rural population. Agriculture therefore remains the single largest source of income for the poor in rural areas, based on the World Bank Economic Analysis.

Agriculture has the potential to support governments in Sub-Saharan Africa (SSA) in their efforts to alleviate poverty among their rural populations (Conceição, Levine, Lipton, & Warren-Rodríguez, 2016). To achieve this, however, governments must continue to strengthen and improve the performance of key agricultural sub-sectors such as the coffee sub-sector (Irungu, 2019).

Coffee is considered one of the most valuable commodities in world trade and is second only to oil in total trade value as a source of foreign exchange (Kamakia, 2016). At a global level, the value of the coffee trade is estimated at $19 billion (Salado, 2018). It is also projected that, with good capital investment, the coffee industry can employ a labour force of over 125 million (Anh & Bokelmann, 2019). There has, however, been some drop in the overall performance of the crop over the years due to production and supply chain challenges, which have compelled some coffee-producing countries to diversify into other economic activities (Aragie, 2018).

In the African context, the opportunities presented by the sector remain largely unexploited even with increased support for African agriculture over the last few years (Glatzel, Alpert, Brittain, & Conway, 2014). Some of the leading coffee-producing countries in Africa include Ethiopia, Uganda, Zambia, Zimbabwe, Ivory Coast and Kenya, which together account for more than 57% of the coffee produced on the continent (Kuma, Dereje, Hirvonen, & Minten, 2019).

Coffee farming in Kenya is largely done by small-scale farmers. Most of these farmers are organized into co-operative societies, and they account for approximately 60% of the total coffee production in the country (Gichichi, Mukulu, & Odhiambo, 2019). Approximately 42,037 metric tonnes of coffee were produced in 2017, of which 56% was produced by small-scale farmers. Coffee production dropped by approximately 155,000 sixty-kilogramme bags between 2018 and 2020.

1.1.1 Role of Data in Coffee industry

Data is a valuable resource for producers (farmers), policy makers (government) and business leaders alike. To harness its full benefits, reliable and accurate data needs to be collected along the production and supply chain. Data can offer insights into what is happening in the coffee sector, so the absence of good data can negatively affect various aspects of the production process. For instance, without dependable data on correct farming practices, market structures and pricing, and soil types, among others, coffee farmers are unlikely to benefit fully from their production, since they cannot make correct decisions based on prevailing circumstances. Access to the right data and information boosts the likelihood of a farmer making the right decision and thus optimizing returns on investment. Moreover, consistency of inputs and outputs in the production and supply chain can be improved when the right information is available.

Obtaining reliable and accurate information is predicated on the ability to record and analyze data at every stage of coffee production. The ability to collect and manage this data will also determine how successful the coffee industry will be. Collecting and aggregating data is a shared responsibility. It is critical, however, to involve the farmers as the primary source of the data, which will enhance not only the quality of the data but also trust in it, and hence promote its usability. Policy makers also need reliable and accurate data for planning, decision making and tracking the performance of the sector. For companies, data helps in understanding the dynamic factors affecting the supply chain.

Information acquired from recorded data can help producers to plan better for future investments in the coffee sector, prepare adequately for better quality coffees, mitigate potential problems, identify new opportunities, understand the processing methods and improve their overall management of the coffee production chain. Information obtained from collected data has also been used to evaluate the productivity and management of farms.

1.1.2 Agricultural Data Tracking

Collecting, recording and tracking data at every stage of production, and most importantly from harvesting to selling, is vital to farmers and may provide valuable insights. These insights can be used to improve the various practices employed and hence promote the productivity of the crop. To track data effectively, farmers and everyone else involved must first understand the kind of data to be tracked at each stage of production. For instance, the data collected and recorded at the harvesting stage is critical to farmers in the eventual determination of the final payment they are likely to get, considering all other market forces.
In the context of Kenya, where farmers are not involved in value creation, the data recorded at the harvesting stage is vital and directly influences the final payments the farmers receive. In other countries, where the coffee production value chain involves value creation, other data collection points may include costs and expenses of production, crop applications, pest and disease control practices, the production times, the red and

Agricultural Data Aggregation & Tracking Technology has an important role to play in agriculture. It is noted, however, that a number of factors have hindered the adoption of technology by a majority of farmers. Some of these factors are high investment cost, low levels of skill and knowledge of the available technologies, and limited availability of the right technologies (Caselli & Coleman, 2001). Even so, the adoption of various technologies in agriculture continues to create significant positive impact. Technologies identified by the World Summit on the Information Society 2005 as likely to have significant impact in agriculture include computers, the Internet, geographical information systems (GIS), mobile telephony, radio and television. Other emerging and new technologies such as the Internet of Things (IoT), Cloud Computing, Artificial Intelligence (AI) and Blockchain technology, among others, have revolutionized how agriculture is practiced.

One notable impact of technology in agriculture is real-time data collection through IoT devices. IoT has created an ecosystem of devices or objects interconnected through the Internet, allowing them to communicate and share data. For instance, mobile telephones have narrowed the information sharing gap, thus reducing information asymmetry; created opportunities for networking and marketing; improved the availability of data; and sped up the rate at which information for decision making is generated (Jensen, 2007).
Artificial Intelligence (AI) has promoted the generation of accurate recommendations, drawing on the huge amounts of data made readily available by technologies capable of collating it. Technology has also transformed agriculture, leading to the transition to precision agriculture and digital agriculture, improved traceability of agricultural products, enhanced accessibility of agricultural information and wider marketing options, among others.

Blockchain technology has promoted data integrity and improved trust, the lack of which has affected the agriculture sector for a long time. It allows interacting entities to distribute, differentiate and verify data among themselves. Because of its immutability, data recorded in a chain of blocks cannot be altered and can be monitored across the chain in a secure manner. Farmers can use blockchain technology to authenticate the quality and origin of the data. Technologies such as unmanned aerial vehicles (drones) and sensors have been used to monitor, record and transmit data in real time. The information collected has been used to promote smart agriculture.

1.2 Problem Statement

Various researchers report different contributors to Kenya’s coffee production problems, including the attractiveness of more lucrative crops and alternative sources of income such as real estate, climatic changes and outdated farming practices (Kihoro & Gathungu, 2020; Lemma & Megersa, 2021). Cheluget (2016) notes that, despite the Kenyan government's attempts to turn around the fortunes of coffee farmers, the upheavals that have afflicted this crop have seen a growing number of small-scale farmers abandoning the crop in the farms or, in the extreme, cutting down the whole crop and using the land for other, more promising crops.
Another contributor to the declining production is the perception of theft arising from manipulation of the recorded farmers’ deliveries along the supply chain. A supply chain is characterized by persons directly involved in the upstream and downstream flows of products, data and/or information from a source to a customer (Mentzer, et al., 2001). In Kenya, the coffee supply chain is characterized by small producers, intermediaries, industrialists and marketers. Presently, the coffee supply chain in Kenya faces a number of challenges, including low productivity and low profitability, poor technology transfer and inadequate value addition. These challenges have resulted in an unsustainable coffee sector and in disproportionate poverty among people in the areas with the highest coffee production.

A key priority of the State Department for Co-operatives (SDC) is the revitalization of coffee co-operatives through improved corporate governance, enhanced efficiency and better service delivery to co-operative members. To achieve this, the Government has initiated various interventions to revitalize the coffee industry through the SDC, which among other functions is mandated to take the lead in overseeing smooth, transparent operations and member management of co-operative societies. Coffee co-operative societies, like co-operatives in other sectors, are expected to be efficient, transparent and accountable in their operations. The success of these initiatives depends on the availability of reliable and accurate data. Effective implementation of the Department’s functions is hampered by a lack of real-time, reliable and credible data from coffee co-operative societies or coffee collection stations. This has made it difficult for the State Department to deliver services effectively, make correct decisions and respond to the needs of stakeholders in order to achieve its mandate.
For coffee, the critical data required includes coffee farmer/member profiles, co-operative profiles and operations.

Colezea et al. (2018) note that addressing this problem requires a comprehensive, interactive, web-based, database-driven tool that integrates key data nodes. The tool should provide real-time analytics that respond to the needs of the stakeholders. This will promote accountability, transparency and information sharing, and hence restore confidence among coffee farmers and other actors.

It is also noted that many coffee producers have the desire and commitment to start using data collection as a tool. Common obstacles they face, however, are a lack of knowledge of where to collect or access data and how to track it, and a fear that the data will be misinterpreted, misused or cause errors. A tool that is simple enough and provides the capability to collect, access and track data will therefore be critical.

This study proposes a data aggregation and tracking tool that focuses on data and information analysis and that can be used to improve coffee production by leveraging the power of data analytics and tracking.

1.3 Objectives

1.3.1 General Objective

The aim of the study was to develop a Data Aggregation and Tracking Tool for data collection, records management, information sharing and the provision of appropriate, farmer-specific data analytics to improve transparency, efficiency and information integrity in the coffee supply chain.

1.3.2 Specific objectives

To achieve the general objective of this study, the following specific objectives were adopted.
i. To investigate the processes and actors in the coffee supply chain;
ii. To examine the current challenges, expectations and actions to be implemented in the aggregation and tracking data of the coffee supply chain;
iii. To identify the technologies that can be aligned with the actions to be implemented to ensure an effective coffee supply chain in Kenya; and
iv. To develop and test a Data Aggregation and Tracking Tool for a coffee supply chain in Kenya.

1.3.3 Research Questions

i. What are the processes and actors involved in the coffee supply chain?
ii. What are the current challenges, expectations and actions to be implemented in the aggregation and tracking data of the coffee supply chain?
iii. How can the various technologies be aligned with the actions to be implemented to ensure an effective coffee supply chain?
iv. How can a Data Aggregation and Tracking Tool for a coffee supply chain in Kenya be developed and tested?

1.4 Justification of the study

The Digitalization of Agriculture Report for 2018-2019 notes that the use of digital technologies, innovations and data can transform business models and practices across the agricultural value chain, leading to greater income for smallholder farmers (Tsan, Totapally, Hailu, & Adam, 2019). The Data Aggregation and Tracking Tool responds to this by helping to capture the growers’ transactional data, including coffee cherry deliveries, which will improve transparency, efficiency, governance and information integrity in coffee co-operatives. It is expected that production and farmers' income will increase. Policy makers will have accurate and reliable data for decision making. Through generated reports, such as patterns and correlations within large data sets used to predict delivery outcomes, producers will gain the insight required to make timely decisions, increase revenues, cut costs and improve customer relationships.

1.5 Scope

The study focused on the production side of the coffee supply chain. Specifically, it focused on aggregating and tracking the growers’ data and transactional information, including coffee cherry deliveries. The analytics were built on the principles of historical data mining.
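
As a rough illustration of the aggregation and tracking behaviour described in this chapter (entries accepted only for registered growers, and a cumulative cherry total maintained per grower), a minimal Python sketch is given here. It is not the tool's implementation, which the thesis builds on Oracle APEX; the names Delivery, DeliveryLedger, farmer_id and cherry_kg are hypothetical and chosen only for illustration, and the in-memory store stands in for the database.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Delivery:
    """One cherry delivery by a grower (hypothetical record layout)."""
    farmer_id: str
    delivered_on: date
    cherry_kg: float


class DeliveryLedger:
    """Hypothetical in-memory stand-in for the tool's delivery records."""

    def __init__(self, registered_farmers):
        self._registered = set(registered_farmers)
        self._deliveries = []
        self._totals = defaultdict(float)

    def record(self, delivery):
        """Store a delivery and return the grower's new cumulative total."""
        # Reject entries for unregistered growers ("ghost" farmers).
        if delivery.farmer_id not in self._registered:
            raise ValueError(f"farmer {delivery.farmer_id} is not registered")
        if delivery.cherry_kg <= 0:
            raise ValueError("cherry weight must be positive")
        self._deliveries.append(delivery)
        self._totals[delivery.farmer_id] += delivery.cherry_kg
        # This running total is the figure a farmer notification would carry.
        return self._totals[delivery.farmer_id]

    def cumulative_kg(self, farmer_id):
        """Total cherry delivered so far by one grower (0.0 if none)."""
        return self._totals.get(farmer_id, 0.0)

    def history(self, farmer_id):
        """Deliveries for one grower, oldest first, for historical analysis."""
        return sorted(
            (d for d in self._deliveries if d.farmer_id == farmer_id),
            key=lambda d: d.delivered_on,
        )
```

In this sketch the running total is updated at the moment of entry, so the figure reported back to the grower and the figure held in the store cannot drift apart. A database-backed version would more likely compute the same total with an aggregate query over the delivery table.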
CHAPTER 2: LITERATURE REVIEW 2.1 Introduction This chapter reviews literature on the coffee subsector across its various value chain, actors, challenges, coffee data and the theoretical approaches to data aggregation, models and processes and their application to the aspect of coffee subsector data collection, management and analysis. Information technology systems like Warehouse Management Systems that can be used to help in coordinating the planning of the operations of the coffee warehouses, RFID technology that has been used for automatic object identification of products in an inventory, POS tracking system management and E-commerce that has been adopted in the supply chain operations and has promoted coordination of the selling and buying operations. The Conceptual framework details the proposed data aggregation tool focusing on the data along the coffee supply chain 2.2 Coffee Production and Processes Coffee performs well in areas located in latitudes between 22º N and 26º S. Some of the environmental factors that influence growth and productivity of coffee include moderate temperature, water availability, soil type, land topography, wind and sunshine intensity (Descroix & Snoeck, 2009) Coffee performs well in optimal temperatures. For instance, Arabica coffee requires a temperature of 18ºC during the night and 22ºC during the day but can also do well in a temperature range of between 15ºC to 30º C. Robusta coffee can also do well in slightly elevated temperature ranges of between 22 and 28ºC (Descroix & Snoeck, 2009) Coffee does well in regions that experience moderately high rainfall and high atmospheric humidity. It is does grown in areas that practices rain-fed agriculture. For instance, Arabica does well in areas that experience annual rainfall of 1,400 to 2,000 mm. While Robusta requires on average an annual rainfall of between 2,000 to 2,500 mm. Any rainfall below theses ranges would result in a depressed production (Descroix & Snoeck, 2009). 
The ideal humidity for Robusta is between 70% and 75%, and for Arabica it is on average around 60%.

Optimal productivity of the coffee tree is achieved at between 5 and 7 years, though with proper crop husbandry the trees can be very productive even after 50 years. The productivity of the coffee tree is influenced by the flowering and maturation of the coffee berries. These in turn depend on the coffee variety, climatic conditions and agricultural practices, among other factors (Wintgens, 2009).

Coffee trees do well in areas with alluvial and colluvial soils. Places with well-drained soils of volcanic formation are the most ideal for coffee farming. A soil depth of at least 2 meters is good for proper growth and development of the taproot (Descroix & Snoeck, 2009). Coffee trees can also do well on flat and undulating land.

One unique characteristic of coffee production is the biennial pattern of fruit bearing, with the trees producing more in alternate years. In high-bearing years, the trees sacrifice new growth in order to support their heavy fruit production. In the subsequent year, fruit production is reduced to allow for tree growth (Wintgens, 2009).

Coffee cherries are processed after harvesting using one of two methods: the dry method or the wet method. Processing aims at converting the cherries to green beans, making them ready for roasting, grinding and consumption. In the dry method, the cherries are dried naturally in sunshine or by using mechanical dryers. After the coffee cherries are dried, they go through a process called hulling, through which the outer parchment layer is removed. Polishing, an optional process, may then be done to remove the silver-skin, which is the layer beneath the parchment layer. Wet processing, on the other hand, takes more time, is labour intensive and is more resource intensive.
In this method, the cherries are sorted by immersion in water. It is expected that bad cherries will float to the top and can be removed, while those that are good and ripe will sink after some time. Once the good and bad cherries are sorted, the good and ripe cherries are further processed by pulping and drying. At the end of either method, the green beans are colour sorted and graded based on size. After drying, the ideal moisture content of green beans is about 12%. Moisture content below 9% can result in shrunken and distorted beans (Krishnan, 2017).

2.3 Coffee Key Actors in Kenya

The coffee industry in Kenya has a number of players. The key players are as follows:

1. The State Department for Co-operatives
The core function of the State Department for Co-operatives (SDC) is to oversee the development of economically viable co-operative societies through the formulation and enforcement of policy as well as a legal and regulatory framework that meets the aspirations of the co-operative movement. A key priority of the State Department is the revitalization of coffee co-operatives through improved corporate governance, enhanced efficiency and service delivery to co-operative members. The Government is working towards various interventions to revitalize the coffee industry through the SDC, which among other functions is mandated to take the lead in overseeing smooth, transparent operations and member management by co-operative societies. Coffee co-operative societies, like co-operatives in other sectors, are expected to be efficient, transparent and accountable in their operations.

2. Cooperative movement in the coffee subsector
Kenya's coffee sub-sector is organized into farmers' cooperative societies whose membership comprises coffee farmers who belong to an affiliate coffee factory. This subsector is regulated by the Coffee Board of Kenya (CBK) (Nyangito, 2001). The cooperative societies are expected to extend credit facilities to their members, oversee maintenance of factories and undertake human resource functions on behalf of the factories, among other functions.

3. Coffee factories
By and large, smallholder farmers in Kenya are registered with a coffee factory. The factory management is put in place by the farmers. Some management teams lack management skills and are sometimes marked by nepotism, which has contributed to the run-down of some of the factories (Nyangito, 2001). Notably, poor management has been cited as a contributor to the inflated charges levied by the team on coffee processing, storage, bulking, transportation and other overheads, which consequently lead to low payouts to the farmers (Nyangito, 2001).

4. Coffee Millers
Milling of coffee in Kenya is done by registered coffee millers. Kenya Planters Cooperative Union (KPCU) is one of the dominant coffee millers in the country and almost monopolizes the sector. As a way of attracting and retaining farmers, the KPCU has tried to offer diverse services such as extending credit facilities to the farmers. Unfortunately, recovery of the loans has been poor and has almost crippled the company.

5. Coffee Board of Kenya
The Coffee Board of Kenya is the regulating arm of the government. Other than regulating the industry, the board performs other functions such as monitoring the processing, marketing and production of coffee and research on coffee matters. It also formulates policies governing the coffee subsector in the country.

6. Kenya Co-operative Coffee Exporters Limited
Established by the cooperative movement of Kenya, the Kenya Co-operative Coffee Exporters (KCCE) facilitates coffee exporting by linking small-scale coffee producers with international markets. It was aimed at eliminating middlemen who had constrained the supply chain and at providing a consistent and reliable market to coffee farmers (KCCE, 2014).

7. Kenya Planters Cooperative Union (KPCU)
It is one of the largest cooperative unions in the country, with a membership of more than 700,000 drawn from small-scale farmers and cooperatives (FAO, 2004). The union was established to champion the needs of its members and to mobilize membership across the country. It has faced financial challenges which stem from leadership and management issues.

8. Kenya Coffee Producers Association (KCPA)
The membership of KCPA is composed of coffee farmers spread across the country. Its core mandate is to provide coffee producers with a forum to articulate issues that affect them. The association has small, medium and large-scale farmers (Danida, 2012).

2.4 Challenges in Coffee farming in Kenya

Access to credit facilities
There is limited access to credit facilities, which sometimes come with high interest rates. The availability of credit facilities to coffee farmers is also impeded by stringent requirements.

Restrictive laws on establishing a coffee farm
Establishing a coffee farm in Kenya requires applying for a license and obtaining approval from the Coffee Board of Kenya. This sometimes takes time due to inherent bureaucracy and ends up locking out prospective farmers (Kegonde, 2005).

Production costs and cost of inputs
The cost of farm inputs is still high and out of reach for many farmers. The government of Kenya has put in place measures to cushion farmers, for instance the policy of subsidizing agricultural inputs, but some of the policies are limited to a particular input such as fertilizer. Other factors that have pushed production and input costs up are unpredictable input market structures, the high cost of transportation due to poor road infrastructure, and taxation (Kegonde, 2005).

Inefficient structure of the production and supply chain
The production and supply chain is marked by inefficiencies, and sometimes the farmer is deprived of his or her gains from the crop. The existence of middlemen in the production and supply chain has added an extra layer that overburdens the farmers.

International factors
Global crises affecting the coffee industry, which include overproduction, collapse in international market prices, coffee diseases and an unbalanced coffee value chain, among others, have had a negative impact on local coffee production (UNCTAD, 1999). For instance, overproduction has led to a decline in coffee prices in the international markets, leading to reduced farmer earnings.

Unpredictable prices
Coffee prices, and hence the eventual earnings of farmers, remain one of the causes of conflict in the coffee industry between the producer (farmer) and the seller (trading partner). In response to this, the government of Kenya established a price stabilization regime. The mandate of the scheme was to establish a compensatory finance scheme or create a stock buffer to counter price movements so as to guarantee minimum prices to the farmers (Coffee, 2014). The unpredictable prices have led to serious suspicion among the key players in coffee production, hence the need to use technology to enhance transparency. Other noted concerns associated with unpredictable prices include corruption, mismanagement and unsupported costs.

Incomplete coffee supply chain data
Since farmers are not involved at the coffee selling (auctioning) stage, they are often not sure whether the actual quantity delivered at the coffee collection center was ultimately used in the computation of their final earnings. Research points to the need to review the operations of the auction system used in the selling of coffee in order to introduce more transparency across the supply chain (Kegonde, 2005).

2.5 Data in Coffee Supply Chain

With the advancement of technology, there have been advances in data storage, processing and sharing capacities as well as in analytics and modeling competencies. This has enhanced decision making by making it possible to predict the various complexities involved in agricultural processes. Through the use of data and technological innovations, farmers are now able to adopt modern farming practices and hence increase their productivity. To benefit from such advancement and from derived benefits like big-data analytics, however, farmers need to have quality data. Moreover, data-driven agriculture requires the availability of data and of computing technologies that can support big data, which is characterized by high volumes and real-time processing. The reliability, accuracy and availability of the required data depend on how the agricultural data is collected.

In developing countries, the collection and storage of agricultural data is cited as a challenge. According to Tamene (2020), Ethiopia has in the last 60 years experienced challenges in agricultural data collection and storage due to the absence of a systematic database that is accessible and interoperable. Furthermore, the data systems do not conform to the Findable, Accessible, Interoperable and Reusable (FAIR) principle. Non-conformity with this principle means that integrated analysis to obtain dependable information for effective decision making is not possible. Further, the lack of standardized guidelines on data collection and format, and of a culture of data sharing, has undermined the ability to aggregate agricultural data, which has ultimately undermined the effectiveness of agricultural decisions and interventions.

Iftikhar (2016) notes that farmers need data in an aggregated form that can help them make decisions. Unfortunately, obtaining aggregated data is a challenge due to the absence of proper integration systems.
Data aggregation is also critical for purposes of maximizing storage and keeping data for a long period of time. This can be achieved through gradual and consistent aggregation of farmers' data, which calls for keeping older data in a highly summarized form while keeping newer data in a lightly summarized form (Skyt, 2008). The aggregation technique selected depends on the type of data value (Iftikhar, 2016). For instance, averaging could be used for data like temperature, humidity and weight. For enhanced accountability, the aggregated data could be stored in the same log or data store table as the detailed data (Iftikhar, 2016). This allows comparison between the summarized aggregated data and the original detailed data. It is also noted that, by doing this, the aggregated solution ensures that the application-specific queries for viewing data remain consistent throughout and that the results obtained are valid.

2.6 Data Aggregation

Data aggregation can be described as the approach of gathering and collating data to significantly save on the resources used in handling data that is held at different data points, also called data nodes. In wireless sensor networks, where the concept of data aggregation has gained traction, different data aggregation models have been used to collect and aggregate data efficiently, leading to improved network lifetime (Sran, Gurujeet, & Sidhu, 2016). The general structure of a data aggregation model is shown in figure 2.1.

Figure 2.1: General Structure of data aggregation

The working principles of a data aggregation model are illustrated in figure 2.2.

Figure 2.2: Working Principles of data aggregation model (Adopted from Haneul et al. 2017)

Frej and Elleithy (2015) and Haneul, Lee and Pack (2017) have extended the general structure of the data aggregation model shown in figure 2.1 to develop the more elaborate model of data aggregation shown in figure 2.2.
They further explain the principles of data aggregation using the following stages, considering data aggregation in a wireless sensor network:

Stage 1: Selection of sensor nodes: The sensor nodes deployed in the network from which data is to be collected are selected, and their data values are stored in the database.
Stage 2: Creating a cluster: Nodes are grouped based on defined and related features for effective handling of the data and to minimize overall costs.
Stage 3: Cluster Head (CH) selection: The cluster heads are used to manage all the data nodes within the cluster and to communicate with the neighbouring CHs.
Stage 4: Data aggregation: This involves the collection of data and of any verifiable queries from the user end. At this stage, the data aggregation is done based on the appropriate aggregation approach. The aggregation process handles the data arriving through the source nodes by identifying the approaches that will help to gather the information from the aggregation nodes.
Stage 5: The outcome of the aggregated data can be transmitted to the base station (BS) for further analysis.

The concept of closed-loop control provides a timely and trustworthy flow of data around the object being managed (Hernández and Silva-Ortigoza, 2019). To develop a closed loop based on the zero-touch service management model for the collection and transfer of data from source to consumer, a model-based approach is adopted. The approach defines metadata specifying the sources (individuals, databases, devices, cloud, etc.) and the consumers (real-time analytics, dashboards, visualizations, etc.).

2.6.1 Data Aggregation in Closed-loop Automation

2.6.1.1 The Concept of Control Loop (CL)

The concept of the control loop has been used to guide the management of networks and services. It provides a mechanism that utilizes a feedback loop for monitoring and self-regulation to achieve a set target.
It uses the principle of adjusting the measured values of input variables in order to obtain a desired outcome (Hernández and Silva-Ortigoza, 2019). The CL uses the following: a managed entity, which is the object on which the management is focused (in the context of this study, the elements in the coffee supply chain); the producer of the input values or data (in the context of this study, the coffee farmers); and the goal, which identifies the target state of the managed entity that the control loop is expected to maintain based on set values. The CL acts on the input values according to the set goals by continuously consuming and producing information in a loop that follows a sequence of steps: monitoring, analyzing, deciding and executing.

Depending on whether there is human intervention in its execution, the CL can be a Closed Control Loop, in which there is no human intervention, or an Open Control Loop, in which there is human intervention. In this study, a hybrid of the two has been considered in designing the ideal system model, since some element of human intervention will be required at some stages. To a large extent, however, the closed control loop is adopted.

One of the models used to construct a closed loop for automation is called observe-orient-decide-act (OODA). Figure 2.3 shows the general structure of the OODA model.

Figure 2.3: Closed-Loop Automation Model based on OODA (Source: Hernández & Silva-Ortigoza, 2019)

The OODA model allows for collection, analysis, decision and execution stages. The collection stage involves collecting and pre-processing data from the producer or managed entity. At the analysis stage, insights are derived from the input data gathered at the collection stage, including the historical data. This stage is important because descriptive or diagnostic analysis may be done.
At the decision stage, insights from the analysis stage are used to decide on the next workflows in the loop; the actions to be taken, and the entities in the chain that are to take them, are defined at this level. The final stage is the execution stage, which involves the enforcement of the actions identified in the decision stage.

2.6.1.2 Data Aggregation process in a CL

The CL operates on input data consisting of measurements and observations captured during the data collection stage. In the context of this study, this operation will occur at the coffee delivery centers. The owners of these data are the registered coffee farmers within the geolocation where their coffee farms are.

In a complex data aggregation system involving a number of data sources, the characteristics of these data sources are critical in monitoring them. Monitoring can be extremely difficult, however, when the entities behind the data sources have many autonomous components, which can result in a combinatorial explosion that makes control unfeasible when there is no human operation in the loop. Moreover, if full end-to-end (E2E) control is intended for all the services running on operator-managed networks, it will create complex systems that include infrastructural and communication services, together with supporting network slices. Other sources of complexity in the networks involved in the data sources may be due to: data management, that is, making data available through different methods, for example pull-based or push-based methods; data accessibility, which depends on different access control, confidentiality and integrity methods; and data availability, which can be affected by the validity of the data, which in turn depends on when it was collected and accessed and on the correctness of its processing. From the above, it is possible to collect data of different formats, or unstructured data, when operator-managed networks are employed.
This will therefore make it difficult to follow standardized models for the management of data in a network. This is currently the case in the coffee subsector in Kenya. Furthermore, transporting or aggregating data with different formats will require different models, which will have an impact on accessibility and availability (Antonio P, Diego, Jose, Sonia, & Jesús, 2021). In addition, having a different ad-hoc solution for each data source is a real possibility considering the multi-technology and multi-vendor environment that exists, resulting in a non-desirable scenario for closed-loop automation (CLA). Therefore, a CL that is suitable for the case of Kenya will require a closed CL for an E2E network environment that can grow to support complex multi-domain scenarios, taking into account the technology available and the management structure in place that supports data collection and processing. This can only be achieved when the data aggregation mechanism is correctly designed to bridge the data collection and analysis stages in the CL network. This forms one of the focus areas of this study. Figure 2.3 illustrates the key components of a data aggregation mechanism in a CL network. It gives a coherent and adaptable set of data for further analysis in the decision stage. In this study, the CL network adopted will ensure that no new data source or consumer stages are introduced at later stages, making the proposed model more stable and thus making data integrity and the tracking of information more reliable.

2.7 Tracking Data Aggregation

With the increase in data nodes and in data generated along the coffee supply chain, the concept of data tracking is becoming central to both individuals and organizations. Furthermore, with the emergence of networks of devices and the need to have access to these devices, the need for tracking data aggregation is also gaining prominence. The need is also based on the desire to satisfy user demands. Antonio et al. (2021) note that the increasing complexity of data management requires a shift from the present management and operational techniques to more adaptive approaches. They propose the adoption of what they call zero-touch management, which involves the application of closed-loop control when setting up an automation process. According to Hernández and Silva-Ortigoza (2019), closed-loop control applies a well-established corpus from the discipline of automatics, combining mechanisms such as artificial intelligence techniques with the flexibility of network services and functional management enabled by the evolution of technologies like software-defined networking and network function virtualization.

2.8 Technologies adoption in Management of Coffee Supply Chain

Technology adoption in coffee production has been accelerated by the rate of innovation taking place in the sector. In the context of this study, innovation is considered to refer to anything new that is successfully applied in economic and/or social processes in the production chain. In the coffee production space, this may include the ways farmers manage their coffee farms and the value/supply chain. This study focused on the latter, that is, the coffee supply chain. Possible innovations in this area aim at enhancing benefits to the farmer and may include innovations around data management and processing, and stabilization of incomes. The level of adoption of a technology depends on the value or benefit that the users, in this case the farmers, may derive from the innovation (Hartwich and Scheidegger, 2010).

To explain the role of Information Technology (IT) in supply chain management, Jaana (2008) developed two a-priori constructs that focused on the type of IT use in supply chain management (SCM) and on the drivers that influenced the use of IT in SCM. The classification resulted in the categorization depicted in Table 2.1.

Table 2.1: Role of IT in SCM (adopted from Jaana, 2008)

Type of IT use in SCM | Reasons for using IT in the SCM
Transaction processing | Reduction of costs; volume of transactions; speeding up information transfer; elimination of human errors
Supply chain planning and collaborations | Unpredictable and logistically demanding environment
Order tracking and delivery coordination | Project-orientation of the business; in-transit delivery consolidation

In this study the focus was on the transaction processing and tracking components of SCM, and therefore the technology analyzed was limited to these two roles.

According to Edgar, Slee, Diego, Fernado, & Wei-Shu (2017), standardization of the processes and activities in the supply chain is important in providing an integrated structure that ensures the adoption of correct management practices, monitors performance and analyzes compliance, leading to improvement of the food supply chain. They propose the adoption of the Food integration reference model by the Georgia Tech Integrated Food Chain Center (2012) to guide the development of a good system for managing the supply chain. Figure 2.4 shows the Food integration reference model.

Figure 2.4: Food integration reference Model (Source: Georgia Tech Integrated Food Chain Center, 2012)

In the Fourth Industrial Revolution (4IR), otherwise referred to as Industry 4.0, the agriculture envisioned, also referred to as Agriculture 4.0, is driven by technological innovations (Contreras-Medina, 2020). The aim is to improve production, enhance efficiency in the supply/value chain and protect the environment (De Clercq, Vats and Biel, 2018). Some of the technologies proposed include climate-smart agricultural technology to reduce climate change impacts in places like Rajasthan, India, and web platforms for production monitoring to increase quality and quantity (Colezea et al., 2018).
Other technologies that have had an impact on the coffee supply chain include:

Blockchain Technology
A successful and effective coffee supply chain will benefit farmers and other players in the supply chain. It is likely to lead to a reduction in costs and an improvement in efficiency, which will ultimately lead to stakeholder satisfaction. Blockchain technology, which is based on the concept of distributed ledger technology that records transactions in a series of distributed copies, has been considered one of the most innovative technologies that can revolutionize how a supply chain operates. Its decentralized nature makes the supply chain more transparent and efficient. It is also secure and tamperproof because records are immutable and each record is securely linked to the previous ones. It can be used to manage transactions through self-executing contracts that ensure transactions are recorded. This can help in recording transactions and tracking product progression from coffee delivery through the various processing stages. The recorded transaction details are not publicly visible, thus ensuring the integrity and confidentiality of the transactions and leading to the elimination of fraudulent activities in the supply chain (Sadouskaya, 2017). Blockchain has also been used to record and trace the origin of coffee and to facilitate the transfer of payments without the need for intermediaries (Kshitij, Biradar, Devendra and Madhavi, 2021). It has also been used together with other technologies, like mobile applications and robotics, to track the overall supply process from production to delivery, consequently increasing traceability, profitability and transparency (Hackett, 2017).

Warehouse Management Systems
Information technology such as Warehouse Management Systems has been used, and can be used, to help coordinate the planning of warehouse operations.
The warehouse plays a critical role in the storage of equipment and products in the coffee supply chain, and therefore a warehouse management system will be handy. Warehouse management systems have been used to coordinate aspects like the receipt of goods, inventory management, allocation of storage locations and keeping track of inventory in warehouses, among others (Graham et al., 2013).

Electronic Data Interchange (EDI) and World Wide Web (WWW)
Electronic data interchange (EDI) and the World Wide Web (WWW) have significantly changed the way the coffee supply chain is managed. They have facilitated interaction between the farmers and the buyers. For instance, internet technology, through the WWW, has improved communication between the farmers and the buyers (Watson et al., 1998). The internet has also facilitated the creation of supply chains that are commercially viable (Philip and Pedersen, 1997). Presently, some farmers are able to conduct business online, which has enhanced access to markets and improved their earnings (Armstrong and Hagel, 1996). The adoption of EDI technology has enhanced information sharing across the supply chain.

Radio Frequency Identification (RFID)
RFID technology has been used for automatic identification of products in inventory management. It uses radio frequency tags to transmit resident data. Each tag carries a unique identification number for identifying products. Data transmitted through the RFID tag can be read automatically using an RFID reader. RFID tools have been used to enhance transparency and operational efficiency (d'Hont and Frieden, 2000).

Point-of-Sale Tracking System (POS)
The POS tracking system has been used in the supply chain to manage selling transactions involving the farmer and the buyer. It provides a means of ensuring that the retail inventory is accurately kept.
It also provides a mechanism for recording a transaction and generating a receipt of the transaction, thereby promoting confidence among the farmers.

E-Commerce
E-commerce has been adopted in supply chain operations and has promoted coordination of selling and buying operations. This has reduced operational costs, thus promoting cost savings. For instance, since it involves fewer paper transactions, the transaction cycle times are short. It is therefore characterized by speedy transactions per order, which has resulted in enhanced collaboration between the sellers (farmers) and the buyers. Furthermore, it has enabled farmers to access other markets through the internet, thus expanding their market reach (McIvor et al., 2000). E-commerce has also led to the creation of different market structures through the establishment of different interaction models. The new interaction models that have resulted from e-commerce include business-to-business (B2B), business-to-consumer (B2C), consumer-to-business (C2B) and consumer-to-consumer (C2C). These interaction models have led to different operation structures within the supply chain that have resulted in improved performance and positive competition (Stuart and McCutcheon, 2000).

2.9 Data Aggregation Algorithms

In data aggregation, the concept of distributed data aggregation plays a critical role in allowing the distributed determination of global properties of an entity, which can be used to direct the execution of other applications in a distributed network like the supply chain. Focusing on the outcome of the data aggregation process, one can take it to be a subset of information fusion that aims at reducing the data volume (Nakamura, Loureiro and Frery, 2007). The process is therefore characterized by the computation of an aggregation function.
According to Jesus, Baquero and Almeida (2015), the aggregation function is defined as a function $f$ that takes a multiset of elements from a set $I$ and produces an output from a set $O$. That is,

$$f : \mathbb{N}^{I} \longrightarrow O \qquad \text{(eqn 2.1)}$$

In this case the order in which the elements are aggregated is not important, and an input value may occur multiple times in the multiset. From equation 2.1, other more specific aggregation functions can be generated. An example is the decomposable function.

Decomposable Functions

The aggregation function given in equation 2.1 can be decomposed into several computations involving sub-multisets of the multiset. An example of this is the self-decomposable function, which can be given as:

$$f(X \uplus Y) = f(X) \diamond f(Y) \qquad \text{(eqn 2.2)}$$

where $X$ and $Y$ are non-empty multisets, $\uplus$ is the multiset addition operator and $\diamond$ is a merge operator.

Since the aggregation results in the same outcome for all possible partitions of the multiset, the merge operator $\diamond$ is commutative and associative. The SUM, COUNT, MAX and MIN functions can therefore be defined by the following relationships:

$$\mathrm{SUM}(\{x\}) = x, \qquad \mathrm{SUM}(X \uplus Y) = \mathrm{SUM}(X) + \mathrm{SUM}(Y)$$
$$\mathrm{COUNT}(\{x\}) = 1, \qquad \mathrm{COUNT}(X \uplus Y) = \mathrm{COUNT}(X) + \mathrm{COUNT}(Y)$$
$$\mathrm{MAX}(\{x\}) = x, \qquad \mathrm{MAX}(X \uplus Y) = \max(\mathrm{MAX}(X), \mathrm{MAX}(Y))$$
$$\mathrm{MIN}(\{x\}) = x, \qquad \mathrm{MIN}(X \uplus Y) = \min(\mathrm{MIN}(X), \mathrm{MIN}(Y))$$

A function derived from equation 2.1 can also be decomposable without being self-decomposable. This is the case if, for some function $g$ and a self-decomposable aggregation function $h$, it can be expressed as:

$$f = g \circ h \qquad \text{(eqn 2.3)}$$

An example of a decomposable but not self-decomposable function is the average, which gives the average of the elements in the multiset. It can be computed as follows:

$$\mathrm{AVERAGE}(X) = g(h(X)), \quad \text{given that}$$
$$h(\{x\}) = (x, 1)$$
$$h(X \uplus Y) = h(X) + h(Y)$$
$$g((s, c)) = \frac{s}{c}$$

where $h$ is a self-decomposable aggregation function which takes values in an auxiliary set, in this case pairs, and $+$ is the pointwise sum of pairs. For example, $(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2)$.
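The relationships in equations 2.2 and 2.3 can be illustrated with a minimal Python sketch. It is not taken from the tool itself: the helper `aggregate`, the `(lift, merge)` pairs and the sample cherry weights for two collection nodes are hypothetical names and values introduced here for illustration only.

```python
from functools import reduce

def aggregate(lift, merge, items):
    """Fold a non-empty multiset: lift each element {x}, then merge with the operator."""
    return reduce(merge, (lift(x) for x in items))

# Self-decomposable functions written as (lift, merge) pairs, following eqn 2.2.
SUM = (lambda x: x, lambda a, b: a + b)
COUNT = (lambda x: 1, lambda a, b: a + b)
MAX = (lambda x: x, max)
MIN = (lambda x: x, min)

# AVERAGE = g o h (eqn 2.3): h works on (sum, count) pairs, g divides them.
def h_lift(x):
    return (x, 1)

def h_merge(p, q):
    return (p[0] + q[0], p[1] + q[1])  # pointwise sum of pairs

def g(pair):
    s, c = pair
    return s / c

def average(items):
    return g(aggregate(h_lift, h_merge, items))

# Hypothetical cherry weights (kg) recorded at two collection nodes.
node_a = [12.5, 30.0, 7.5]
node_b = [20.0, 10.0]

# Self-decomposability: aggregating the union equals merging the partial results.
total_direct = aggregate(*SUM, node_a + node_b)
total_merged = SUM[1](aggregate(*SUM, node_a), aggregate(*SUM, node_b))

# The average is obtained by merging the (sum, count) pairs of each node first.
avg_merged = g(h_merge(aggregate(h_lift, h_merge, node_a),
                       aggregate(h_lift, h_merge, node_b)))
```

Here each node only needs to forward its partial result (a sum, or a sum and count pair) rather than its raw records. Averaging the two node averages directly would give a different answer, which is why the average is decomposable but not self-decomposable.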
2.10 Conceptual framework

The proposed data aggregation tool focuses on the data along the coffee supply chain. It starts with the collection of data from the data producer, in this case the farmer, at the coffee delivery center. At this point the actors involved are the farmer and the coffee collection clerk. After the delivered coffee cherries are weighed, the farmer is authenticated against the existing registration records and the weight is recorded in the data aggregation tool. If the farmer's account is valid within the location where the delivery is made, the new delivery record is accepted and processed; otherwise it is held in a suspense account for further verification. The new record is aggregated into the farmer’s previous record and pushed to a secure data store, from which further analytics, such as the payments to be made to the farmer, are processed. A transaction is processed through the data model, and notifications are sent to the farmer through the registered email account for future reference and monitoring of the transactions. Figure 2.5 shows the conceptual framework of the proposed data aggregation and tracking tool for coffee farmers in Kenya.

Figure 2.5: Conceptual Framework of the Data Aggregation and Tracking Tool for coffee supply chain

3. CHAPTER 3: RESEARCH METHODOLOGY

3.1 Introduction

Research methodology is described as the process of systematically identifying and solving problems. Bhatnagar & Singh (2013) describe it as the art or science of doing research.
This study was guided by the following objectives: investigating the processes and actors in the coffee supply chain; examining the current challenges, expectations and actions to be implemented in the aggregation and tracking of data in the coffee supply chain; identifying the technologies that can be aligned with the actions to be implemented to ensure an effective coffee supply chain in Kenya; and developing and testing the Data Aggregation and Tracking Tool for a coffee supply chain in Kenya. To achieve these objectives, various research methods were adopted following the Structured System Analysis and Design principles. The research was also guided by the nature of the problem being studied and the related work reviewed in chapter 2. The problem was conceptualized on the existing policy framework for the improvement of service delivery in the coffee subsector in Kenya, specifically on the recommendations of the task force formed to review the challenges affecting the coffee supply chain.

3.2 System Development methodology

Agile Development Systems Methodology

The study adopted the Agile system development methodology, which uses various iterative and incremental software development approaches and integrates methods such as Scrum, Crystal and lean development, among others. Specifically, the study used iterations to facilitate continued review of feedback during development. This was helpful in the refinement of subsequent phases of the tool development. In each iteration the key phases considered were planning, requirements analysis, design, building and testing. Figure 3.1 shows the phases of the agile methodology applied in this study.

Figure 3.1: Agile Software Development Methodology (Steljes, 2012)

3.3 Research Design

The research design describes how the study was conducted.
In this study, a quasi-experimental research design was adopted owing to the desire to manipulate various study constructs in order to build and test the data aggregation and tracking tool. Descriptive research approaches were used to generate the statistical outcomes and meet the data interpretation requirements. The study was based on the evidence of the problem documented by the National Task-force on coffee sub-sector reforms, which was formed by His Excellency the President vide Gazette Notice No. 1332 of 4th March 2016 and presented its report on 6th May 2016. The main source of data for this study was therefore secondary documents. In particular, the data sought was used to determine the processes and actors involved in the coffee supply chain, and to understand the current challenges, expectations and actions to be implemented in the aggregation and tracking of data in the coffee supply chain. Upon completion of the review of the existing documentary evidence, and using the literature reviewed, the kind of technology to be adopted, and hence the appropriate structure of the data aggregation and tracking tool, was proposed as detailed in chapter 4.

Planning phase

In the planning phase, reliable documents that fully describe the coffee supply chain were identified. The documents were reviewed and analysed to understand the various processes and actors involved in the coffee supply chain. The aim was to understand the problem correctly in its context.

System Requirement analysis Phase

After the planning stage, requirement analysis was conducted. This involved mapping the various processes to the correct actors in the coffee supply chain. The system analysis approaches used in this study adopted the UML-based specification. The UML syntax provides flexibility in representing system components, and it applies various approaches to the synthesis of models (Giese, 2018).
It involved the analysis of the system requirements and the representation of the identified requirements using the various system analysis tools. Since the coffee subsector has a regulator who determines the roles of each actor, it was critical to map the roles to the various processes identified in the planning phase. A general interaction model at the data exchange level was conceptualized. This was helpful in formulating the system design models as detailed in section 4.4. Further, the system requirements were categorized into functional and non-functional requirements as detailed in section 4.2.

System Design Phase

The system designs were developed using the outcome of the requirement analysis phase. Since the study adopted the SSAD methodology, the following system designs were used: the context diagram, to illustrate the high-level interaction of external actors with the system; the data flow diagram (DFD), to illustrate the processes and the data movement and storage between the system and the actors; and the entity relationship diagram (ERD), to demonstrate how the various actors will interact at the system logic level through data sharing/exchange based on the processes identified in the planning phase. The other system design adopted in this study was the system architecture, which shows how the various components will interact to form the entire ecosystem of the proposed tool.

Prototype and Testing

This phase focused on the implementation of the developed tool. The aim was to ensure that all identified functional requirements of the tool work as expected and that all study objectives have been achieved. The main testing methods used were unit testing, to test the functionality of each module; integration testing, to check how the various modules integrate to give the expected outcome; and audit testing, to ensure no errors are generated by the system.

3.4 Target Study Area

The study was based on the coffee subsector in Kenya.
Therefore, the unit of analysis in this study was the coffee farmer. Since the study built on existing study findings and was intended to propose a tool for implementing one of the recommendations of the government taskforce on the coffee subsector (ref. taskforce report, chap. 6), the original target study area of the taskforce was adopted as the experimental frame for this study. All the attributes considered by the original study were treated as the substantive attributes for this study and were thus used to design and develop the proposed tool.

3.5 Data and Data Collection Methods

The data used in this study was largely secondary data, and therefore secondary data collection instruments and methods were utilized. The methodology used in requirements gathering was the document review approach. By and large, qualitative data that focused on processes and action points was considered. This data was sourced from existing government policy documents that detail processes, taskforce reports and existing literature materials available at https://ushirika.go.ke/downloads/taskforce report on coffee reforms.

3.6 Data Analysis and Tool development

The main data analysis approach used in this study was descriptive analysis. The descriptive data analysis helped to establish the data aggregation functions at the different nodes of the supply chain. The algorithms used in the implementation of the tool were established using distributed data aggregation techniques, specifically decomposable functions. An Oracle application development platform, Oracle Application Express (APEX), was used to implement the developed algorithms and hence produce the data aggregation and tracking tool.

3.7 Research Quality

The quality of a research study depends on the reliability of the instruments used and the validity of the study outcomes.
The analysis of the reliability of this study examined the possibility of replicating the same instruments and methods used throughout the study to achieve consistent results. Validity, on the other hand, focused on checking the extent to which the tool truthfully and accurately processes what it was intended to measure. Content validity was mainly used to authenticate the research outcome by examining the test content to determine whether it aligns with the expected functioning of the tool as detailed in existing documents.

3.8 Ethical Considerations

The study focused on processes and actions and therefore did not collect data on any study subject. The data used in the testing phase was simulated based on the policy expectation of how the coffee supply chain should operate. Ethical approval for the research was sought from the Strathmore Institutional Ethics Review Committee (SU-IERC). The study outcome will be documented and made available in the library for access by other readers.

4. CHAPTER 4: SYSTEM ANALYSIS AND DESIGN

4.1 Introduction

The goal of the study was to develop a tool for data collection, records management, information sharing and the provision of appropriate, farmer-specific data analytics to improve transparency, efficiency and information integrity in coffee co-operatives. This chapter presents the system analysis and system design diagrams adopted in the development of the tool.

4.2 System Requirement Analysis

The system requirements analysis gives a guide on how the proposed tool will function and on the enabling system requirements. It therefore provides the basic building blocks for a system. The system requirements are categorized into two: functional and non-functional requirements.

4.2.1 Functional Requirements

i. The tool should have a module to register and keep grower (farmer) records.
ii. The tool should have different tables containing the growers’ data and transactional information including coffee cherry deliveries.
iii. The tool should keep coffee farmers’ profiles in participating factories.
iv. The tool should allow for online access through various web browsers and web configurations.
v. The tool should provide the online capability of viewing and monitoring all transactions.
vi. The tool should provide the ability to filter transactions.
vii. The tool should track and record data and check farmer registration status.
viii. The tool should generate coffee cherry delivery trends.

4.2.2 Non-functional Requirements

i. Availability: The tool should be available and accessible online in real time.
ii. Accessibility: The system should be accessible and updated centrally.
iii. Security: The data collection system should be highly secure, with inbuilt access controls for internal and external users, including an audit trail function.

4.3 System Architecture

A system architecture is composed of different information requirements, system components and other supporting technologies. It is therefore described as the basic framework of a system, which consists of components that work together to achieve a certain organizational system goal. Figure 4.1 shows the data aggregation and tracking system architecture.

Figure 4.1: Data Aggregation and Tracking tool Architecture

Figure 4.1 shows the application system components and their role in supporting the data aggregation and tracking process activities. The key parts of the architecture include the application concepts, the logical structure of the information systems that captures the roles of each actor, and the design of the information modules. The architecture also shows the following key components of the proposed tool: user management, communication management and tracking management.

4.4 System Designs

Based on the system architecture and the system analysis done in section 4.2, the system designs that graphically represent the system components and actors, the data flows and the storage in the proposed tool were developed.
The study adopted a Structured System Development methodology, and therefore the following system design diagrams were considered: the context diagram, the data flow diagram (DFD) and the entity relationship diagram (ERD).

4.4.1 Context Diagram

This is also referred to as the DFD level 0. It describes the operational space of the system. In the proposed system, the actors are divided into two categories, namely a) the actors who are directly involved in coffee production, that is, the farmers and the cherry collection centres; and b) the system administrators and ministry officials (regulators).

Figure 4.2: Transaction Data Tracking and Analytics Context Diagram

4.4.2 Level 1 DFD

The Level 1 DFD shows the decomposition of the context diagram to include data stores and the associated system processes. The main system processes are registration, cherry delivery/transaction, cherry tracking and farmer management. In the system, cherry delivery and tracking can only be performed by a registered actor.

Figure 4.3: Level 1 DFD for Transactional Data Tracking and Analytics Tool

4.4.3 Entity Relationship Diagram

The entity relationship diagram (ERD), which represents the conceptual data model of an information system, was adopted. The ERD provides a basic building block for the structure of the database used in the development of the proposed tool. It also depicts the interaction levels of the system entities. Figure 4.4 shows the ERD for the proposed system after normalization.

Figure 4.4: ERD for the proposed Tool

4.5 Wireframes

Wireframes, which are the basic layouts of the web pages, were designed to illustrate how the various entities and system elements will appear in the tool. They also demonstrate how the various system components will interact.
4.5.1 Login Page for Farmer

Figure 4.5: Login Page for Farmer

Figure 4.6: Home Page

4.5.2 Login Page for Admin

Figure 4.7: Admin Login

4.5.3 Home Page for Admin

Figure 4.8: Admin Home Page

5. CHAPTER 5: SYSTEM IMPLEMENTATION AND TESTING

5.1 Introduction

This chapter presents the implementation and testing of the proposed tool. The implementation section covers the system logic, explores the different parts of the system, and explains how they were implemented and how they function. The testing section focuses on usability testing and functional testing to verify whether the application attains the objectives of the proposed solution. The various system modules and screenshots are used to complement the explanations.

5.2 Implementation

The implementation of the data aggregation and tracking system was based on the system architecture, the DFD, the ERD and the database schemas. The tool was implemented as a web application built on the Oracle database using the Oracle Application Express platform. The algorithm used in the development of the system was based on the generalized model given by equation 5.1.

$$Y_{i,j} = \sum_{j \in D_s} X_{i,j}^{l}\, \beta_{i,j}^{s}, \qquad i = 1, \dots, n \ \text{and} \ j = 1, \dots, m \qquad \text{(eqn 5.1)}$$

where $l$ denotes the farmer’s attributes on which data values will be collected, $i$ denotes the cherry delivery attribute, $j$ denotes the farmer who delivers the cherries to cluster $s$, and $D_s$ represents the set of farmers in cluster $s$. $\beta_{i,j}^{s}$ denotes the coefficient of $X_{i,j}^{l}$, which is the value of the cherry delivery by the farmer.

Equation 5.1 is based on the general form of the decomposable function given by

$$f(X \uplus Y) = f(X) \diamond f(Y)$$

whose derived summation relationship is given as:

$$\mathrm{SUM}(X \uplus Y) = \mathrm{SUM}(X) + \mathrm{SUM}(Y)$$

The algorithm for the implementation of equation 5.1 is given by:

For each feature $l \in D_s$
1. Fix a cherry delivery item 1. For all other items $i \neq 1$, check if $j \in D_s$
2. If $X_{i,j}^{l}$ is valid for all $i \neq 1$, then assign $X_{i,j}^{l}$ to $j \in D_s$
3. If $X_{i,j}^{l}$ is invalid for all $i \neq 1$, then assign $X_{i,j}^{l}$ to $D_s$
End for
4. Obtain aggregate $Y_{i,j}$ based on the deliveries $X_{i,j}^{l}$ of farmer $j \in D_s$
Output: Aggregate $Y_{i,j}$ for $j$

A segment of the code showing how the aggregation function was implemented in the tool is given as follows:

    select e.*,
           sum(e.quantity) over (
               partition by farmerid
               order by deliverydate
               rows between unbounded preceding and current row
           ) as cumulative_quantity
    from cherry e

5.3 Tool Front-end

The front-end of the tool was designed based on the functional requirements of the tool as detailed in section 4.2. The web interfaces of the tool are arranged according to the modules of the tool. Further, the front-end interfaces were designed by applying the following principles of user interface design: familiarity (using familiar concepts and terminologies), context (allowing the user to know the context all the time), consistency (allowing the same function elements to give the same results), control (allowing the user to be in control of the interaction with the tool) and compatibility (having the ability to render on different devices). Further, a menu layout structure was adopted to enhance navigation of the tool's functional elements. Figure 5.1 shows the menu layout structure of the home page of the Data Aggregation and Analytics tool.

Figure 5.1: Menu layout structure of the home page of the Data Aggregation and Analytics tool

5.4 Registration Module

This module allows the system administrator to capture the farmer details based on the existing farmers registration framework as per the Government of Kenya regulations. This should be the primary record of all the coffee farmers in Kenya. Figure 5.2 shows the farmers registration module.

Figure 5.2: Farmers registration module
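As an illustration of how the registration record can gate data entry, and of what the cumulative query in section 5.2 computes, the following is a minimal Python sketch. It is not the APEX implementation: the farmer IDs, centre names, the `REGISTERED` mapping and the `suspense` list are hypothetical and stand in for the registration table and the suspense account described in section 2.10.

```python
from collections import defaultdict
from itertools import accumulate

# Hypothetical registration records: farmer id -> collection centre.
REGISTERED = {"F001": "Centre-A", "F002": "Centre-A"}

def record_delivery(ledger, suspense, farmer_id, centre, date, quantity):
    """Accept a delivery only if the farmer is registered at this centre;
    otherwise hold it in suspense for later verification."""
    if REGISTERED.get(farmer_id) == centre:
        ledger[farmer_id].append((date, quantity))
        return True
    suspense.append((farmer_id, centre, date, quantity))
    return False

def cumulative_quantity(ledger, farmer_id):
    """Running total for one farmer ordered by delivery date, mirroring
    sum(quantity) over (partition by farmerid order by deliverydate ...)."""
    rows = sorted(ledger[farmer_id])  # ISO date strings sort chronologically
    totals = accumulate(q for _, q in rows)
    return [(d, q, t) for (d, q), t in zip(rows, totals)]

ledger = defaultdict(list)
suspense = []
record_delivery(ledger, suspense, "F001", "Centre-A", "2022-03-01", 40)
record_delivery(ledger, suspense, "F001", "Centre-A", "2022-03-05", 25)
record_delivery(ledger, suspense, "F999", "Centre-A", "2022-03-05", 60)  # unregistered
record_delivery(ledger, suspense, "F001", "Centre-A", "2022-03-03", 10)

report = cumulative_quantity(ledger, "F001")
```

In this sketch the unregistered delivery never reaches the farmer ledger, so it cannot inflate any cumulative total, and each accepted row carries the running quantity that the farmer would see in a notification.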
5.5 Data Entry and Aggregation modules

The data entry and aggregation module allows the center clerk to enter the weight of the coffee cherry delivered after the tool has authenticated the farmer based on pre-recorded details, as shown in Figure 5.3. The tool then aggregates the data as a cumulative value of the previous coffee cherry deliveries. Figures 5.4 and 5.5 show the cherry delivery data entry form used by the collection clerk and the data aggregation interface.

Figure 5.3: Checking farmer Registration status

Figure 5.4: Cherry Delivery Capture form

Figure 5.5: Cherry Delivery Report aggregation

5.6 Tracking and Analytics Module

The tracking and analytics module allows the farmers to query the tool for specific information based on the transactions done on their account and to generate the required analytics using a selected filtering criterion. Figure 5.6 shows a sample of the coffee delivery information trend generated from the tool based on the supplied criteria.

Figure 5.6: Sample of the information generated from the tool based on the supplied criteria

5.7 System testing

The Agile system testing approach was adopted in this study. This method was used because it focuses on performance analysis based on the context of the developed tool and the work flows. Moreover, it is compatible with the system development methodology proposed in section 3.2. The specific tests performed included:

Compatibility testing

This was performed to check whether the developed tool was capable of rendering on different devices and web browsers. A summary of the test results is shown in Table 5.1.

Table 5.1: Compatibility test on Web browsers/platforms

Type of web browser                        Compatibility acceptance
Google Chrome                              Yes
Mozilla Firefox (version 8.0 or higher)    Yes

6. CHAPTER 6: DISCUSSION

6.1 Introduction

This chapter provides a discussion of the study outcomes as compared to the existing literature. A review of the specific objectives vis-à-vis the literature is given.
It further covers the advantages and limitations of the developed tool.

6.2 A Discussion of the findings of the Research objectives

The specific objectives of the study were to investigate the processes and actors in the coffee supply chain; examine the current challenges, expectations and actions to be implemented in the aggregation and tracking of data in the coffee supply chain; identify the technologies that can be aligned with the actions to be implemented to ensure an effective coffee supply chain in Kenya; and develop and test the Data Aggregation and Tracking Tool for a coffee supply chain in Kenya.

6.2.1 Processes and actors in the coffee supply chain in Kenya

From the reviewed documents, it was established that the main processes in the coffee supply chain in Kenya are coffee cherry delivery and collection, cherry processing, and marketing (auctioning) (reference Report on the National Coffee Task force). The main actors in the supply chain include farmers, the State Department for Cooperatives, cooperative movements, coffee factories, coffee millers, the Coffee Board of Kenya, Kenya Coffee Cooperative Exporters Limited, the Kenya Planters Cooperative Union and the Kenya Coffee Planters Associations. It was also determined that the actors performed the identified processes with overlapping mandates, leading to conflicts that leave the farmer exposed. This is more significant when data is not centrally collected and aggregated to give a clear picture of the data of a particular farmer.

6.2.2 Challenges and expectations of the coffee supply chain in Kenya

The main actor in the coffee supply chain is the farmer. The study therefore sought to determine the challenges and expectations of farmers with respect to the actions undertaken along the coffee supply chain. From the study outcome, it was established that the farmers faced a number of challenges.
These challenges included access to credit facilities, restrictive laws on establishing a coffee farm, production costs and the cost of inputs, the inefficient structure of the production and supply chain, unpredictable prices, and incomplete coffee supply chain data. These challenges are similar to the ones identified by Kegonde (2005). Other challenges identified included data integrity and unpredictable payment plans. The expectation of the farmers is to have a reliable supply chain that guarantees them a profit from their efforts.

6.2.3 Technologies for effective coffee supply chain in Kenya

It was determined that farmers are very receptive to the adoption of available technology to facilitate the management of the coffee supply chain. The main desire of the farmers is to be able to track their data right from cherry delivery to payment. The factors that could influence the type of technology to be adopted include perceived human errors, manipulation of data along the supply chain, openness in the management of data, availability of information when needed by the farmer, and continuous aggregation of the data. Most of these factors have been found to influence the type of system and technology adopted in the supply chain, as supported by the studies of Hartwich and Scheidegger (2010) and Jaana (2008). Therefore, the ideal types of information system for the supply chain would be transaction processing systems and order tracking and delivery coordination systems.

6.2.4 Key components of the proposed Data Aggregation and Tracking Tool for a coffee supply chain in Kenya

Based on the processes, actions, challenges and expectations of the farmers as detailed in sections 6.2.2 and 6.2.3, the proposed data aggregation and tracking tool was developed. The proposed tool has the following components: the data gathering module, data storage, the data model and analytics module, and the user interface module.
Relational database modelling techniques were used to design the database, which was developed on Oracle database management system principles using the Oracle Application Express platform. To create a conceptual model and the relationships between the various entities in the proposed tool, the entity-relationship model and relational techniques were used. To achieve a good user experience, the following user experience design principles were applied in designing the user interface module: the simplicity principle, to help design a simple and easy-to-use interface; the structure principle, to organize the user interface elements in a meaningful and useful manner; the feedback principle, to design an interface that keeps informing the user of actions and returns exceptions or errors; and the visibility principle, to design an interface that limits distraction from unnecessary information.

7. CHAPTER 7: CONCLUSION AND RECOMMENDATION

7.1 Introduction

This chapter gives an overview of the key outcomes of the study objectives. It also gives recommendations on the implementation of the tool based on the study outcome. Potential extensions to the study are proposed as future study areas.

7.2 Conclusions

The study established that the key processes involved in the coffee supply chain in Kenya are coffee cherry delivery and collection, cherry processing and marketing. All these processes depend on the quality of the data captured and on how the data is handled. Therefore, the integrity of the data is critical for the success of the coffee supply chain in Kenya. The coffee supply chain in Kenya is made up of a number of actors. Some of the actors are established by law, while others are a creation of the stakeholders in the coffee subsector.
The identified stakeholders included the farmers, the State Department for Cooperatives, cooperative movements, coffee factories, coffee millers, the Coffee Board of Kenya, Kenya Coffee Cooperative Exporters Limited, the Kenya Planters Cooperative Union and the Kenya Coffee Planters Associations. In the context of this study, the most critical actors are the farmers, the coffee factories (coffee collection centers) and the Coffee Board of Kenya. It was established that farmers experienced challenges that are systemic, structural and operational in nature. In the context of this study, the focus was more on operational challenges. Therefore, in designing the proposed tool, the challenges considered included incompleteness of data in the coffee supply chain, the inefficient structure of the supply chain, data integrity, and unpredictable payment data. The desire to have technology in the management of the coffee supply chain is largely driven by the desire to reduce perceived human errors and manipulation of data along the supply chain, increase the degree of openness in the management of data, enhance the availability of information when needed by the farmer, and support continuous tracking of the data aggregation. For these reasons, transaction processing systems and order tracking and delivery coordination systems are ideal for the coffee supply chain in Kenya.

7.3 Recommendations

The success of the coffee supply chain will depend on human, system and structural factors. A complete consideration and involvement of the entities and factors in these areas will be critical to the successful implementation of the proposed tool. The quality and integrity of the data gathered at the initial stages will determine the reliability and validity or quality of the information obtained at the end. It will be essential to have robust and reliable data collecti