MSc.SS Theses and Dissertations
Browsing MSc.SS Theses and Dissertations by Issue Date
- Developing pediatric prognostic model using finite mixture models (Strathmore University, 2017) Ogero, Morris Ondieki. Background: World Health Organization (WHO) guidelines recommend early identification of patients who have emergency features for early medical intervention with the aim of reducing child mortality and morbidity. Prognostic models have been developed for use in clinical settings, but their performance in external validations has been dismal. These poor performances have been attributed to suboptimal statistical methods used for derivation of these scores. Methods: The Bayesian finite mixture model was used to identify subpopulations in a population of 47,596 patients from different geographical regions. Mixed models were used to derive a final prognostic model taking into account subgroups of the population. Clinically relevant yet routinely available prognostic factors were used in model development. Results: Amongst the 23 risk factors used, the AVPU scale, which measures unconsciousness, was the strongest predictor of mortality (AOR = 2.94, 95% CI = 2.57 - 3.36). Oedema (AOR = 2.66, 95% CI = 2.18 - 3.24), pallor (AOR = 2.09, 95% CI = 1.86 - 2.36) and the presence of >= 3 severe comorbidities (AOR = 2.19, 95% CI = 1.73 - 2.74) were also associated with an increased risk of death. Conclusion: Given that patients are not alike, a statistical methodology that clusters patients into homogeneous subpopulations should be used to account for the inherent variability among medical patients. Computational methodology such as mixture models should be used to identify inherent subpopulations that underlie the population of medical patients under study. Limitation: The use of diagnostic episodes as one of the predictors in the model was based on the clinician's impression (not a laboratory test), thus the possibility of false positives could not be ruled out.
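The adjusted odds ratios quoted in this abstract are the kind of quantity a logistic-type model reports: the exponential of a regression coefficient, with a Wald interval computed on the log-odds scale. A minimal Python sketch follows. The function name `odds_ratio_ci`, the coefficient and the standard error are all hypothetical, chosen only for illustration and not taken from the thesis.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Return (odds ratio, lower, upper) for a log-odds coefficient and its standard error.

    The interval is a Wald interval built on the log-odds scale and then exponentiated.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical inputs (not from the thesis): log-odds 1.078, standard error 0.068.
or_, lo, hi = odds_ratio_ci(1.078, 0.068)
print(f"AOR = {or_:.2f}, 95% CI = {lo:.2f} - {hi:.2f}")
```

Because the interval is symmetric on the log scale, it is asymmetric around the odds ratio itself, which matches the shape of the intervals reported above.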
- Predictive modelling in credit risk: a survival analysis case (Strathmore University, 2017) Omoga, Allan Anyona. Six survival analysis techniques are assessed by applying them to a dataset consisting of 33,238 active credit facilities from a financial institution operating in Kenya. Namely, the Accelerated Failure Time (AFT) models, the Cox proportional hazards (PH) model and the Mixture Cure Model (MCM) are considered in the comparisons. The techniques are evaluated statistically, using the Area under the Curve (AUC), and financially, using annuity theory. The Cox PH model and the Mixture Cure Model perform significantly well.
- Modelling the structure of dependence of stock markets in BRICS & KENYA: Copula GARCH approach (Strathmore University, 2017) Otieno, Kevin Omondi. Background: Dependence structure is widely used to describe relationships between risks and provides estimates of risk for risk management purposes. Modeling the dependence structure of stock returns is a difficult task when returns have non-elliptical distributions. Objective: To examine the dependence pattern between the Kenya stock market return and BRICS stock market returns. Methods: In this dissertation, we estimated the dependence using copula GARCH, an approach that combines copula functions and GARCH models. We applied this method to stock market returns consisting of the stock indices of Brazil, Russia, India, China and South Africa (BRICS) and the Kenya stock market. We first modelled the marginal distribution of each return series using different GARCH(1,1) specifications. A copula was then used to analyze the dependence between the BRICS stock market returns and Kenya stock market returns using the standardized marginal distributions derived from GARCH(1,1) residuals. The best fitting copula was determined using the log likelihood or AIC. Results: Empirical results showed that the GJR-GARCH model provided the best fit for Brazil, Russia, China and Kenya while the E-GARCH model provided the best fit for India and South Africa. As for modeling the dependence structure, the Student t copula provided the best fit for the marginal distributions of the returns. Conclusion: Marginal models showed the presence of volatility clustering, which vanishes after crisis. To capture the dependence structure for bivariate data sets, the Student t copula was considered to be the appropriate copula function. Recommendation: Further research should be extended to examine the multivariate structure, a joint distribution of BRICS in terms of Multivariate GARCH.
Also, research should focus on specific time periods in order to ensure effectiveness in measurement and management of risks.
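The first stage described in this abstract, filtering each return series through a GARCH(1,1) model and standardizing the residuals, can be sketched in a few lines of Python. This is plain GARCH(1,1), not the GJR-GARCH or E-GARCH variants the thesis selected. The parameter values and the return series are hypothetical, and starting the recursion at the unconditional variance is an assumption of this sketch.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances: sigma2[t] = omega + alpha * e[t-1]**2 + beta * sigma2[t-1]."""
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be < 1 for a finite unconditional variance")
    # Assumption: initialise at the unconditional variance omega / (1 - alpha - beta).
    sigma2 = [omega / (1.0 - alpha - beta)]
    for e in residuals[:-1]:
        sigma2.append(omega + alpha * e ** 2 + beta * sigma2[-1])
    return sigma2

def standardize(residuals, sigma2):
    """Divide each residual by its conditional standard deviation."""
    return [e / s ** 0.5 for e, s in zip(residuals, sigma2)]

# Hypothetical demeaned daily returns and parameters (not estimated from any data).
returns = [0.010, -0.020, 0.015, -0.030, 0.005]
variances = garch11_variance(returns, omega=1e-5, alpha=0.10, beta=0.85)
z = standardize(returns, variances)
print([round(v, 3) for v in z])
```

In a copula GARCH workflow, the standardized residuals would next be mapped to the unit interval (for example through their fitted or empirical distribution function) before a copula such as the Student t is fitted to the pairs.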
- Determine the breaking point of Kenya debt an application of extreme value theory (Strathmore University, 2017) Mathenge, Jacqueline Wachuka. The aim of the study is to determine the breaking point of Kenyan public debt through the use of Extreme Value Theory (EVT). EVT focuses on the tail end of distributions in order to identify maxima and minima. With debt levels rising since devolution, from Kenya Shillings (KES) 500 billion in 2013 to KES 2.5 trillion in 2015, and with warnings from international bodies such as the International Monetary Fund (IMF) and the World Bank on rising debt levels, there is a need to determine the sustainability of debt beyond analyst speculation. The special case of EVT known as the Generalized Extreme Value (GEV) application looks at a degenerate distribution factor, thus ensuring that the tail end of the distribution, that is, the maxima, converges to the GEV regardless of the distribution of the data set (no assumption is made on the distribution of the data set). From the study, the Gumbel model was determined to be the most appropriate model and, with a 95% threshold, the GEV projected the total debt maxima to be KES 5 trillion. This is evidence that the current debt level of KES 2.5 trillion is still sustainable but should nevertheless be monitored.
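For a Gumbel model, a projection at a given probability level has a closed form: the quantile is x_p = mu - sigma * ln(-ln(p)). The Python sketch below shows that calculation. The location `mu` and scale `sigma` are hypothetical placeholders in trillions of KES, not the parameters fitted in the thesis, and reading the "95% threshold" as the 0.95 quantile is an assumption of this example.

```python
import math

def gumbel_cdf(x, mu, sigma):
    """Gumbel (GEV with shape 0) distribution function."""
    return math.exp(-math.exp(-(x - mu) / sigma))

def gumbel_quantile(p, mu, sigma):
    """Level exceeded with probability 1 - p under a Gumbel(mu, sigma) model."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return mu - sigma * math.log(-math.log(p))

# Hypothetical parameters (trillions of KES), for illustration only.
level_95 = gumbel_quantile(0.95, mu=2.0, sigma=1.0)
print(f"95% level: KES {level_95:.2f} trillion")
```

The quantile function is the exact inverse of the distribution function, so feeding the result back into `gumbel_cdf` returns the chosen probability.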
- Spatial analysis of tuberculosis amongst teenagers in Kenya (Strathmore University, 2018) Muriithi, Elvis. Tuberculosis (TB) is the ninth leading cause of death worldwide and the leading cause from a single infectious agent, ranking above HIV/AIDS. Despite being preventable and curable, TB is the leading cause of death from infectious disease globally, with nearly 10 million people developing TB and 1.5 million people dying from TB in 2014. Among teenagers, it is a top fifteen cause of mortality. In TB surveillance, much emphasis is given to adults, especially people living with HIV. This study provides a spatial model to determine the dynamics of the TB epidemic among the notified teenage TB cases in Kenya over three years (2013, 2014 and 2015). The conditional autoregressive model emerges as the best model compared to the generalized mixed model. Our results show that TB incidence remained constant over time and that there is not much variation by sex. We found a significant pattern of the TB epidemic by county.
- Spatial models for infants HIV/AIDS incidence using an integrated nested laplace approximation approach (Strathmore University, 2019) Mutua, Susan Nzula. Background: Kenya has made significant progress in the elimination of mother to child transmission of HIV through increasing access to HIV treatment and improving the health and wellbeing of women and children living with HIV. Despite this progress, broad geographical inequalities in infant HIV outcomes still exist. This study aimed at assessing the spatial distribution of HIV amongst infants, areas of abnormally high risk and associated risk factors for mother to child transmission of HIV. Methods: Data were obtained from the Early Infant Diagnosis (EID) database, which is routinely collected for infants under one year, for the year 2017. We performed both areal and point-reference analysis. Bayesian hierarchical Poisson models with spatially structured random effects were fitted to the data to examine the effects of the covariates on infant HIV risk. Spatial random effects were modelled using the conditional autoregressive (CAR) model and stochastic partial differential equations (SPDEs). Inference was done using Integrated Nested Laplace Approximation. Posterior probabilities for exceedance were produced to assess areas where the risk exceeds 1. The Deviance Information Criterion (DIC) was used for model comparison and selection. Results: Among the models considered, the CAR model (DIC = 306.36) performed better in terms of modelling and mapping HIV relative risk in Kenya. The SPDE model outperformed the spatial GLM model based on the DIC statistic. The map of the spatial field revealed that the spatial random effects cause an increase or a decrease in the expected disease count in specific regions.
Highly active antiretroviral therapy (HAART) and breastfeeding were found to be negatively and positively associated with infant HIV positivity, respectively (-0.125, 95% credible interval: -0.348 to -0.102; 0.178, 95% credible interval: -0.051 to 0.412). Conclusion: The study provides relevant strategic information required to make investment decisions for targeted high impact interventions to reduce HIV infections among infants in Kenya.
- A Bayesian approach to Geo spatial analysis of HIV viral load data (Strathmore University, 2019) Kareko, Joy Hilda Mukami. HIV is currently ranked among the leading causes of death in Kenya and in the world, with an estimated 1.5 million Kenyans living with HIV and 28,000 deaths recorded annually as a result of AIDS-related illnesses. In 2014, UNAIDS launched the 90-90-90 strategy, whose aim was to diagnose 90 per cent of all HIV-positive persons, provide antiretroviral therapy (ART) for 90 per cent of those diagnosed, and achieve viral suppression for 90 per cent of those treated by 2020. This study is motivated by the need to assess the third 90, viral suppression for 90 per cent of those treated with ART, and seeks to analyze one statistical paradigm (Bayesian) that has conventionally been used for geospatial trends. The Bayesian approach has been used previously to assess the prevalence and incidence of diseases; however, this dissertation seeks to evaluate the Bayesian approach to spatial trends of HIV viral load suppression in Kenya. We revisit the theoretical framework of the Bayesian approach and apply it to real data from the Kenyan setting spanning 2012 to 2017. Results show the Bayesian approach to be robust and in-depth, and to convey more information when modelling spatial trends of viral load suppression. Further, first-line ART regimen, HIV-TB co-infection and retention rates are significant predictors of the spread of viral load suppression.
- Distributions of zero-inflated models with application to HIV exposed infants (Strathmore University, 2019) Nekesa, Faith Victory. Data with excess zeros are commonly found in many disciplines, including public health. Several models have been proposed for analyzing this kind of data. The World Health Organization (WHO) indicates that the majority of the 1.8 million children currently living with HIV in sub-Saharan Africa acquired the virus from their mothers, probably during pregnancy, delivery or breastfeeding, but the study shows there is a drop in the rate of infections due to interventions that have been put in place. Here we attempt to fit zero-inflated models to data in this setting. The objective is to systematically compare distributions of the various zero-inflated models with an application to HIV Exposed Infants (HEI). We revisit zero-inflated models, conduct simulations and apply the models to HEI data. The models' performance was evaluated by the Akaike Information Criterion (AIC). The simulation results indicated that ZAP had the lowest AIC value, 467.95, at 80% of zeros. On the real data, ZAP was also the best fit since it had the lowest AIC value. From the AIC values in both the simulation results and the real data results, it is clear that ZAP is the best fitting model.
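The model comparison described in this abstract rests on two ingredients: a count distribution that puts extra mass at zero, and the AIC, defined as 2k - 2 log L for k parameters. The Python sketch below uses the zero-inflated Poisson (ZIP) as the simplest member of the family. It is not the ZAP model the thesis favours, and the counts and the ZIP parameter values are hypothetical. In practice the log-likelihood would be evaluated at maximum-likelihood estimates, whereas here the ZIP parameters are fixed by hand.

```python
import math

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: extra zero mass pi on top of a Poisson(lam) count."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

def zip_loglik(counts, lam, pi):
    """Log-likelihood of independent counts under ZIP(lam, pi)."""
    return sum(math.log(zip_pmf(k, lam, pi)) for k in counts)

def aic(loglik, n_params):
    """Akaike Information Criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * loglik

# Hypothetical counts with many zeros (not HEI data).
counts = [0, 0, 0, 0, 1, 0, 2, 0, 0, 3]
mean = sum(counts) / len(counts)
aic_poisson = aic(zip_loglik(counts, lam=mean, pi=0.0), n_params=1)  # plain Poisson
aic_zip = aic(zip_loglik(counts, lam=1.5, pi=0.6), n_params=2)       # hand-picked ZIP
print(f"Poisson AIC = {aic_poisson:.2f}, ZIP AIC = {aic_zip:.2f}")
```

Setting `pi=0.0` recovers the ordinary Poisson model, which is why the same function serves for both fits.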
- Assessing efficient odds ratios: an application to surgical stage prediction in cervical cancer (Strathmore University, 2020) Jesang, Jean C. Background: Cervical cancer remains the second most commonly diagnosed cancer and the third leading cause of cancer death in developing countries. Improving clinicians' knowledge and understanding of surgical staging is critical in the fight against the disease. Kenya has limited research on accurately predicting the surgical stage following surgical treatment for cervical cancer. The uptake of predictive mechanisms by gynecologists has not been common. Objective: To assess prediction by comparing the odds ratios of three popular ordinal regression models, i.e. the Multinomial Logistic Regression (MLR) model, the Continuation Ratio (CR) model and the Adjacent Category Logistic (ACL) model, when applied to cervical cancer data in surgical stage prediction. Method: We systematically compared the performance of the MLR, CR and ACL models as predictive mechanisms and evaluated the most appropriate model in the cervical cancer setting. The study considered women who visited the Oncology department at the Moi Teaching and Referral Hospital's Chandaria Cancer and Chronic Diseases Center and were diagnosed and surgically treated for cervical cancer from January 2014 to December 2018. Results and conclusion: We presented the comparison between 3 different regression models for ordinal data within the cervical cancer setting. We chose to carry out both an inferential and a predictive approach. The inferential approach found that the CR model without proportional odds yielded better results when comparing the Akaike Information Criterion (AIC), log likelihood ratio and residual deviance. In addition, the key prognostic factor associated with invasive cervical cancer was the FIGO clinical stage, which, in particular, had a higher influence on the surgical stage 2 outcomes compared to the lesser surgical stage categories.
All the 5 independent features selected for classifying the patients into surgical stages were the FIGO clinical stage and partly, the presence or absence of cancer of symptomatic vaginal discharge. However, the predictive approach found that the MLR, CR and ACL models were not statistically different and not suitable for the prediction of the surgical stage among the women surgically treated for cervical cancer.
- Identifying the best method to correct for missing data, a case of HIV/TB co-infection in Kenya (Strathmore University, 2020) Mwaro, Joshua Owuori. Having missing information is almost inevitable in research, but many researchers only report on complete cases. Here we review missing data theory and missingness characteristics, look at the background and the importance of studying missing data and the most common ways of correcting for it, and then extend these to the Kenyan HIV/TB co-infection setting. We review most of the existing methods of dealing with missing data and what other scholars have done in the missing data area. In the methodology section, we outline the characteristics and features of the four methods for dealing with missing data on which our study is focused (analysis of complete cases only, mean/single imputation, the MLE method, and multiple imputation). We test the four methods on simulated data and then apply the same procedure to the real Kenyan HIV/TB co-infection data. Results showed that analysis of data corrected for missingness using complete cases only, the weighted method, the likelihood-based method, and multiple imputation estimated the Kenyan HIV/TB co-infection rate to be 29%, 27%, 26%, and 21% respectively. The results indicate that multiple imputation (MI) is the best approach to correct for missing data as it does not overestimate the HIV/TB co-infection rate.
- The Zero Inflated Negative Binomial - Shanker distribution and its application to HIV exposed infant data (Strathmore University, 2020) Kibika, Stella Andia. Motivated by HIV exposed infants (HEI) sero-conversion data, we provide an extension of the Zero-Inflated Negative Binomial (ZINB) distribution to the Zero-Inflated Negative Binomial-Shanker (ZINB-SH) distribution. We review the classical Poisson and negative binomial distributions as applied to count data, and their zero-inflated versions. After reviewing the conceptual and computational features of these methods, we generate a new extension which is intrinsically a combination of the Zero-Inflated Negative Binomial and Shanker distributions. In this setting, the ZINB-SH distribution provides an alternative to the Poisson-Shanker distribution, in particular when data exhibit overdispersion brought about by excess zeros. The HIV exposed infant data are characterized by both structured and non-structured zeroes, which makes this feature ideal in this context. We describe the properties of the ZINB-SH distribution and estimate its parameters. Extensive simulations were conducted and the results compared, in terms of goodness-of-fit, to the standard Negative Binomial, Shanker, Zero-Inflated Negative Binomial and Negative Binomial-Shanker distributions. The ZINB-SH distribution is competitive under different simulation settings and does well as sample size increases. To validate the distribution we apply it to typical real HIV exposed infant data.
- Forecasting Kenya’s GDP using a hybrid neural network and ARIMA model (Strathmore University, 2020-03) Ngige, Isabel Wanjiru. Background: Gross Domestic Product (GDP) is the market value of goods and services produced within a selected geographical area, usually a country, in a selected interval of time, often a year, and can be measured and forecasted in different ways for use by governments and other market participants. Specific users of information on GDP analysis include the United Nations' Sustainable Development Goal assessment, whose key indicator is economic growth as measured by GDP, and the joint International Monetary Fund-World Bank methodology for conducting standardized debt-sustainability analyses in low-income countries. Objective: The main objective of this study was to assess the superiority, as suggested by the literature, of a hybrid Autoregressive Integrated Moving Average (ARIMA) and feed-forward Artificial Neural Network (ANN) model over a pure ARIMA model in forecasting Kenya's GDP. Methods: The ARIMA and the additive ANN-ARIMA hybrid models are used to forecast absolute GDP values, and the comparative forecast accuracy is tested using the RMSE and visualization plots. The Box-Jenkins methodology is used to fit the ARIMA model while the feed-forward Neural Network Autoregressive (NNAR) structure is used to model the neural network portion of the hybrid model.
- Analysis of recurrent events with associated informative censoring: application to HIV data (Strathmore University, 2020-06) Ejoku, Jonathan. In this study, we adapt a commonly used Cox-based model for recurrent events, the Prentice, Williams and Peterson Total-Time (PWP-TT) model, which has largely been used under the assumption of non-informative censoring, and evaluate it under an informative censoring setting. Empirical evaluation was undertaken with the aid of the semi-parametric framework for recurrent events suggested by Huang and Wang (2004), where a subject-specific latent variable is used to model the association between the recurrent event and the hazard of the failure time. All implementations were made in RStudio, using the reReg package (Chiou and Huang, 2019) with the method in the reReg function set to 'cox.HW'. For validation we used HIV data from a typical HIV care setting in Kenya. Results show that the PWP-TT model generally fit the data well, with a comparison to the Andersen-Gill method showing similar estimates, while the ordinary Cox model estimates were too unreliable.
- Classification of X-rays images using Deep Convolutional Neural Network: COVID-19 (Strathmore University, 2021) Bore, Laban Kipchirchir. The increased number of labeled X-ray image archives has triggered increased research work in the application of statistics, machine learning, deep learning, and computer vision across different domains. Recent studies applying deep transfer learning (60) with CNNs to detect and classify a few COVID-19 datasets have had major success. COVID-19 datasets have been collected since the outbreak of the COVID-19 virus in the fourth quarter of 2019. The COVID-19 virus has complicated the diagnosis, treatment, and care of patients because there is no cure and the virus mutates into different fatal variants. This has led to thousands of deaths and increased admissions to hospital beds, ICUs, and other health facilities. Hundreds of thousands of new infection cases are reported daily across the world. The overburdening of the health system by the COVID-19 virus has made access to other health services difficult in the under-served world (89). Traditionally, medical doctors carry out several tests, such as full blood count tests to ascertain whether the body is fighting certain pathogens, sputum tests, and chest X-rays. Doctors will examine patients' medical history and carry out physical exams such as listening to the lungs with a stethoscope for abnormal crackling sounds. The success of this traditional diagnosis process depends on the doctors' experience and skills, the quality of X-ray images, and the availability of patients' historical records. This is almost unattainable and unsustainable in the under-served countries in Africa. The motivation of this paper is to complement the traditional diagnosis and analysis of chest X-ray images by introducing machine classification approaches and the state-of-the-art deep residual network ResNet18 (14, 35).
According to WHO (58), diagnosis is a process and requires classification steps to inform research, health policies, and care of the patients. An alternative definition is a "pre-existing set of categories agreed upon by the medical profession to designate a specific condition" (43). We applied a statistical learning model to separate and classify all the X-ray images with patchy areas into one distinct class for further research, examination, analysis, and care of the patients. The observed white patchy areas in our X-ray images were our statistical variables of interest in classifying chest X-ray images into COVID-19 and non-COVID-19, pg 3.2. In addition, the final model can be replicated on other non-COVID datasets and extended to other related classification tasks. A deep CNN classification model (ResNet18), as a subfield of non-parametric statistics, was used for classifying and predicting COVID-19 positive images. The datasets used, COVID-19 positive (184 cases) and COVID-19 negative (5000 cases), were aggregated from different sources. The COVID-19 negative cases were from 10 disease categories (Pneumonia, Pneumothorax, Lung opacity, Fracture, Atelectasis, Edema, pleural, etc). The fine-tuned deep CNN model (ResNet18) performed well, with precision (87.5%), sensitivity (75%) and specificity (99.8%). Rerunning the model using larger datasets by adding noise through data augmentation demonstrated sensitivity (90%) and specificity (100%). Hence, when more data are fed into the neural model, classification performance measures such as precision, AUC and recall improve significantly. This classification model can be used to aid radiologists or medical practitioners in chest X-ray image diagnosis and treatment (59) by categorization, diagnosis, detection, and prediction.
Further extension of this research work will focus on using larger COVID-19 or non-COVID-19 datasets with more focus on systematic review around data acquisition, data certification, model development and pitfalls, and explanation construction (39).
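The precision, sensitivity and specificity figures quoted in this abstract are all ratios taken from a binary confusion matrix. A minimal Python sketch of those definitions follows. The function name `classification_metrics` and the counts are hypothetical and do not reproduce the thesis's test split.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, sensitivity (recall) and specificity from confusion-matrix counts."""
    return {
        "precision": tp / (tp + fp),      # share of predicted positives that are truly positive
        "sensitivity": tp / (tp + fn),    # share of true positives that are detected
        "specificity": tn / (tn + fp),    # share of true negatives that are correctly rejected
    }

# Hypothetical counts for a rare positive class (illustration only).
metrics = classification_metrics(tp=30, fp=5, fn=10, tn=955)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With a rare positive class, specificity can sit close to 100% even when a noticeable fraction of positives is missed, which is why sensitivity is reported alongside it.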
- Robust statistical learning for optimal classification of imbalanced data (Strathmore University, 2021) Juma, Samuel Wanyonyi. Neurobiological disorders such as Learning Disabilities (LD) are increasingly becoming a major concern in the education and health sectors; hence, precise identification of these disorders is critical. While neuropsychological assessments play an important role in diagnosis, there are limited conventional methodologies for test administration, scoring and interpretation of results. Consequently, there is frequent misclassification of children due to imprecise distinction between children with learning disabilities and those with learning difficulties. This research sought to apply statistical and Machine Learning (ML) approaches to strengthen the LD diagnostic process. This research addresses the challenges of learning from imbalanced data, a characteristic often associated with LD data due to the low prevalence of the disorder. Imbalanced data pose a challenge in designing efficient ML solutions since standard classification models assume fairly distributed classes. The study used experimental design to identify a suitable base learner and corrective technique to tackle the challenge of imbalanced data. The statistical experiments performed were based on secondary data obtained from a Baseline Survey on Learning Disabilities conducted by the Kenya Institute of Special Education in 2019. It was found that the Support Vector Machine (SVM) is the best base learner for imbalanced data, with the highest classification efficiency compared to other classification models. For data with high dimensionality, it was found that the classification power of the Artificial Neural Network (ANN) was better than that of SVM despite the need for significantly higher computational effort. When data dimensionality is reduced, it was observed that the classification power of ANN reduces significantly.
SVM was also found to be a more flexible model whose classification power is least affected by changes in data dimensionality. It was found that both Adaptive Boosting (AdaBoost) and Adaptive Synthetic Sampling (ADASYN) equally perform well in tackling the imbalanced data, with AdaBoost performing slightly better, although the difference was not statistically significant. The study concludes that SVM and ANN can be used to model highly imbalanced data to achieve the highest classification accuracy with respect to the minority class. ADASYN and AdaBoost methods can be used jointly to build a more robust corrective algorithm to tackle highly imbalanced data.
- Using semi-Markov process to model incremental change in HIV staging with cost effect (Strathmore University, 2021) Andrew, Joram Malului. Over the past years, parametric and non-parametric methods have been used in modelling cost and effectiveness according to one studied event or one health state. In this study we used a semi-Markov model in which the distributions of sojourn times are explicitly defined. The Weibull distribution was chosen and used in modelling the hazard function for each transition. Using a regression model for cost, a cumulative cost function was developed, enabling us to determine the estimated mean cost per patient in each state defined in the semi-Markov model. The incremental cost-effectiveness ratio (ICER) was used for cost-effectiveness analysis in comparing two follow-up strategies (patients in DCM and patients not in DCM). Using viral load (VL), three states were defined, VL < 200ml, 200ml < VL < 1000ml and VL > 10000ml, together with an absorbing state, death. The mean cost per patient for states 1, 2 and 3 was $765, $829 and $1395 respectively. The calculated ICER was $483.8268 per life-year saved. Keeping patients in state 1 (on DCM) was relatively cheaper and more efficient compared to the other states.
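The ICER reported in this abstract is, by definition, the difference in cost between two strategies divided by the difference in their effect, here life-years. The Python sketch below shows the arithmetic. All four inputs are hypothetical and are not the thesis's cost or survival estimates.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("the strategies have identical effects; the ICER is undefined")
    return (cost_new - cost_old) / delta_effect

# Hypothetical per-patient costs (USD) and life-years for two follow-up strategies.
ratio = icer(cost_new=1200.0, cost_old=900.0, effect_new=10.5, effect_old=9.9)
print(f"ICER = ${ratio:.2f} per life-year saved")
```

A positive ICER with a positive effect difference means the new strategy buys additional life-years at that unit price, which is then judged against a willingness-to-pay threshold.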
- Predictive modeling of Logistics Performance Index using Sparse Regression Models (Strathmore University, 2021) Odok, Eric Oyenga. The Logistics Performance Index (LPI), developed by The World Bank, is the only interactive benchmarking tool countries use to identify challenges and opportunities in trade logistics. It was developed using Principal Component Analysis and is a mean of highly correlated variable scores; this poses two major problems: the susceptibility of mean-computed measures to outliers, and multicollinearity in prediction leading to overfitting. It is therefore critical to choose prediction techniques carefully. Regression is one of the many techniques that can reliably predict the correct LPI. This paper assessed four regression models through median-computed LPI, which is less vulnerable to outliers: the multiple linear regression model (MLRM), the ridge regression model, the elastic net model and the LASSO model. The first observation was that mean- and median-computed LPIs were not different; in prediction, they both overfitted in the test data. Mean-computed LPI, however, overfitted more than median-computed LPI. MLRM used all six variables to produce the best fit to the training set (RMSE = 0.0497, AIC = -952); however, when tested on unseen data, it was the least precise (RMSE = 0.0438). On the other hand, LASSO did not fit the training set well (RMSE = 0.3627, AIC = -318) but was the most precise predictive model (RMSE = 0.0436). LASSO, through variable shrinkage and selection, eliminated one irrelevant variable, timeliness. The two models were not significantly different (P = 0.2951, at 95% CI); the value added by LASSO was parsimony. While MLRM used all six variables, LASSO used five to generate similar models. Policymakers could reliably use the top three variables that explained 80% of the variability in the model: logistics quality, infrastructure and tracking.
Improving the physical infrastructure, increasing logistics management skills, and implementing intelligent technologies could improve trade competitiveness.
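The way LASSO removes a variable such as timeliness can be seen in its simplest setting. With orthonormal predictors and the objective (1/2)||y - Xb||^2 + lambda * sum|b_j|, each LASSO coefficient is the ordinary least squares coefficient passed through the soft-thresholding operator. The Python sketch below applies that operator. The coefficient values and the penalty are hypothetical, and the thesis's data were not orthonormal, so this illustrates the mechanism rather than its fitted model.

```python
def soft_threshold(z, lam):
    """Soft-thresholding: shrink z toward zero by lam, and set it to zero if |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

# Hypothetical least squares coefficients for four LPI components (illustration only).
ols = {"logistics quality": 0.42, "infrastructure": 0.35, "tracking": 0.28, "timeliness": 0.03}
lam = 0.05
lasso = {name: soft_threshold(beta, lam) for name, beta in ols.items()}
kept = [name for name, beta in lasso.items() if beta != 0.0]
print(kept)
```

Coefficients smaller in magnitude than the penalty are set exactly to zero, which is what distinguishes LASSO from ridge regression, where coefficients shrink but are not eliminated.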
- A Joint modelling approach of monthly anthropometry and time to death among hospitalized severe malnourished children in Kenya (Strathmore University, 2021) Maronga, Christopher Sianyo. Background: In follow-up studies, interest often lies in understanding the association between biomarkers measured over time and a time-to-event outcome. For this, a two-stage separate analysis or time-dependent Cox models are often used. The former approach does not account for shared features between the two processes while the latter ignores the endogeneity in the biomarker, resulting in inefficient and biased estimates. The objective of this project was to fit joint models on longitudinal anthropometry and time to death among children hospitalized with complicated SAM in four hospitals in Kenya. Methods: Data from a randomised placebo-controlled trial for 1,778 children aged 2 to 59 months admitted to hospital with complicated Severe Acute Malnutrition (SAM) but without HIV were analysed. We used linear mixed effects models to model longitudinal anthropometry and a Cox proportional hazards model to assess the effect of a priori selected baseline covariates on mortality. The two models were linked through current value and slope association to create a joint model used to study the effect of longitudinal anthropometry on risk of death. Results: The joint model results showed that a one-centimetre gain in monthly mid-upper arm circumference (MUAC) was associated with a 46.8% reduction in the hazard of death, 0.532 (95% CI: 0.476 - 0.596), while a unit gain in standard deviation (SD) for weight-for-height (WHZ) was associated with a 37.1% reduction in the risk of death, 0.629 (95% CI: 0.579 - 0.683). A unit gain in SD for monthly weight-for-age (WAZ) and height-for-age (HAZ) was associated with 21.2%, 0.788 (95% CI: 0.742 - 0.837) and 2.5%, 0.227 (95% CI: 0.008 - 6.556) reduction in risk of mortality respectively.
Conclusion: In studying the relationship between a survival outcome and covariates, researchers often use baseline values of the covariates, which fails to account for the interdependencies. Using a joint modelling framework, we quantified the association between four longitudinal anthropometric measures and risk of death. Through current value and slope association, MUAC and WHZ, respectively, have the strongest association with risk of death and hence are better metrics that can be used to screen and identify high-risk children.
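Reading the reported estimates for MUAC, WHZ and WAZ as hazard ratios, the percentage reductions quoted in this abstract follow from the identity reduction = (1 - HR) x 100. The short Python sketch below checks that arithmetic using the three values from the abstract. The function name is hypothetical.

```python
def percent_hazard_reduction(hazard_ratio):
    """Percentage reduction in hazard implied by a hazard ratio below 1."""
    return (1.0 - hazard_ratio) * 100.0

# Hazard ratios as reported in the abstract for a unit gain in each measure.
reported = [("MUAC", 0.532), ("WHZ", 0.629), ("WAZ", 0.788)]
for name, hr in reported:
    print(f"{name}: {hr:.3f} -> {percent_hazard_reduction(hr):.1f}% lower hazard")
```

The three results round to 46.8%, 37.1% and 21.2%, matching the figures stated for MUAC, WHZ and WAZ.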
- Forecasting of the inflation rates in Kenya: a comparison of ANN, ARIMA and SARIMA (Strathmore University, 2021) Kogei, Victor Kiprono. Monetary policies such as price stability are regulated by the Central Bank of Kenya (CBK). Price stability is a key indicator of stable and predictable inflation. Accuracy and reliability in forecasting inflation rates, or in predicting their trend correctly, are essential to investors, academia and policymakers. This calls for models that predict inflation rates accurately in order to spur investment and economic growth. Intelligence-based models have been found to be robust in forecasting financial and economic series such as inflation rates and stock prices. This research therefore employs an artificial neural network (ANN) to forecast inflation rates in Kenya and compares its performance with the statistical models ARIMA and SARIMA. Artificial neural network models emulate the information processing capabilities of the neurons of the human brain, which makes them flexible enough to map inputs to outputs well. A major advantage of ANNs is their ability to capture linear and non-linear patterns in data due to their lack of assumptions, unlike statistical models. Inflation rates, Gross Domestic Product (GDP) and exchange rates were the variables used. The variables are monthly data from January 2012 to February 2021. The prediction performance of the three models was evaluated through RMSE, MAE and MAPE. The results obtained show that artificial neural networks outperformed the ARIMA and SARIMA models. The implication is that the government can adopt an artificial neural network for forecasting inflation rates in Kenya.
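The three accuracy measures named in this abstract have standard definitions: RMSE is the square root of the mean squared error, MAE is the mean absolute error, and MAPE is the mean absolute error expressed as a percentage of the actual value. A minimal Python sketch follows. The inflation figures are hypothetical and the function names are chosen for illustration.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: penalises large misses more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error, in the units of the series."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error; undefined if any actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical monthly inflation rates (%) and forecasts, for illustration only.
actual = [5.8, 6.1, 5.9, 6.4]
forecast = [5.6, 6.3, 6.0, 6.0]
print(f"RMSE = {rmse(actual, forecast):.3f}, MAE = {mae(actual, forecast):.3f}, "
      f"MAPE = {mape(actual, forecast):.2f}%")
```

Comparing models on all three is common because RMSE and MAE depend on the scale of the series, while MAPE is scale-free but unstable when actual values are near zero.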
- Modeling of count data with an informative time component in the presence of overdispersion (Strathmore University, 2022) Owiti, Levi Alfred Orero. In real-world count data, several methods have been applied to handle the common problem of overdispersion. However, these methods have not comprehensively considered unique features that may exist in the data. This study sought to address robust statistical modelling of count response data that contain temporal features. The study proposed a Bayesian Negative Binomial model that handles overdispersion while taking into account the temporal features of the data. Two count data models were compared and extended to incorporate an informative time component. To test the various models, this study conducted simulation studies under specified parameters to examine how the models behave under certain conditions. The data generation mechanism ensured the simulated data had seasonality, as is the case with the real-world data on fire frequency, temperature, and rainfall. Further, the study examined the effect of the additional components on the prediction intervals of the simulation studies for the different count models. The introduction of Bayesian techniques into the modeling was intended to create more accurate prediction intervals that take account of the prior distribution of the data. The Bayesian Negative Binomial model was better than the Negative Binomial model in terms of model bias. When validated on real data to confirm its effectiveness, the Bayesian model had a better MASE, and its prediction intervals enveloped the actual data in the testing dataset of fires in Kenya between 2000 and 2018.