MSc.SS Theses and Dissertations (2020)

Recent Submissions

  • Item
    Forecasting Kenya’s GDP using a hybrid neural network and ARIMA model
    (Strathmore University, 2020-03) Ngige, Isabel Wanjiru
    Background: Gross Domestic Product (GDP) is the market value of goods and services produced within a selected geographical area, usually a country, in a selected interval of time, often a year. It can be measured and forecasted in different ways for use by governments and other market participants. Specific users of information on GDP analysis include the United Nations' Sustainable Development Goal assessment, whose key indicator is economic growth as measured by GDP, and the joint International Monetary Fund-World Bank methodology for conducting standardized debt-sustainability analyses in low-income countries. Objective: The main objective of this study was to assess the superiority, as suggested by the literature, of a hybrid Autoregressive Integrated Moving Average (ARIMA) and feed-forward Artificial Neural Network (ANN) model over a pure ARIMA model in forecasting Kenya's GDP. Methods: The ARIMA and the additive ANN-ARIMA hybrid models are used to forecast absolute GDP values, and the comparative forecast accuracy is tested using the RMSE and visualization plots. The Box-Jenkins methodology is used to fit the ARIMA model, while the feed-forward Neural Network Autoregressive (NNAR) structure is used to model the neural network portion of the hybrid model.
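The additive hybrid described in this abstract is commonly written in the following form. This is a sketch of the usual additive ARIMA-ANN formulation, not necessarily the exact specification used in the thesis; the symbols $L_t$, $N_t$, $f$, $p$ and $h$ are notation assumed here for illustration.

```latex
% Series split into a linear part L_t and a nonlinear part N_t
y_t = L_t + N_t
% Step 1: fit ARIMA to y_t and take its residuals
e_t = y_t - \hat{L}_t
% Step 2: model the residuals with a feed-forward network f on p lagged residuals
e_t = f(e_{t-1}, \dots, e_{t-p}) + \varepsilon_t
% Step 3: combine the two forecasts additively
\hat{y}_t = \hat{L}_t + \hat{N}_t, \qquad \hat{N}_t = \hat{f}(e_{t-1}, \dots, e_{t-p})
% Forecast accuracy over a hold-out window of length h
\mathrm{RMSE} = \sqrt{\frac{1}{h} \sum_{t=1}^{h} \left( y_t - \hat{y}_t \right)^2}
```

Under this formulation the pure ARIMA model is the special case $\hat{N}_t = 0$, so comparing the two RMSE values on the same hold-out window shows whether the residual network adds predictive value.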
  • Item
    Analysis of recurrent events with associated informative censoring: application to HIV data
    (Strathmore University, 2020-06) Ejoku, Jonathan
    In this study, we adapt a commonly used Cox-based model for recurrent events, the Prentice, Williams and Peterson Total-Time (PWP-TT) model, which has largely been used under the assumption of non-informative censoring, and evaluate it under an informative censoring setting. Empirical evaluation was undertaken with the aid of the semi-parametric framework for recurrent events suggested by Huang and Wang (2004), where a subject-specific latent variable is used to model the association between the recurrent event and the hazard of the failure time. All implementations were made in RStudio, using the reReg package (Chiou and Huang, 2019) with the method in the reReg function set to 'cox.HW'. For validation we used HIV data from a typical HIV care setting in Kenya. Results show that the PWP-TT model generally fit the data well, with a comparison to the Andersen-Gill method showing similar estimates, while the ordinary Cox model estimates were too unreliable.
  • Item
    Assessing efficient odds ratios: an application to surgical stage prediction in cervical cancer
    (Strathmore University, 2020) Jesang, Jean C.
    Background: Cervical cancer remains the second most commonly diagnosed cancer and the third leading cause of cancer death in developing countries. Improving clinicians' knowledge and understanding of surgical staging is critical in the fight against the disease. Kenya has limited research on accurately predicting the surgical stage following surgical treatment for cervical cancer. The uptake of predictive mechanisms by gynecologists has not been common. Objective: To assess prediction by comparing the odds ratios of three popular ordinal regression models, i.e. the Multinomial Logistic Regression (MLR) model, the Continuation Ratio (CR) model and the Adjacent Category Logistic (ACL) model, when applied to cervical cancer data in surgical stage prediction. Method: We systematically compared the performance of the MLR, CR and ACL models as predictive mechanisms and evaluated the most appropriate model in the cervical cancer setting. The study considered women who visited the Oncology department at the Moi Teaching and Referral Hospital's Chandaria Cancer and Chronic Diseases Center and were diagnosed and surgically treated for cervical cancer from January 2014 to December 2018. Results and conclusion: We presented the comparison between three different regression models for ordinal data within the cervical cancer setting. We chose to carry out both an inferential and a predictive approach. The inferential approach found that the CR model without proportional odds yielded better results when comparing the Akaike Information Criterion (AIC), log likelihood ratio and residual deviance. In addition, the key prognostic factor associated with invasive cervical cancer was the FIGO clinical stage, which, in particular, had a higher influence on the surgical stage 2 outcomes compared to the lesser surgical stage categories. All the 5 independent features selected for classifying the patients into surgical stages were the FIGO clinical stage and, partly, the presence or absence of cancer of symptomatic vaginal discharge. However, the predictive approach found that the MLR, CR and ACL models were not statistically different and not suitable for the prediction of the surgical stage among the women surgically treated for cervical cancer.
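For readers comparing the three models named in this abstract, their link functions can be written as follows. This is a sketch in one common parameterization; the abstract does not state which convention the thesis uses, and the symbols $J$ (number of surgical stage categories), $x$, $\alpha_j$ and $\beta_j$ are notation assumed here.

```latex
% Multinomial (baseline-category) logit, j = 1, ..., J-1
\log \frac{P(Y = j \mid x)}{P(Y = J \mid x)} = \alpha_j + x^{\top} \beta_j
% Continuation ratio logit: stopping at category j versus moving beyond it
\log \frac{P(Y = j \mid x)}{P(Y > j \mid x)} = \alpha_j + x^{\top} \beta_j
% Adjacent category logit: category j+1 versus category j
\log \frac{P(Y = j + 1 \mid x)}{P(Y = j \mid x)} = \alpha_j + x^{\top} \beta_j
```

With category-specific slopes $\beta_j$, as written, the CR and ACL models do not impose proportional odds; the proportional-odds versions constrain $\beta_j = \beta$ for all $j$. In each case $\exp(\beta_j)$ is the odds ratio for the corresponding pair of outcomes, which is why odds ratios from the three models are not directly interchangeable.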
  • Item
    The Zero Inflated Negative Binomial - Shanker distribution and its application to HIV exposed infant data
    (Strathmore University, 2020) Kibika, Stella Andia
    Motivated by HIV exposed infants (HEI) sero-conversion data, we provide an extension of the Zero-Inflated Negative Binomial (ZINB) distribution to the Zero-Inflated Negative Binomial-Shanker (ZINB-SH) distribution. We review the classical Poisson and negative binomial distributions as applied to count data, and their zero-inflated versions. After reviewing the conceptual and computational features of these methods, we generate a new extension which is intrinsically a combination of the Zero-Inflated Negative Binomial and Shanker distributions. In this setting, the ZINB-SH distribution provides an alternative to the Poisson-Shanker distribution, in particular when data exhibit overdispersion brought about by excess zeros. The HIV exposed infant data are characterized by both structured and non-structured zeroes, which makes the feature ideal in this context. We describe the properties of the ZINB-SH distribution and estimate its parameters. Extensive simulations were conducted and the results, in terms of goodness-of-fit, were compared to the standard Negative Binomial, Shanker, Zero-Inflated Negative Binomial and Negative Binomial-Shanker distributions. The ZINB-SH distribution is competitive under different simulation settings and does well as sample size increases. To validate the distribution, we apply it to real, typical HIV exposed infant data.
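As background to the zero-inflation idea in this abstract, the sketch below implements the standard ZINB probability mass function only. It does not implement the ZINB-SH extension, whose exact form is not given in the abstract. The function names and the parameters `pi`, `r` and `p` are assumptions chosen for illustration.

```python
import math


def nb_pmf(y, r, p):
    """Negative binomial pmf: probability of y failures before the r-th success.

    r > 0 may be non-integer; 0 < p < 1 is the success probability.
    """
    log_coef = math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
    return math.exp(log_coef + r * math.log(p) + y * math.log1p(-p))


def zinb_pmf(y, pi, r, p):
    """Zero-inflated NB: a structural zero with probability pi, otherwise NB(r, p)."""
    base = (1.0 - pi) * nb_pmf(y, r, p)
    # Zeros arise from both sources: the structural part and the NB count part.
    return pi + base if y == 0 else base


# Illustrative (hypothetical) parameter values, not estimates from the thesis.
pi, r, p = 0.3, 2.5, 0.4
total = sum(zinb_pmf(y, pi, r, p) for y in range(500))
```

The two sources of zeros in `zinb_pmf` correspond to the structured and non-structured zeroes mentioned in the abstract: `pi` covers subjects who can only produce a zero, while the NB term covers zeros that occur by chance.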
  • Item
    Identifying the best method to correct for missing data, a case of HIV/TB co-infection in Kenya
    (Strathmore University, 2020) Mwaro, Joshua Owuori
    Having missing information is almost inevitable in research, but many researchers only report on complete cases. Here we review missing data theory, missingness characteristics, the background information, the importance of studying missing data and the most common ways of correcting for missing data, then extend these to the Kenyan HIV/TB co-infection setting. We review most of the existing methods of dealing with missing data and what other scholars have done in the missing data area. In the methodology section, we outline the characteristics and features of the four methods for dealing with missing data on which our study is focused: analysis of complete cases only, the mean/single imputation method, the MLE method, and the Multiple Imputation (MI) method. We also test the four methods on simulated data, then apply the same procedure to the real Kenyan HIV/TB co-infection data. Results showed that analysis of data corrected for missingness using complete cases only, the weighted method, the likelihood-based method, and multiple imputation estimated the Kenyan HIV/TB co-infection rate to be 29%, 27%, 26%, and 21% respectively. The results indicate that MI is the best approach to correct for missing data, as it does not overestimate the HIV and TB co-infection rate.
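Two of the simpler methods named in this abstract, complete-case analysis and mean (single) imputation, can be sketched in a few lines. This is a minimal illustration on hypothetical toy data, not the thesis's procedure or data; the function names and the `status` list are assumptions made for the example.

```python
from statistics import mean, pvariance


def complete_cases(values):
    """Drop missing entries (None) and return only the observed values."""
    return [v for v in values if v is not None]


def mean_impute(values):
    """Replace each missing entry with the mean of the observed values."""
    fill = mean(complete_cases(values))
    return [fill if v is None else v for v in values]


# Hypothetical toy data: 1 = co-infected, 0 = not co-infected, None = missing.
status = [1, 0, 0, None, 1, 0, None, 0, 0, 1]

observed = complete_cases(status)
imputed = mean_impute(status)

rate_cc = mean(observed)        # complete-case estimate of the rate
rate_imputed = mean(imputed)    # estimate after mean imputation
var_cc = pvariance(observed)
var_imputed = pvariance(imputed)
```

On data like this, mean imputation leaves the estimated rate unchanged relative to the complete-case estimate but shrinks the variance, since every filled-in value sits exactly at the mean. That understated uncertainty is one general reason likelihood-based methods and multiple imputation are often preferred.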