MSc.SS Theses and Dissertations (2021)
Browsing MSc.SS Theses and Dissertations (2021) by Subject "Sparse Regression Models"
- Predictive modeling of Logistics Performance Index using Sparse Regression Models (Strathmore University, 2021) Odok, Eric Oyenga
The Logistics Performance Index (LPI), developed by the World Bank, is the only interactive benchmarking tool countries use to identify challenges and opportunities in trade logistics. It was developed using Principal Component Analysis and is the mean of highly correlated variable scores. This poses two major problems: mean-based measures are susceptible to outliers, and multicollinearity among the predictors leads to overfitting. It is therefore critical to choose prediction techniques carefully. Regression is one of many techniques that can reliably predict the LPI. This paper assessed four regression models on a median-computed LPI, which is less vulnerable to outliers: the multiple linear regression model (MLRM), the ridge regression model, the elastic net model and the LASSO model. The first observation was that the mean-computed and median-computed LPIs were not different; in prediction, both overfitted on the test data. The mean-computed LPI, however, overfitted more than the median-computed LPI. MLRM used all six variables to produce the best fit to the training set (RMSE = 0.0497, AIC = -952); however, when tested on unseen data, it was the least precise (RMSE = 0.0438). LASSO, on the other hand, did not fit the training set well (RMSE = 0.3627, AIC = -318) but was the most precise predictive model (RMSE = 0.0436). LASSO, through variable shrinkage and selection, eliminated one irrelevant variable, timeliness. The two models were not significantly different (P = 0.2951, at 95% CI); the value added by LASSO was parsimony. While MLRM used all six variables, LASSO used five to generate a similar model. Policymakers could reliably use the top three variables, which explained 80% of the variability in the model: logistics quality, infrastructure and tracking.
Improving the physical infrastructure, increasing logistics management skills, and implementing intelligent technologies could improve trade competitiveness.
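
The shrinkage-and-selection behaviour described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not state the software, data or tuning used. The sketch assumes a LASSO objective of the form (1/(2n))·||y − Xb||² + alpha·||b||₁, solved by coordinate descent without an intercept. The three-predictor data set, the penalty value `alpha = 0.1`, and the function names `soft_threshold` and `lasso_coordinate_descent` are all invented for illustration. The predictors are deliberately orthogonal so that the result is easy to verify by hand, which means the example shows variable elimination only, not the multicollinearity present in the real LPI components.

```python
def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values inside [-alpha, alpha] become exactly 0.0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimise (1/(2n)) * ||y - Xb||^2 + alpha * ||b||_1 (no intercept).

    With alpha = 0 the penalty vanishes and the updates converge to the
    ordinary least-squares fit.
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the partial residual
            z = 0.0    # sum of squares of feature j
            for i in range(n):
                # Prediction from every feature except j.
                pred_without_j = sum(X[i][k] * coef[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            coef[j] = soft_threshold(rho / n, alpha) / (z / n)
    return coef


# Hypothetical toy data (NOT from the thesis): three centred, mutually
# orthogonal predictors with a strong, a moderate and a very weak effect.
X = [
    [1.0, 1.0, 1.0],
    [-1.0, 1.0, -1.0],
    [1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
# y = 2.0 * x1 + 1.0 * x2 + 0.05 * x3, computed by hand.
y = [3.05, -1.05, 0.95, -2.95]

ols_coef = lasso_coordinate_descent(X, y, alpha=0.0)    # unpenalised fit keeps all three
lasso_coef = lasso_coordinate_descent(X, y, alpha=0.1)  # penalised fit drops the weak one

print("unpenalised:", ols_coef)
print("LASSO      :", lasso_coef)
```

In this toy setting the unpenalised fit keeps a small non-zero coefficient on the weak predictor, while the penalised fit shrinks the two larger coefficients by `alpha` and sets the weak one to exactly zero. This is the same kind of parsimony the abstract reports when LASSO retains five of the six variables.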