Predictive modeling of Logistics Performance Index using Sparse Regression Models
dc.contributor.author | Odok, Eric Oyenga | |
dc.date.accessioned | 2022-06-13T09:01:09Z | |
dc.date.available | 2022-06-13T09:01:09Z | |
dc.date.issued | 2021 | |
dc.description | A Research Thesis Submitted to the Graduate School in partial fulfillment of the requirements for the Award of Master of Science Degree in Statistical Sciences at Strathmore University | en_US |
dc.description.abstract | The Logistics Performance Index (LPI), developed by the World Bank, is the only interactive benchmarking tool countries use to identify challenges and opportunities in trade logistics. It was developed using Principal Component Analysis and is the mean of highly correlated variable scores. This poses two major problems: mean-based measures are susceptible to outliers, and multicollinearity in prediction leads to overfitting. It is therefore critical to choose prediction techniques carefully. Regression is one of many techniques that can reliably predict the LPI. This paper assessed four regression models using a median-computed LPI, which is less vulnerable to outliers: the multiple linear regression model (MLRM), the ridge regression model, the elastic net model and the LASSO model. The first observation was that the mean- and median-computed LPIs were not different; in prediction, both overfitted on the test data. The mean-computed LPI, however, overfitted more than the median-computed LPI. MLRM used all six variables to produce the best fit to the training set (RMSE = 0.0497, AIC = −952); however, when tested on unseen data, it was the least precise (RMSE = 0.0438). LASSO, on the other hand, did not fit the training set well (RMSE = 0.3627, AIC = −318) but was the most precise predictive model (RMSE = 0.0436). Through variable shrinkage and selection, LASSO eliminated one irrelevant variable, timeliness. The two models were not significantly different (P = 0.2951 at the 95% confidence level); the value added by LASSO was parsimony. While MLRM used all six variables, LASSO used five to generate a similar model. Policymakers could reliably use the top three variables, which explained 80% of the variability in the model: logistics quality, infrastructure and tracking. Improving physical infrastructure, increasing logistics management skills, and implementing intelligent technologies could improve trade competitiveness. | en_US |
dc.identifier.uri | http://hdl.handle.net/11071/12820 | |
dc.language.iso | en | en_US |
dc.publisher | Strathmore University | en_US |
dc.subject | Predictive modeling | en_US |
dc.subject | Logistics Performance Index | en_US |
dc.subject | Sparse Regression Models | en_US |
dc.title | Predictive modeling of Logistics Performance Index using Sparse Regression Models | en_US |
dc.type | Thesis | en_US |
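The abstract credits LASSO's "variable shrinkage and selection" with removing an irrelevant predictor. The sketch below illustrates that mechanism only; it is not the thesis's implementation, which this record does not describe. It assumes a plain coordinate-descent solver with soft-thresholding, synthetic data with one relevant and one irrelevant feature (rather than the six LPI components), and an arbitrary penalty `lam`. All function and variable names are hypothetical.

```python
import random


def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; return exactly 0.0 when |rho| <= lam."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(columns, y, lam, n_iter=100):
    """Minimise (1/2n) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    `columns` is a list of centred feature columns and `y` is centred.
    """
    n = len(y)
    beta = [0.0] * len(columns)
    col_sq = [sum(x * x for x in col) / n for col in columns]
    resid = list(y)  # residual y - X b, with b = 0 initially
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(x * (r + x * beta[j]) for x, r in zip(col, resid)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                resid = [r - x * delta for x, r in zip(col, resid)]
                beta[j] = new_beta
    return beta


# Synthetic illustration (assumed data, not LPI scores).
random.seed(0)
n = 200
relevant = center([random.gauss(0, 1) for _ in range(n)])
irrelevant = center([random.gauss(0, 1) for _ in range(n)])
y = center([2.0 * x + random.gauss(0, 0.1) for x in relevant])

beta = lasso_coordinate_descent([relevant, irrelevant], y, lam=0.1)
print(beta)
```

In this toy setting the coefficient of the irrelevant feature is set exactly to zero while the relevant one is only shrunk slightly, which is the parsimony effect the abstract attributes to LASSO. Ridge regression, by contrast, shrinks coefficients without setting them to zero.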
Files
Original bundle
- Name:
- Predictive modeling of Logistics Performance Index using Sparse Regression Models.pdf
- Size:
- 1.6 MB
- Format:
- Adobe Portable Document Format
- Description:
- Full-text thesis
- Name:
- Eric Odok - Thesis coverpage.pdf
- Size:
- 747.33 KB
- Format:
- Adobe Portable Document Format
- Description:
- Cover page
License bundle
- Name:
- license.txt
- Size:
- 1.71 KB
- Format:
- Item-specific license agreed upon to submission
- Description: