Predictive modeling of Logistics Performance Index using Sparse Regression Models

dc.contributor.author: Odok, Eric Oyenga
dc.date.accessioned: 2022-06-13T09:01:09Z
dc.date.available: 2022-06-13T09:01:09Z
dc.date.issued: 2021
dc.description: A Research Thesis Submitted to the Graduate School in partial fulfillment of the requirements for the Award of Master of Science Degree in Statistical Sciences at Strathmore University [en_US]
dc.description.abstract: The Logistics Performance Index (LPI), developed by the World Bank, is the only interactive benchmarking tool countries use to identify challenges and opportunities in trade logistics. It was developed using Principal Component Analysis and is a mean of highly correlated variable scores. This poses two major problems: measures computed from means are susceptible to outliers, and multicollinearity among the predictors leads to overfitting. It is therefore critical to choose prediction techniques carefully. Regression is one of many techniques that can reliably predict the LPI. This paper assessed four regression models using a median-computed LPI, which is less vulnerable to outliers: the multiple linear regression model (MLRM), the ridge regression model, the elastic net model and the LASSO model. The first observation was that the mean-computed and median-computed LPIs were not different; in prediction, both overfitted on the test data, although the mean-computed LPI overfitted more than the median-computed LPI. MLRM used all six variables to produce the best fit to the training set (RMSE = 0.0497, AIC = −952); however, when tested on unseen data, it was the least precise (RMSE = 0.0438). LASSO, on the other hand, did not fit the training set well (RMSE = 0.3627, AIC = −318) but was the most precise predictive model (RMSE = 0.0436). Through variable shrinkage and selection, LASSO eliminated one irrelevant variable, timeliness. The two models were not significantly different (P = 0.2951 at the 95% confidence level); the value added by LASSO was parsimony. While MLRM used all six variables, LASSO used five to generate a similar model. Policymakers could reliably use the top three variables, which explained 80% of the variability in the model: logistics quality, infrastructure and tracking. Improving physical infrastructure, increasing logistics management skills, and implementing intelligent technologies could improve trade competitiveness. [en_US]
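The abstract's point that a mean-computed index is more susceptible to outliers than a median-computed one can be illustrated with a minimal sketch. The component names and scores below are hypothetical values chosen for illustration, not data from the thesis, and the unweighted mean is a simplification of how the index is actually aggregated.

```python
from statistics import mean, median

# Hypothetical scores for six components on a 1-5 scale.
# These are illustrative values, NOT data from the thesis.
scores = {
    "customs": 3.0,
    "infrastructure": 3.1,
    "international_shipments": 3.2,
    "logistics_quality": 3.3,
    "tracking": 3.4,
    "timeliness": 3.5,
}


def aggregate(component_scores):
    """Return (mean, median) of the component scores."""
    values = list(component_scores.values())
    return mean(values), median(values)


base_mean, base_median = aggregate(scores)

# Replace one component with an extreme low score to mimic an outlier.
with_outlier = dict(scores, customs=1.0)
out_mean, out_median = aggregate(with_outlier)

# How far each aggregate moved because of the single outlier.
mean_shift = abs(out_mean - base_mean)
median_shift = abs(out_median - base_median)

print(f"mean shift:   {mean_shift:.4f}")
print(f"median shift: {median_shift:.4f}")
```

In this toy case the single extreme score moves the mean while the median stays where it was, which is the motivation the abstract gives for modeling a median-computed LPI.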
dc.identifier.uri: http://hdl.handle.net/11071/12820
dc.language.iso: en [en_US]
dc.publisher: Strathmore University [en_US]
dc.subject: Predictive modeling [en_US]
dc.subject: Logistics Performance Index [en_US]
dc.subject: Sparse Regression Models [en_US]
dc.title: Predictive modeling of Logistics Performance Index using Sparse Regression Models [en_US]
dc.type: Thesis [en_US]
Files

Original bundle
Name: Predictive modeling of Logistics Performance Index using Sparse Regression Models.pdf
Size: 1.6 MB
Format: Adobe Portable Document Format
Description: Full-text thesis

Name: Eric Odok - Thesis coverpage.pdf
Size: 747.33 KB
Format: Adobe Portable Document Format
Description: Cover page

License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission