Leveraging Machine Learning in housing price prediction in Nairobi county

Date
2023
Authors
Nduati, J. W.
Publisher
Strathmore University
Abstract
Housing prices have in recent years dominated social and economic discussions in both developed and developing countries, including Kenya. The literature shows that housing price modelling in Kenya is predominantly based on conventional statistical theory and methodologies, which are becoming increasingly sub-optimal in the wake of big data and technological advancements. To fill this gap, this research leveraged Machine Learning (ML) for housing price modelling and prediction in Nairobi County. The project sought to achieve four objectives: 1) to determine the features that significantly influence housing prices in Nairobi County; 2) to compare different ML models and techniques used to predict housing prices in Nairobi County; 3) to predict housing prices in Nairobi County using trained ML models on unseen data in production; and 4) to make policy recommendations on the determination of housing prices. Primary data were collected from homeowners and potential homeowners in Shauri Moyo and Kibra. In addition, Key Informant Interviews (KII) were conducted with respondents from the State Department for Housing. The survey achieved a response rate of 85.3% and established that participants strongly agreed that the following factors, among others, significantly determined housing prices: presence of a tarmacked road within 5 kilometres (km), an electricity connection or generator, a hospital within 5 km, a school within 5 km, internet connectivity, and the age of the house. Secondary data, on the other hand, were retrieved from Property24.co.ke. Fourteen ML and Deep Learning models were trained, optimized and tested using two evaluation metrics: Root Mean Squared Error (RMSE) and R-Squared. Insights from the secondary data showed that the number of bedrooms, bathrooms and parking lots, together with the location and type of the house, accounted for at least 88% of the variation in the predicted house sale price in half of the ML models.
The best-performing ML candidate was the Light Gradient Boosting Machine (LightGBM), with an RMSE of 11.21635 and an R-Squared score of 88.65%. The worst-performing was the Elastic Net model, with an RMSE of 15.405066 and an R-Squared score of 78.59%. The four models with the best predictive accuracy were used to predict housing prices in Nairobi County from real-world data points drawn from 9 random locations, resulting in predictions with minimal disparities. The study recommended that property developers and policy makers, as stakeholders in the housing sector, allocate resources towards consistent and efficient collection, storage and sharing of quality data on the housing features that significantly influence housing prices in Nairobi County. Additionally, the study advocated for data-oriented Government policies in the housing sector and the implementation of new-age technologies such as AI/ML for efficient modelling and prediction of housing prices. Keywords: Machine learning; price prediction; supervised learning; housing features.
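For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how RMSE and R-Squared are conventionally computed. It is not the thesis code: the function names and the sample prices are hypothetical and chosen only for illustration, and the thesis may have used a library implementation instead.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared residual."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def r_squared(actual, predicted):
    """R-Squared: share of the variance in `actual` explained by `predicted`."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # total sum of squares
    if ss_tot == 0:
        raise ValueError("R-Squared is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical sale prices (arbitrary units), NOT data from the thesis.
actual_prices = [10.0, 20.0, 30.0, 40.0]
predicted_prices = [12.0, 18.0, 33.0, 37.0]

example_rmse = rmse(actual_prices, predicted_prices)
example_r2 = r_squared(actual_prices, predicted_prices)
print(f"RMSE = {example_rmse:.4f}, R-Squared = {example_r2:.2%}")
```

A lower RMSE and a higher R-Squared both indicate a closer fit, which is the sense in which the abstract ranks LightGBM above Elastic Net.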
Description
Full-text thesis
Citation
Nduati, J. W. (2023). Leveraging Machine Learning in housing price prediction in Nairobi county [Strathmore University]. http://hdl.handle.net/11071/13407