Coronary Heart disease prediction in the USA and factors that favor its occurrence
Date
2021
Authors
Gachanja, Jeremy Kibiru
Publisher
Strathmore University
Abstract
Coronary Heart Disease (CHD) is the leading cause of death among adults in Europe and North
America (WHO, 2017). Early detection and treatment of this disease are thus a matter of life and
death (Gonsalves, Thabtah, Mohammad, & Singh, 2019). This project compared the predictive
power of five machine learning algorithms, namely Support Vector Machine, Naive Bayes,
Logistic Regression, Decision Trees and Neural Networks, in predicting this disease. The objectives
of this study were to determine which of the five algorithms was best suited for CHD prediction
and which levels of the CHD risk factors favored the occurrence of CHD. This study had fourteen
CHD risk factors, namely gender, age, smoking habit, number of cigarettes smoked, use of blood
pressure medication, prevalent stroke, prevalent hypertension, diabetes, total cholesterol, diastolic
and systolic blood pressure, BMI, heart rate, and education. However, this study found that only
age, systolic and diastolic blood pressure, prevalent hypertension, blood pressure medication and
diabetes had a significant correlation with CHD occurrence. This study used these seven CHD risk
factors to model CHD occurrence in the five algorithms. This study found that logistic
regression was best suited for predicting CHD, followed by Naive Bayes, then Decision Trees and
lastly SVM and Neural Networks. This work found that CHD-positive individuals had high
cholesterol (235mm on average), high blood sugar (a maximum of 394mm), a smoking habit (10.82 cigarettes per day on average), an overweight BMI (26.63 on average) and high blood pressure (a maximum of 295/140 mmHg and 143/86 mmHg on average).
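The correlation screening described in the abstract, in which only risk factors correlated with CHD occurrence are kept for modelling, could look roughly like the sketch below. It is an illustration under stated assumptions, not the study's actual procedure: the abstract does not say which correlation measure or significance test was used, so this sketch uses the Pearson coefficient (equivalent to the point-biserial coefficient for a 0/1 outcome) with a plain magnitude cutoff in place of a significance test. The function names, the cutoff `min_abs_r` and the toy records are all hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation; with a 0/1 ys this is the point-biserial coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def screen_factors(columns, outcome, min_abs_r=0.3):
    """Return {factor: r} for factors whose |r| with the outcome meets the cutoff.

    The cutoff is an assumed stand-in for a proper significance test.
    """
    scores = {name: pearson_r(values, outcome) for name, values in columns.items()}
    return {name: r for name, r in scores.items() if abs(r) >= min_abs_r}


# Hypothetical toy records, invented for illustration only (not the study's data).
toy_columns = {
    "age": [39, 46, 48, 61, 63, 43, 52, 67],
    "systolic_bp": [106, 121, 127, 150, 180, 110, 141, 160],
    "heart_rate": [70, 90, 80, 70, 90, 80, 80, 80],
}
toy_chd = [0, 0, 0, 1, 1, 0, 0, 1]  # 1 = CHD positive

selected = screen_factors(toy_columns, toy_chd)
```

On these toy records the screen keeps `age` and `systolic_bp` and drops `heart_rate`, whose values are balanced across the two outcome groups. The retained factors would then be passed to each of the five classifiers for comparison.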
Description
Submitted in partial fulfillment of the requirements for the Degree of Financial Economics at Strathmore University