Comparison of neural networks and tree-based ensemble methods in detecting correlates of breast cancer survival
Date
2022
Authors
Katam, Ruth Jepchirchir
Publisher
Strathmore University
Abstract
Breast cancer is common among women, affecting about 2.1 million women each year and causing a large number of cancer-related deaths. Doctors often struggle to diagnose the stage accurately and to determine the needed medication. Therefore, accurate detection of correlates of breast cancer survival is paramount. This study sought to compare the performance of Neural Networks and Tree-based Ensemble methods in predicting breast cancer survival, and to elucidate factors causing breast cancer from clinical data for timely intervention. The accuracy score, recall score, precision score, Area Under the Receiver Operating Characteristic Curve, and F1 score were used to evaluate the performance of each model in discriminating between breast cancer survivors and non-survivors. XGBoost and LSTM exhibited outstanding performance in the classification of breast cancer patients; however, XGBoost was the best-performing model. The results showed that age at diagnosis, PAM50 + claudin-low subtype HER2, 3-gene classifier subtype high profile, radiotherapy, Nottingham prognostic index, type of breast surgery (breast conserving), type of breast surgery (mastectomy), mutation count, lymph nodes examined positive, tumor stage, tumor size, 3-gene classifier subtype low profile, pre inferred menopausal state, and post inferred menopausal state, among others, were the most important correlates of survival from breast cancer.
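As an illustration of the five evaluation metrics named in the abstract, the following is a minimal sketch in plain Python. It is not the thesis code: the function name `classification_metrics`, the 0.5 decision threshold, the coding of label 1 as the positive class, and the toy labels and scores are all assumptions made for this example.

```python
def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, precision, recall, F1 and ROC AUC for binary labels.

    Assumption: label 1 is the positive class and y_score is the
    model's predicted probability (or score) for that class.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))

    # Confusion-matrix counts.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # ROC AUC as the probability that a randomly chosen positive case
    # scores higher than a randomly chosen negative case (ties count half).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if pos and neg:
        wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
                   for p in pos for n in neg)
        auc = wins / (len(pos) * len(neg))
    else:
        auc = float("nan")  # undefined when only one class is present

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "roc_auc": auc}


# Toy data, invented for illustration only (not from the thesis).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.65, 0.4, 0.55, 0.1, 0.8, 0.3]
metrics = classification_metrics(y_true, y_score)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas the ROC AUC is computed from the raw scores and so summarizes ranking quality across all thresholds.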
Description
Submitted in partial fulfillment of the requirements for the degree of Master of Science in Statistical Science of Strathmore University