SU+ @ Strathmore University Library
Electronic Theses and Dissertations

This work is availed for free and open access by Strathmore University Library. It has been accepted for digital distribution by an authorized administrator of SU+ @Strathmore University. For more information, please contact library@strathmore.edu

Recommended Citation
Korir, P. J. (2024). Utilizing Convolution Neural Networks for enhanced lung cancer classification through CT scan analysis [Strathmore University]. http://hdl.handle.net/11071/15648

Utilizing Convolution Neural Networks for Enhanced Lung Cancer Classification Through CT Scan Analysis

Korir, Phylis Jepchumba

Submitted in partial fulfilment of the requirements for the degree of Master of Science in Data Science and Data Analytics

Strathmore Institute of Mathematical Sciences
Strathmore University
Nairobi, Kenya
June 2024

This thesis is available for Library use through open access on the understanding that it is copyright material and that no quotation from the thesis may be published without proper acknowledgement.

Declaration

I declare that this work has not been previously submitted and approved for award of a degree by this or any other University. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made in the thesis itself.

© No part of this thesis may be reproduced without the permission of the author and Strathmore University.

Name: Korir Phylis Jepchumba
Signature:
Date: April 4, 2024

Approval

The thesis of Korir Phylis Jepchumba was reviewed and approved by the Supervisors.

Prof. Dr. Javier Serrano
Autonomous University of Barcelona.
Signature:
Date: April 4, 2024

Prof. Samuel M. Mwalili
Strathmore University.
Signature:
Date: April 4, 2024

Abstract

Lung cancer is the leading cause of cancer mortality, and accurate, timely diagnosis remains a significant challenge, especially in resource-constrained regions like Kenya. The traditional method of diagnosing lung cancer through Computed Tomography (CT) scans often involves manual interpretation, leading to potential delays and inaccuracies.
This research aims to harness the power of Artificial Intelligence (AI) to improve the diagnostic process. The study developed a Convolution Neural Network (CNN) model for enhanced classification of lung cancer from CT scan images by fine-tuning the pre-trained ResNet50 architecture. Using PyTorch, a leading deep learning framework for computer vision, the model was trained on a curated dataset from the public Lung Image Database Consortium (LIDC), a medical imaging database for the development, training, and evaluation of computer-assisted diagnostic (CAD) methods for lung cancer detection and diagnosis. The collected CT scan images include lung cancer types such as adenocarcinoma, squamous cell carcinoma, and large cell carcinoma, as well as normal tissue. Pre-processing techniques such as resizing, normalization, conversion, and data augmentation were used to ensure compatibility with the pre-trained model. The model's performance was evaluated with a range of metrics, demonstrating an accuracy of 87.5%, a precision of 80.97%, and an F1 score of 77.4%. These results indicate a promising capability for the model to accurately classify types of lung cancer, supporting its potential use in clinical settings. The fine-tuned model was then integrated into a web-based application using the Flask framework, with a frontend built in Vue.js to provide an intuitive image upload experience. The Flask API handles communication between the frontend and the ResNet50-based machine learning model. When a CT scan image is uploaded, it is sent to the Flask backend as an HTTP request. The Flask application processes the request, extracts the image data, and prepares it for the ResNet50 model, which classifies the image and returns the result.
Table of contents

List of figures
List of tables
List of abbreviations
Acknowledgement
Dedication
1 Introduction
  1.1 Background to the Study
    1.1.1 Introduction to Lung cancer
    1.1.2 Medical Imaging in lung Cancer Diagnosis
    1.1.3 Challenges in CT scan Analysis
    1.1.4 Role of Artificial Intelligence in healthcare
  1.2 Statement of the Problem
  1.3 Research Objectives
    1.3.1 General Objective
    1.3.2 Specific Objectives
  1.4 Research Questions
  1.5 Significance of the Study
  1.6 Scope of the Study
  1.7 Limitations of the Study
2 Literature Review
  2.1 Introduction
  2.2 Theoretical Review
    2.2.1 Theory of Convolutional Neural Networks (CNNs)
    2.2.2 Theory of Pattern Recognition and Feature Extraction
    2.2.3 Transfer Learning and Fine-Tuning Theory
  2.3 Empirical review
    2.3.1 Related works
  2.4 Research Gaps
3 Methodology
  3.1 Introduction
  3.2 Data Understanding
  3.3 Image Pre-processing
  3.4 Convolution Neural Network Architecture
    3.4.1 Convolution layers
    3.4.2 Filters/Kernels
    3.4.3 Stride and Padding
    3.4.4 Activation Function
    3.4.5 Pooling layers
    3.4.6 Fully connected layer
  3.5 Proposed Convolutional Neural Network: ResNet50
  3.6 Model Training
    3.6.1 Loss Function
    3.6.2 Loss Optimization
  3.7 Evaluation of Model Performance
  3.8 Model Deployment
4 System Design and Architecture
  4.1 Introduction
  4.2 System Requirements
    4.2.1 Model API Requirements
  4.3 Overview of System Architecture
  4.4 Frontend Development
    4.4.1 User Interface Design
    4.4.2 Image Upload Functionality
    4.4.3 Error Handling and User Feedback
  4.5 Backend Development
    4.5.1 API Integration for Machine Learning Model
    4.5.2 Handling Image Processing Requests
5 System Implementation and Testing
  5.1 Introduction
  5.2 User Interface Design
  5.3 Image Upload Process
  5.4 Image Upload Validation
  5.5 Classification of Results
6 Discussion of Results
  6.1 Introduction
  6.2 Image Pre-processing
    6.2.1 Normalization
    6.2.2 Resizing
    6.2.3 CT-Scan images Exploration
  6.3 Model Analysis and Performance Metrics
    6.3.1 Summary of Model Performance
    6.3.2 Training and Validation Loss
    6.3.3 Accuracy
    6.3.4 Precision
    6.3.5 Recall
    6.3.6 F1 Score
  6.4 System Design and Deployment
7 Conclusions, Recommendations and Future Work
  7.1 Conclusion
  7.2 Recommendations
  7.3 Future Works
References
Appendix A Ethical approval
Appendix B Similarity Index
Appendix C Python code
  C.1 Model Preparation Code
  C.2 API Code
  C.3 User Interface/ Frontend Code

List of figures

Figure 3.1: Dataset Images
Figure 3.2: Overview of CNN Architecture
Figure 3.3: ReLu Activation Function
Figure 4.1: Overview of System Architecture
Figure 5.1: User Interface Design
Figure 5.2: Image Upload Process
Figure 5.3: Validating Image File Formats for Upload
Figure 6.1: CT scan images of four distinct classes: normal lung, adenocarcinoma, squamous cell carcinoma, and large cell carcinoma types.
Figure 6.2: Training and Validation Loss
Figure 6.3: Accuracy
Figure 6.4: Precision
Figure 6.5: Recall
Figure 6.6: F1 Score

List of tables

Table 3.1: Parameters of the ResNet50 Model for Lung Cancer Classification
Table 6.1: Training Performance Metrics
Table 6.2: Test Performance Metrics

List of abbreviations

AI      Artificial Intelligence
API     Application Programming Interface
CT      Computed Tomography
CNN     Convolution Neural Network
DL      Deep Learning
PET     Positron Emission Tomography
MRI     Magnetic Resonance Imaging
ResNet  Residual Network
ReLU    Rectified Linear Unit
LIDC    Lung Image Database Consortium
CAD     Computer-Assisted Diagnostic

Acknowledgement

First and foremost, I offer my sincerest gratitude to God for His unwavering guidance and strength throughout this journey. It is through His grace that I have been able to complete this project.

I am deeply grateful to my esteemed supervisor and lecturer, Dr. John Olukuru, for his guidance, valuable advice, and oversight throughout this journey. I would also like to convey my heartfelt appreciation to my supervisor, Prof. Dr. Javier Serrano, for his important support and supervision throughout this research. His experience and insights have helped shape the direction and success of my research.

My gratitude extends to Strathmore University, particularly the Institute of Mathematical Sciences (SIMS), and the @iLabAfrica research centre. Their commitment to delivering world-class education and imparting practical and technical knowledge has been essential in enriching my academic and professional growth.

Dedication

This thesis is dedicated to my parents, who were instrumental in molding my educational path. Their steadfast support and motivation throughout my early years were the pillars of my academic success. I would like to give heartfelt thanks to my mother, whose limitless love, support, and prayers were the driving forces behind my persistence. Additionally, I am profoundly grateful to my supervisor, Prof. Dr. Javier Serrano, for his invaluable guidance, and to my cousin Peter Kosgei, who has been a relentless source of inspiration and encouragement during this academic journey.
Chapter 1
Introduction

1.1 Background to the Study

1.1.1 Introduction to Lung cancer

Lung cancer is one of the most prevalent and deadly forms of cancer worldwide (Thandra et al., 2021), primarily caused by the uncontrolled growth of abnormal cells in lung tissues. This malignancy arises from several risk factors, with cigarette smoking being the most prominent contributor, accounting for a significant majority of cases (Bade and Cruz, 2020). Exposure to secondhand smoke, occupational carcinogens like asbestos and radon, genetic susceptibility, and preexisting lung conditions also contribute to its onset. Lung cancer is the second most commonly diagnosed cancer globally. The disease encompasses two main types: non-small cell lung cancer (NSCLC) and small cell lung cancer (SCLC). NSCLC, including subtypes like adenocarcinoma, squamous cell carcinoma, and large cell carcinoma, constitutes the majority of cases (Yin et al., 2022).

From a global perspective, lung cancer's impact on public health is substantial. According to the World Health Organization (WHO), lung cancer accounted for 11.4% of all new cancer cases globally in 2020, and it was responsible for 18.0% of all cancer-related deaths, making it the leading cause of cancer mortality. Leveraging advanced technologies like Convolution Neural Networks (CNNs) for CT scan analysis could potentially lead to higher early detection rates and reduced lung cancer-related mortality (Sung et al., 2021). As medical imaging data availability and computational capabilities grow, integrating artificial intelligence techniques becomes increasingly pertinent in refining lung cancer diagnosis.

In Kenya, as in many other low- and middle-income countries, lung cancer poses specific challenges due to limited resources and access to advanced medical facilities.
The Kenya National Cancer Control Strategy 2017-2022 highlights the escalating concern of cancer in the country, with lung cancer being a significant contributor to cancer-related mortality (Schaefers et al., 2022). Implementing innovative technologies such as Convolution Neural Networks in CT scan analysis offers the potential to enhance diagnostic accuracy, especially in regions with constrained resources. By automating the analysis process, a CNN model could help medical professionals identify potential lung cancer cases earlier, potentially leading to improved survival rates.

1.1.2 Medical Imaging in lung Cancer Diagnosis

A computed tomography (CT) scan is a medical imaging procedure that uses X-rays and computer processing to produce detailed cross-sectional images of the body's internal structures (Cui et al., 2023). These images, often referred to as slices or sections, offer a comprehensive view of the anatomical features within the scanned region. CT scans play an essential role in the detection, diagnosis, and monitoring of various medical conditions, including lung cancer (Al-Sharify et al., 2020).

In the context of lung cancer, the primary functions of CT scans are early detection, detailed visualization of the lungs, and the identification of suspicious nodules or masses that may indicate the presence of cancer, even at its earliest stages. Once lung cancer is suspected or confirmed, CT scans are also instrumental in staging the disease, determining the extent of cancer spread within the lungs and its potential involvement with nearby lymph nodes or other organs (Farsad, 2020).

Radiologists, as highly trained medical professionals with expertise in interpreting medical images, carefully review CT scan images of the chest for any abnormalities, particularly in the lung parenchyma (lung tissue).
They look for pulmonary nodules, which are small, round, or oval-shaped abnormalities in the lung tissue, varying in size and appearance. They also compare current CT scans with previous ones, if available, to detect changes in the size or appearance of nodules or the development of new ones. Radiologists generate detailed reports based on their findings, including the location and characteristics of any nodules or lesions, which guide informed decisions regarding patient care.

1.1.3 Challenges in CT scan Analysis

The reliance on human expertise for CT scan analysis in lung cancer diagnosis poses several significant issues. Radiological evaluations are inherently subjective, as they depend on the expertise and experience of the radiologist. This subjectivity can lead to variations in diagnoses, potentially impacting patient outcomes. Different radiologists may interpret the same CT scan differently, introducing variability in the assessment of lung nodules and cancerous lesions.

Furthermore, manual interpretation of CT scans is susceptible to human error, which can have dire consequences in cancer diagnosis. Fatigue, distractions, or cognitive biases may influence the radiologist's judgment. Errors in judgment can result in missed or delayed cancer diagnoses, leading to late-stage detection and reduced treatment efficacy.

The manual analysis of CT scans also often involves sending scans to centralized facilities or external experts for interpretation, incurring additional time and cost burdens. The need for expert radiologists and the associated costs can strain healthcare budgets and limit accessibility to quality diagnostics. Consistency and precision are paramount in lung cancer diagnosis, and ensuring consistent and accurate results across various healthcare facilities is a significant challenge. To address these challenges effectively, there is a compelling need for innovative solutions such as Convolution Neural Networks.
These neural networks excel in image recognition and classification tasks, making them highly suitable for CT scan analysis.

1.1.4 Role of Artificial Intelligence in healthcare

Artificial intelligence (AI) is playing an increasingly significant role in transforming healthcare across the globe (Secinaro et al., 2021). In the context of Kenya and other developing countries, AI is making significant contributions to various aspects of healthcare delivery. AI-powered solutions help healthcare professionals in diagnosing diseases, managing patient data, optimizing treatment plans, and improving healthcare outcomes. This technology is particularly relevant in resource-constrained settings like Kenya, where it has the potential to bridge gaps in healthcare access and enhance the quality of medical services.

The use of AI in medical imaging is on the rise, revolutionizing the field of radiology and diagnostics in Kenya and beyond (Mwaniki et al., 2023). AI algorithms can evaluate medical images with high accuracy, finding anomalies, lesions, and patterns. AI-driven image analysis provides real-time assistance to radiologists and clinicians, enabling faster and more precise diagnoses. Integrating this technology into Kenyan healthcare facilities can therefore improve the accessibility of expert diagnostics.

The integration of AI into medical diagnostics offers several potential benefits, including enhanced diagnostic accuracy by recognizing intricate patterns and subtle anomalies in medical images, improved efficiency by processing vast amounts of medical data, timely diagnosis and treatment planning, and the ability to optimize healthcare resource allocation. AI can also reduce costs by minimizing unnecessary tests and streamlining workflows, making it a valuable asset for healthcare systems, particularly in resource-limited settings (Panayides et al., 2020).
CNNs have found extensive applications in medical image analysis, offering significant advancements in diagnosis and treatment. They are commonly used for tasks such as segmentation, classification, and detection in various medical imaging modalities.

1.2 Statement of the Problem

The timely and accurate diagnosis of cancer is a pivotal factor in shaping effective treatment strategies and enhancing patient survival rates. In this context, medical imaging techniques, particularly Computed Tomography (CT) scans, have emerged as indispensable tools for detecting, diagnosing, and accurately staging various forms of cancer. However, the interpretation of CT scans demands a high degree of expertise, often leading to delays in diagnosis and subsequent treatment. This challenge is particularly acute in resource-constrained settings, such as many hospitals in Kenya, where a shortage of proficient radiologists and oncologists prevails.

Inadequate access to specialized medical personnel and the manual nature of CT scan analysis contribute to significant diagnostic delays and potential errors. The existing workflow often involves sending scans to centralized facilities or external experts for interpretation, causing additional time and cost burdens. This underscores the need for an automated, precise, and efficient method to diagnose lung cancer using CT scans.

Several studies have demonstrated the effectiveness of automated detection techniques utilizing artificial intelligence (AI) and machine learning methods in identifying lung nodules and detecting lung cancer (Asuntha and Srinivasan, 2020). However, significant challenges persist, particularly in the adoption and implementation of such cutting-edge technologies within the Kenyan healthcare system. Addressing these challenges is crucial to enable the timely and accurate diagnosis of lung cancer, thereby improving patient outcomes and mitigating the burden posed by this deadly disease.
Despite significant progress in the field of lung cancer classification and detection through deep learning techniques, there has been a notable gap in translating these advancements into practical applications within the healthcare industry. This study seeks to bridge this gap by not only developing a state-of-the-art Convolutional Neural Network (CNN) model through the fine-tuning of the pre-trained ResNet50 architecture but also by integrating this model into a user-friendly web framework. This will create a robust, accessible tool that can be directly incorporated into healthcare systems, thereby enhancing the diagnostic process with the precision and efficiency offered by advanced machine learning technologies.

1.3 Research Objectives

1.3.1 General Objective

To develop a Convolutional Neural Network (CNN) model by fine-tuning the pre-trained ResNet50 architecture to effectively recognize and classify various types of lung cancer by analyzing Computed Tomography (CT) scans.

1.3.2 Specific Objectives

i. To conduct a comprehensive review of existing deep learning models applied to medical image analysis, specifically focusing on their application with CT scans.
ii. To design and deploy a Convolutional Neural Network model for lung cancer classification using CT scans.
iii. To evaluate and validate the performance of the developed CNN-based lung cancer classification model.

1.4 Research Questions

i. What are the key strengths and limitations associated with various existing deep learning architectures when applied to medical image analysis?
ii. What architectural modifications and feature extraction techniques are necessary to ensure that the CNN model can accurately distinguish between healthy lung tissues and varying stages of lung cancer?
iii. How does the CNN-based model compare to existing methods in terms of its diagnostic accuracy and potential clinical relevance for lung cancer classification using CT scans?

1.5 Significance of the Study

The study holds considerable significance for health practitioners by offering an advancement in their diagnostic capabilities. This technology has the potential to revolutionize how practitioners interpret and analyze medical images, providing them with an intelligent tool that aids in the early-stage detection and accurate classification of lung cancer types.

The research makes a substantial contribution to the United Nations Sustainable Development Goal (SDG) of Good Health and Well-being (SDG 3). By utilizing cutting-edge AI-driven technologies to enhance the effectiveness of diagnosing lung cancer, this study directly addresses the need for accessible and affordable healthcare solutions, particularly in resource-limited settings.

Additionally, the research contributes to the broader advancement of artificial intelligence in healthcare. As AI technologies continue to evolve, their integration into medical practice can optimize resource allocation and alleviate the burden on healthcare professionals. By introducing a successful CNN-based model, this study could inspire further research and development in AI-assisted medical imaging, shaping the future of healthcare by combining human expertise with computational power.

1.6 Scope of the Study

This study focuses on fine-tuning the ResNet50 Convolutional Neural Network model for the classification of various lung cancer types, utilizing CT scan image data sourced from the Lung Image Database Consortium (LIDC), a medical imaging database, and the Mendeley database. The study primarily encompasses four categories: normal lung tissue, squamous cell carcinoma, large cell carcinoma, and adenocarcinoma. The analysis covers key machine learning steps, including data pre-processing to ensure uniformity, followed by training, validating, and evaluating the model for accuracy, sensitivity, specificity, and F1-score.
The model was deployed using a Flask web application that serves as a user interface and allows healthcare professionals and other users to interact with the AI-based tool seamlessly. This research aims to contribute to more precise lung cancer diagnosis and improved patient outcomes by providing a reliable AI-based tool for lung cancer classification.

1.7 Limitations of the Study

While the study offers promising insights into enhancing lung cancer diagnosis, it is essential to recognize key limitations that could affect its outcomes, scope, generalizability, and depth of impact.

The effectiveness of the CNN model relies on the diversity and richness of the dataset used for training and validation. A constrained dataset, representing a limited range of lung cancer stages, subtypes, and demographic profiles, can restrict the model's ability to capture the complex variations in real-world cases, leading to reduced accuracy and limited applicability.

Another critical consideration pertains to the limitations posed by available resources, both in terms of computational capabilities and access to high-quality medical data. Training deep learning models, such as CNNs, requires substantial computational power and memory, particularly when dealing with large-scale medical image datasets. Limited access to high-performance computing resources might hinder the study's progress and limit the scale of experimentation.

Chapter 2
Literature Review

2.1 Introduction

This chapter discusses the foundational literature on utilizing CNNs for enhanced lung cancer classification through CT scan analysis. It also discusses the key theories that support this study, the empirical review of related works, and the research gap.

2.2 Theoretical Review

This section discusses the key theories that support this study. The anchor theory for this study is the Theory of Convolutional Neural Networks (CNNs).
The other supporting theories discussed in this section include the Theory of Pattern Recognition and Feature Extraction and the Transfer Learning and Fine-Tuning Theory.

2.2.1 Theory of Convolutional Neural Networks (CNNs)

The Theory of Convolutional Neural Networks (CNNs) was initially proposed by Kunihiko Fukushima in 1980, with his creation of the "neocognitron," a hierarchical, multilayered artificial neural network (Fukushima, 2020). However, the modern form of CNNs, especially their application in deep learning, was significantly advanced by Yann LeCun in the late 1980s and early 1990s. This work (Bengio et al., 2021), particularly on the application of backpropagation algorithms to neural networks, laid the groundwork for the development of CNNs as they are known today.

The theory behind CNNs involves understanding their unique structure and functionality. CNNs are designed to automatically and efficiently extract and learn features from images. This is accomplished by three types of layers: convolutional layers for filtering and generating feature maps, pooling layers for dimensionality reduction of feature maps, and fully connected layers for classification. The key principle is that these networks can identify intricate patterns in image data, allowing for the effective recognition and classification of images (Yamashita et al., 2020).

The primary assumption of CNNs is that spatial hierarchies of features can be learned from image data. This means that the network assumes that certain patterns or characteristics (like textures, edges, and shapes) can be detected and used to understand more complex structures within the image (Yamashita et al., 2020).

Criticism of CNNs often revolves around their "black box" nature, where the decision-making process can be opaque and difficult to interpret. This is a significant issue in medical applications, where understanding the rationale behind a diagnosis or classification is crucial (Hassija et al., 2024).
Another criticism is their dependency on substantial quantities of labeled data for training, which is a challenge in specialized fields like medical imaging (Castiglioni et al., 2021).

This theory is highly relevant for this study because CNNs’ ability to discern intricate patterns in image data makes them well-suited for identifying the often subtle and complex signs of lung cancer in CT scans. They can differentiate benign and malignant lesions, which is a critical task in lung cancer diagnosis. The hierarchical feature learning of CNNs allows them to recognize cancerous patterns that might not be discernible to human vision or conventional image processing methods. However, the challenges highlighted by the criticisms need to be addressed, particularly in ensuring the interpretability of the CNN models and their decisions, given the critical nature of medical diagnostics.

2.2.2 Theory of Pattern Recognition and Feature Extraction

This theory is a multidisciplinary concept from the fields of computer vision, machine learning, and neural networks (Paolanti and Frontoni, 2020). While no single individual can be credited with its inception, notable contributions have been made by researchers like Yann LeCun, Geoffrey Hinton, and Yoshua Bengio (Bengio et al., 2021). Their work in the late 20th and early 21st centuries, particularly on deep learning and neural networks, has significantly shaped our understanding of pattern recognition and feature extraction (Fieguth, 2022).

The theory posits that CNNs are capable of recognizing patterns and extracting features from complex datasets, such as medical images. This is achieved through the network’s ability to automatically learn hierarchical feature representations from data. For instance, in the context of lung CT scans, CNNs can learn to identify features ranging from basic textures and shapes to more complex patterns indicative of pathological changes.
The primary assumption of this theory is that meaningful and detectable patterns exist in the data, and that these can be learned and recognized by the CNN. It assumes that these patterns are representative of underlying phenomena (like the presence of cancerous nodules) and that they can be distinguished from irrelevant variations in the image data (Abdar et al., 2021).

One significant criticism of this approach is its reliance on the quality and quantity of the training dataset. If the training dataset is not representative, the CNN model may learn incorrect or misleading patterns. Additionally, there is the challenge of interpretability: understanding what specific features the CNN is identifying and why it makes certain classifications can be difficult, which is a crucial consideration in medical applications where decision-making needs to be transparent and justifiable (Alzubaidi et al., 2021).

For lung cancer classification through CT scans, this theory is highly relevant. The ability of CNNs to discern and learn from complex patterns in medical images makes them well-suited for identifying and differentiating between cancerous and non-cancerous tissues. This capability is crucial in detecting lung cancer, where the early identification of malignant nodules can significantly impact patient outcomes. However, ensuring that the model is trained on comprehensive and representative data, and making the model’s decision-making process understandable and reliable, are essential considerations for applying this theory in this research.

2.2.3 Transfer Learning and Fine-Tuning Theory

Transfer Learning and Fine-Tuning Theory was developed from the work of Krizhevsky et al. (2023), particularly with the development of AlexNet in 2012. They demonstrated the effectiveness of deep learning and, by extension, the potential of transfer learning in complex tasks.
The concept of Transfer Learning and Fine-Tuning for Convolutional Neural Networks (CNNs) represents a significant advancement in machine learning and has important implications for medical imaging, including lung cancer classification.

The central idea of transfer learning is that a model developed for one task can be repurposed as the starting point for another model in a related task. It entails fine-tuning a CNN on a more specialized dataset after it has been pre-trained on a larger, more generic dataset (such as ImageNet). The theory is grounded in the belief that the features learned by the network on the first task can be applicable and beneficial for performance on the second task (Tripuraneni et al., 2020).

A key assumption in transfer learning is that the high-level features learned in the initial task are generalizable and can be effectively applied to different but related problems. In the case of medical imaging, this assumes that features learned from general images are relevant and useful for analyzing medical scans.

Criticism of transfer learning centers on its effectiveness when there is a significant discrepancy between the source (initial) and target (new) tasks. Some scholars, such as Bengio et al. (2011), have pointed out that the effectiveness of transfer learning can diminish when the new task is too divergent from the original task. Additionally, there can be issues related to overfitting, particularly if the fine-tuning dataset is small or not diverse enough.

Transfer learning is particularly well-suited for lung cancer classification through CT scans. Given the limited availability of large, annotated medical imaging datasets, being able to leverage a pre-trained ResNet50 CNN allows for a more robust and nuanced understanding of the complex patterns present in medical scans.
Fine-tuning the pre-trained network on specific lung CT scan datasets can lead to more accurate and effective classification of lung cancer signs. However, care must be taken to ensure the pre-training and fine-tuning phases are well-aligned in terms of the features relevant for lung cancer detection, addressing the concerns raised in criticisms of transfer learning (Salehi et al., 2023).

2.3 Empirical review

This section comprehensively explores related studies, methodologies employed, key findings, and existing gaps in the field of classifying lung cancer types through CT scan analysis.

2.3.1 Related works

Fallahpoor et al. (2023) undertook a comprehensive review to explore the application of deep learning (DL) techniques in positron emission tomography and computed tomography (PET/CT) imaging. The study was motivated by the growing use of PET/CT in fields like oncology and neurology, and by the challenges of manual image interpretation, which is time-consuming and requires extensive disease-specific knowledge. The review focuses exclusively on the use of AI, specifically DL, in analysing PET/CT images, filling a gap in the existing literature, which covered various modalities and artificial intelligence applications broadly. The researchers analysed 99 studies published between 2017 and 2022, identifying effective DL models and pre-processing algorithms for PET/CT imaging, while also pointing out current constraints such as insufficient annotated datasets and challenges in model explainability. This study emphasizes the potential of DL in improving lesion detection, tumour segmentation, and disease classification in PET/CT imaging, but it does not delve into the direct application of these techniques in lung cancer classification through CT scan analysis, which is a more specific and distinct application area.
Thanoon et al. (2023) conducted an extensive review of deep learning techniques for lung cancer screening and diagnosis using CT images. Lung cancer, one of the deadliest diseases worldwide, demands early identification to improve patient survival rates, with CT imaging being a key modality for screening and diagnosis. The review looked into an array of deep learning (DL) techniques developed for this purpose, focusing on two primary methodologies: classification and segmentation. It critically examines the advantages and limitations of current deep learning models in this context. The study demonstrates that deep learning methods hold significant potential for precise and effective lung cancer screening and diagnosis using CT scans. The review also provides insights into potential future enhancements in the application of deep learning, aiming to advance computer-assisted lung cancer diagnosis systems. This comprehensive analysis serves as a valuable resource for understanding the current state and future prospects of DL in lung cancer detection through CT imagery.

Wang (2022) explored the use of deep learning in diagnosing lung cancer, highlighting recent advancements, challenges, and future directions in the field. The study emphasizes the critical role of medical imaging tools like CT, MRI, PET, and X-ray in early lung cancer detection, but also notes their limitations, such as high false positive rates and the inability to automatically classify cancer images. Wang’s research underscores the potential of deep learning to enhance lung nodule detection and classification accuracy. The study reviews existing medical imaging methods and deep learning-based techniques, focusing on image pre-processing, dataset analysis, lung image segmentation, nodule detection, and classification. However, it also identifies significant challenges, such as the need for extensive annotated image datasets and the interpretability of deep learning models.
Looking forward, the paper suggests directions for future research, including the development of new network structures, incorporation of clinical knowledge into model training, and enhancement of model robustness. The study concludes that deep learning could potentially outperform radiologists in specific tasks, but also acknowledges the need to address legal, privacy, and clinical verification challenges for widespread clinical application.

The study by Nageswaran et al. (2022) explored the classification and prediction of lung cancer using machine learning and image processing. It identifies the urgent need for effective lung nodule identification in chest CT scans for early lung cancer detection. The research utilizes 83 CT scans from 70 patients, applying image processing techniques like noise reduction and feature extraction, and the k-means algorithm for image segmentation. Key machine learning methods used for classification include Artificial Neural Networks (ANN), K-Nearest Neighbors (KNN), and Random Forest (RF), with the ANN model showing superior accuracy in lung cancer prediction. The study’s methodology involves using a geometric mean filter for image pre-processing to enhance image quality, and then segmenting the images using k-means to identify regions of interest. Performance comparison of the different machine learning predictors is based on accuracy, sensitivity, and specificity. The research concludes that lung cancer, being a deadly disease, necessitates the use of Computer-Aided Diagnosis (CAD) systems for its early detection. This study’s findings underscore the potential of machine learning and image processing technologies in improving the accuracy of lung cancer detection systems, thereby contributing to advanced imaging techniques for practical implementation in medical diagnostics.
The study by Mohamed et al. (2023) focused on an innovative approach for automatic detection and classification of lung cancer in CT scans, employing a hybrid model that integrates deep learning with the Ebola Optimization Search Algorithm (EOSA). This research addresses the critical challenge of accurately diagnosing lung cancer, a leading cause of cancer-related deaths globally. The study’s cornerstone is the EOSA-CNN model, which optimizes the selection of weights and biases essential for the classification process within a Convolutional Neural Network (CNN). Tested on a comprehensive lung cancer dataset from the Iraq-Oncology Teaching Hospital National Center for Cancer Diseases, the EOSA-CNN model outperformed traditional methods, achieving a high classification accuracy of 93.21%. The methodology encompassed advanced image pre-processing techniques and the strategic application of EOSA to refine the CNN architecture, significantly enhancing its diagnostic accuracy. This breakthrough demonstrates the powerful synergy of machine learning and metaheuristic algorithms, offering a promising new avenue for early lung cancer detection and more informed treatment strategies.

The study by Sasikala et al. (2018) looked into a Convolutional Neural Network (CNN) based approach for the detection and classification of lung cancer from chest CT images. Recognizing lung cancer as one of the deadliest diseases in developing countries, the paper emphasizes the challenge of early cancer detection. The methodology involves extracting lung regions from CT images, segmenting these regions to identify tumors, and then training the CNN architecture with this data. The key objective is to categorize lung tumors as malignant or benign. Utilizing a dataset from the Lung Image Database Consortium and Image Database Resource Initiative, which includes 1000 CT scans, the study applies median filtering for image preprocessing.
The CNN architecture, incorporating convolutional, ReLU, pooling, and fully connected layers, achieves an impressive 96% accuracy in classifying lung cancer, outperforming traditional neural network systems. This advancement suggests significant potential for deep learning in enhancing early lung cancer detection and treatment.

The research conducted by Deepa et al. (2023) proposed a novel method for lung cancer classification using Convolutional Neural Networks (CNNs). Addressing the growing prevalence of lung cancer, which is attributed to harmful consumption habits and environmental factors, this study focuses on improving the accuracy and efficiency of lung cancer detection. The novel strategy uses the multi-view aspect model and various digital image processing techniques, as well as the power of CNNs, to categorize various forms of lung cancer, with the aim of improving diagnostic accuracy. The study evaluates the model’s effectiveness using a variety of criteria, including Matthew’s correlation coefficient, Cohen’s Kappa score, and log loss. These metrics give a comprehensive assessment of the model by measuring the correlation between predicted and actual classifications, determining agreement beyond chance, and evaluating the model’s probability estimations.

Mulrenan et al. (2022) present a comprehensive review of the utilization of Artificial Intelligence (AI) in diagnosing COVID-19 using Computed Tomography scans and chest X-rays. The study investigates the application of AI in differentiating COVID-19 from other respiratory infections, with a focus on research articles sourced from databases like ArXiv, MedRxiv, PubMed, and Google Scholar. In total, twenty studies met the inclusion criteria: eleven centered on chest X-ray and twelve on CT, involving datasets ranging from 239 to 19,250 images.
The reviewed studies reported a wide range of sensitivities, specificities, and Area Under the Curve (AUC) values from 0.789 to 1.00. Despite AI showing exceptional diagnostic capabilities in detecting COVID-19 manifestations in imaging, the review highlights significant challenges in its broader application, such as the absence of relevant comparators in studies and the need for larger datasets and independent testing. This review emphasizes the potential and existing limitations of AI in enhancing COVID-19 diagnosis via imaging, contributing critical insights to the field of medical diagnostics during the pandemic.

The research by Darmawan et al. (2022) looked into the classification of Non-Small Cell Lung Cancer (NSCLC) using Convolutional Neural Networks (CNN). Lung cancer, categorized into Small Cell Lung Cancer (SCLC) and NSCLC, is predominantly found in heavy smokers, with NSCLC constituting about eighty to eighty-five percent of all lung cancer cases, impacting both men and women. This study specifically focuses on classifying CT images into NSCLC subtypes (squamous cell carcinoma, adenocarcinoma, and large cell carcinoma) and normal lung tissue. For this purpose, the researchers utilized a dataset comprising 1000 CT scan images for each class. The classification process involved three critical stages: pre-processing, classification, and validation. During pre-processing, all input images were resized and converted to grayscale to ensure uniformity, which is crucial for accurate analysis. The study evaluated two different CNN architectures, VGG19 and ResNet50, to determine which one performs better in classifying NSCLC. Particularly noteworthy was the model’s performance in the normal class during validation, where it achieved perfect scores in precision, sensitivity, F1-score, and specificity, all reaching 100%.
These findings underscore the effectiveness of the ResNet50 architecture in accurately classifying NSCLC from CT scan images, highlighting its potential as a reliable tool for lung cancer diagnosis and for aiding the determination of appropriate treatment strategies based on the specific cancer subtype.

In a study by Mwaniki et al. (2023), the knowledge, attitudes, and practices of radiologists in Kenya towards Artificial Intelligence in the field of Diagnostic Radiology (DR) were assessed. This cross-sectional descriptive study, which utilized a web-based questionnaire shared with members of the Kenya Association of Radiologists, revealed that a significant majority of participants had basic AI knowledge, primarily gained from presentations, but less than half were familiar with machine learning, artificial neural networks, and deep learning concepts. The most recognized application of AI in radiology was detection, with other uses like segmentation and workflow management being less known. Notably, AI application in daily radiology practice was limited, with only 12.6% of respondents using AI tools. Despite this, a positive attitude towards AI/ML in radiology was observed, with two-thirds of participants expressing willingness to be involved in the development and training of ML algorithms. However, the current knowledge of AI applications did not significantly influence their decision to pursue a career in radiology. The study concluded that while there is basic AI knowledge among radiologists and residents, deeper understanding of related concepts is lacking, and the actual use of AI in practice is minimal. The authors recommend introducing AI/ML courses in radiology residency programs and providing continuous medical education to bridge this knowledge gap.
2.4 Research Gaps

The existing body of research on the application of deep learning and Convolutional Neural Networks (CNNs) to lung cancer detection and classification through CT scan analysis reveals several conceptual, contextual, and methodological gaps which this study aims to address.

Conceptually, while studies like those by Fallahpoor et al. (2023), Thanoon et al. (2023), and Wang (2022) acknowledge how deep learning can enhance lung cancer diagnostics, they predominantly focus on general applications of AI in medical imaging or deep learning for broader oncological purposes. There is a lack of in-depth exploration of CNNs specifically tailored for lung cancer classification in CT scans, as this study proposes. Moreover, the study by Nageswaran et al. (2022), while concentrating on lung cancer classification using machine learning, does not delve into the nuances of CNNs, which are crucial for capturing the intricate details in CT images.

Contextually, most of these studies do not address the unique challenges faced in specific regions like Kenya. For instance, the study by Mwaniki et al. (2023) highlights the limited AI/ML knowledge among Kenyan radiologists, indicating a gap in localizing AI applications to the specific needs and constraints of the region. This study intends to fill this gap by developing a CNN model that considers the local demographic and healthcare context, which is critical for accurate lung cancer detection in a Kenyan setting.

Methodologically, while studies like those by Sasikala et al. (2018) and Deepa et al. (2023) successfully apply CNNs for lung cancer detection, there is a lack of emphasis on creating localized models that address specific regional challenges such as the shortage of skilled radiologists and advanced medical facilities in Kenya. Moreover, these studies do not fully explore the potential of CNNs in processing and interpreting CT scan data in a way that is tailored for the Kenyan population.
By focusing on these aspects, this study aims not only to enhance the accuracy of lung cancer classification through CT scans but also to make it more applicable and accessible for healthcare professionals in Kenya.

Despite these gaps, the field has seen considerable progress in utilizing deep learning for lung cancer detection. However, translating these technological advancements into practical tools for the healthcare industry remains a challenge. This study aims to bridge this gap by developing a CNN model, fine-tuned from the pre-trained ResNet50 architecture, and integrating this model into a user-friendly web framework to augment the diagnostic process.

Chapter 3

Methodology

3.1 Introduction

This chapter provides a comprehensive discussion of the methodology employed in this study. It covers the dataset utilized, its collection process, the data pre-processing techniques applied, and the machine learning models employed, which include a Convolutional Neural Network (CNN) and ResNet50. Additionally, it outlines the evaluation metrics utilized and the deployment approach adopted.

3.2 Data Understanding

Data were collected from two public sources: the Lung Image Database Consortium and Mendeley Data. The Lung Image Database Consortium image collection (LIDC-IDRI) is a web-accessible international resource for the development, training, and evaluation of computer-assisted diagnostic (CAD) methods for lung cancer detection and diagnosis (Vendt and Nolan, 2023). The Lung Image Database Consortium has a unique two-phase data collection process in which expert thoracic radiologists analyze and annotate each CT scan to provide spatial truth information about the presence or absence of lung nodules and their spatial extent when present. The collected dataset contained 2500 CT scan images labelled as adenocarcinoma, squamous cell carcinoma, large cell carcinoma, and normal cells. Figure 3.1 illustrates the composition of the CT scan images.
Figure 3.1: Dataset Images

3.3 Image Pre-processing

The collected CT scan images were pre-processed to enhance the quality and accuracy of the model. To standardize the input data, the DICOM images were converted to a common format. Fedorov et al. (2016) similarly standardized CT scan images by converting them to common image formats that can be used in machine learning modelling. The images were then converted to tensors to allow efficient manipulation and processing using PyTorch’s optimized operations. Converting images to tensors also supports automatic differentiation, which is important for training neural networks using gradient-based optimization algorithms (Kim, 2024).

Data augmentation methods, including flipping, rotating, and zooming, were applied to enhance the diversity of the study dataset, thereby boosting the model’s ability to generalize. The images were uniformly resized to 224 by 224 pixels, a resolution that has been consistently effective in similar research on image classification models, particularly when leveraging pre-trained models. This resolution is widely recognized for its compatibility with various pre-trained architectures and its efficiency in maintaining essential image features for accurate classification (Talebi and Milanfar, 2021).

3.4 Convolution Neural Network Architecture

A CNN is a class of deep learning model that analyzes grid-patterned data, including photographs. It is modeled after the structure of the animal visual cortex and is intended to automatically and adaptively learn spatial feature hierarchies, ranging from low-level patterns to high-level patterns (Yamashita et al., 2020). A CNN consists of three types of layers: convolution, pooling, and fully connected layers.
The convolution layer extracts features, the pooling layer reduces processing by down-sampling the image, and the fully connected layer maps the extracted features to the final output, such as a classification (Alzubaidi et al., 2021). Figure 3.2 illustrates the architecture of the convolutional neural network model and how it is used to classify the lung cancer types.

Figure 3.2: Overview of CNN Architecture

The input layer takes the lung images, which then pass through convolutional layers where various kernels extract features by performing a convolution operation. These features are patterns or specific aspects of the images, such as edges, textures, or shapes, relevant to identifying cancerous cells (Raza et al., 2023).

3.4.1 Convolution layers

Convolutional layers are computational structures that enable a computer to derive meaningful insights from input data by representing it at various levels of abstraction (Wang et al., 2020). This layer automatically extracts and learns the most relevant features from the CT scan images, which include specific shapes, textures, or patterns associated with lung cancer signatures.

A CNN comprises an input layer $x$, accepting CT scan images, an output layer $y$, and numerous hidden layers $h$, with each layer hosting a variety of neurons. Within the convolutional layers, every hidden unit transforms the input received from preceding layers through the nonlinear equation:

$$h_j = F\left(b_j + \sum_i w_{i,j}\, x_i\right) \tag{3.1}$$

where $w_{i,j}$ are the weights associated with the connection between the $i$th input $x_i$ and the $j$th hidden unit, and $b_j$ is the bias of the $j$th hidden unit, which gives the model additional flexibility by allowing each neuron to shift the non-linear activation function $F(\cdot)$ to the left or right. The resulting output $h_j$ feeds the subsequent layers and ultimately the classification output used to aid diagnosis.
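As a small illustration of Equation 3.1, the sketch below computes the output of a single hidden unit in plain Python. The function names `hidden_unit` and `relu`, and the numeric values, are assumptions chosen for illustration only; they are not taken from the study's implementation, and ReLU stands in for the generic activation $F$.

```python
def relu(z):
    """Example activation F: passes positive values, zeroes out the rest."""
    return max(0.0, z)


def hidden_unit(inputs, weights, bias, activation):
    """Compute h_j = F(b_j + sum_i w_ij * x_i) for one hidden unit j."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Weighted sum of the inputs plus the unit's bias (the pre-activation).
    pre_activation = bias + sum(w * x for w, x in zip(weights, inputs))
    return activation(pre_activation)


# Hypothetical values: three inputs feeding one hidden unit.
x = [1.0, 2.0, 3.0]
w = [0.5, -0.25, 1.0]

print(hidden_unit(x, w, 0.5, relu))   # weighted sum 3.0, plus bias 0.5
print(hidden_unit(x, w, -4.0, relu))  # negative pre-activation is clipped
```

Changing only the bias moves the point at which the unit switches on, which is the shift of the activation function described above.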
3.4.2 Filters/Kernels

Filters are small matrices used to detect specific features in the input images, such as edges, corners, textures, or more complex patterns in higher layers. Each filter is designed to respond to a specific type of feature at a given spatial scale of the image. Multiple filters can be applied to the input, each generating a separate feature map. These feature maps form the output of the layer, with a depth equal to the number of filters used. This can be expressed as:

$$y = X * f^{(1)} \tag{3.2}$$

where the symbol $*$ denotes the convolution operator, $y$ represents the convolved result, also known as the feature map, $X$ denotes the input image, and $f^{(1)}$ stands for the filter or kernel, which is a small matrix of weights that the network learns during the training process.

The convolutional filter is important in extracting critical diagnostic features from the scans. It methodically applies element-wise multiplications across different segments of the CT image, summing these products to produce a distinct value for each output pixel. This operation is replicated across the entire scan, ensuring that regions with significant diagnostic features, such as the subtle edges, texture variations, specific patterns, and characteristic shapes associated with lung pathologies, are accentuated in the resulting feature maps.

3.4.3 Stride and Padding

Stride and padding are crucial to the network’s ability to extract salient features, preserve spatial information, and balance computational efficiency. Stride defines the size of the steps taken by the convolutional filters as they move across the image. A greater stride reduces the spatial dimensions of the output feature maps, which can help minimize computing load and focus on more pronounced aspects of the images without being bogged down in unnecessary information.

Padding is used to add layers of pixels around the edge of the input images.
This ensures that when the filter is applied, it can properly include the information at the edges and corners, which might otherwise be ignored or underrepresented. Effective padding ensures that the size of the output feature maps can be maintained and that important diagnostic features located at the periphery of the scans are not missed. For padding $p$, filter size $f \times f$, input image size $n \times n$, and stride $s$, the output image dimension is given by:

$$\text{Output dimension} = \left\lfloor \frac{n + 2p - f}{s} + 1 \right\rfloor \times \left\lfloor \frac{n + 2p - f}{s} + 1 \right\rfloor \tag{3.3}$$

3.4.4 Activation Function

The activation function captures nonlinear relationships between the input images and the output of the model. It is introduced into the CNN architecture so that the model can learn the non-linear relationships in the CT scan images and capture complex patterns. The activation function is applied to the output of each neuron: it takes the weighted sum of the inputs and produces an output that is passed to the next layer. The ReLU activation function was used in this research study due to its proven track record in enhancing accuracy and its ability to learn more complex representations of the data. The ReLU activation function is easy to compute and has the form:

$$f(z) = \max(0, z) \tag{3.4}$$

where $z$ represents the input to the ReLU layer and $f(z)$ represents the output of the ReLU activation function. For any given input value $z$, if $z$ is greater than zero, the ReLU function outputs $z$ itself; otherwise, it outputs zero. Figure 3.3 illustrates the Rectified Linear Unit (ReLU) function.

Figure 3.3: ReLU Activation Function

The ReLU (Rectified Linear Unit) activation function was chosen due to its advantages over traditional sigmoid functions, particularly in terms of feature activation and computational performance.
Unlike sigmoid functions, which can cause vanishing gradient problems due to their asymptotic nature, ReLU maintains a linear activation for positive inputs, which helps in propagating gradients more effectively during backpropagation (Yu et al., 2020). This characteristic of ReLU ensures that the network learns faster and more efficiently, avoiding the saturation and slow learning problems often associated with sigmoid functions. Moreover, ReLU’s simplicity (returning zero for negative inputs and leaving positive inputs unchanged) reduces computational complexity, making it an ideal choice for the ResNet50 convolutional neural network model used in classifying lung cancer types from CT scans in this study.

3.4.5 Pooling layers

Pooling layers reduce the spatial size of the convolved features, simplifying the network and reducing computation time, while retaining essential information by down-sampling the input representation (Thanoon et al., 2023). A pooling layer slides a window across the input feature map and applies an operation such as max pooling or average pooling (Akhtar and Ragavendran, 2020). After multiple convolution and pooling layers, the feature maps are flattened into a single vector, which is essential for transitioning from a 2-dimensional feature map to a 1-dimensional feature vector that can be fed into a fully connected layer.

The pooling operation can be mathematically represented as follows. Given an input feature map $L$ of size $(M, N, K)$, where $M$ is the height, $N$ is the width, and $K$ is the number of channels, and a pooling window that slides over it, the output feature map is obtained as:

$$y_{l,i,j,k} = \operatorname{pool}\left(a_{l,m,n,k}\right), \quad \forall (m,n) \in R_{i,j} \tag{3.5}$$

where $y_{l,i,j,k}$ represents the output of the pooling operation for feature map $l$ at position $(i, j)$ in channel $k$. $a_{l,m,n,k}$ is the input feature map value at position $(m, n)$ in channel $k$ for feature map $l$.
$R_{i,j}$ is the local area centered at (i, j) over which the pooling operation is executed, and pool(·) denotes the pooling function, which could be max pooling, min pooling, sum pooling, or average pooling, depending on the specific operation chosen.

In this research study, the inclusion of pooling layers within the Convolutional Neural Network architecture plays an important role in enhancing model generalization and mitigating overfitting. By effectively down-sampling the feature maps, pooling layers reduce the dimensionality of the data, which in turn lessens the computational load and the complexity of the model (Saeedan et al., 2018). This reduction is crucial for a dataset derived from lung CT scans, where the high resolution of images can lead to an overwhelming number of features. Pooling layers help the model to focus on the most salient features, promoting generalization by preventing the model from learning noise and irrelevant details.

3.4.6 Fully connected layer

The fully connected layers, often called dense layers, function as classifiers that use the extracted and compressed features to determine the image’s class by assigning probabilities to the output categories (Anjum et al., 2023). The final output layer uses these probabilities to classify the image into categories, which in the context of lung cancer could be various cancer types or normal cells. Each layer is crucial for accurately classifying lung cancer from CT scan images, and their combined use ensures that the model learns complex patterns and nuances in the data to make reliable predictions. In the fully connected layer, the outputs from the convolution and pooling layers are transformed into a one-dimensional array for further processing. The equation for a fully connected layer can be expressed as:

\[ y = f(Wx + b) \tag{3.6} \]

where:
• x represents the input vector to the fully connected layer.
• W is the weight matrix associated with the connections between the input and the neurons in the layer.
• b is the bias vector.
• f denotes the activation function, which introduces non-linearity into the model, allowing it to learn complex patterns.
• y represents the vector of values produced by the layer, which can either be passed to subsequent layers in the network or, in the case of the final layer, used to produce the final classification output.

3.5 Proposed Convolutional Neural Network: ResNet50

Convolutional neural networks have made substantial progress since the outstanding success of earlier work such as AlexNet in 2012 (Singh and Sabrol, 2021). A modified version of the ResNet50 architecture was the basis for the CNN architecture selected for this study.

ResNet50 is a deep convolutional neural network architecture that consists of 50 layers, including residual blocks. These residual blocks enable the network to learn residual functions, making it easier to train deeper networks effectively (Koonce and Koonce, 2021). This research utilized the knowledge gained from the pre-trained ResNet50 model to classify lung CT scans, a task that would have been considerably more time-intensive if a Convolutional Neural Network had to be developed from scratch, particularly given the limited number of available CT images. Residual learning in ResNet50 is mathematically expressed as:

\[ H(x) = F(x) + x \tag{3.7} \]

where x is the input to a layer, F(x) represents the residual mapping to be learned, and H(x) is the desired mapping. This formulation allows for the learning of residuals instead of directly fitting the desired underlying mapping (Shafiq and Gu, 2022).

ResNet50 was employed as a pre-trained feature extractor, meaning the ResNet50 architecture, already trained on a comprehensive dataset, was used to derive features from the input images without changing the weights of its convolutional layers (Wen et al., 2020).
This approach capitalizes on ResNet50’s robust ability to identify and extract complex features and patterns, applying the pre-acquired knowledge to a specialized classification task, thus optimizing for both efficiency and computational cost.

In this model, the parameters adopted directly from the pre-trained ResNet50 architecture were preserved as initial values. During fine-tuning, the model was permitted to adjust and refine these parameters, rather than keeping them fixed, starting from the baseline values provided by the pre-trained model. This strategy allowed the model to better tailor its learning to the specific characteristics of our dataset, thereby improving its proficiency in identifying lung cancer from CT scan images. Table 3.1 shows the parameters of ResNet50 that were used to achieve the objectives of this research study.

Table 3.1: Parameters of the ResNet50 Model for Lung Cancer Classification

Parameter             Value
Number of Epochs      200
Learning Rate         0.001%
Weight Decay          0.0001
Model Name            ResNet50
Use CUDA              True
Criterion             nn.CrossEntropyLoss
Optimizer             Adam
Optimizer Momentum    90%

The advantages of using ResNet50 for lung cancer classification in CT scan analysis are significant. Firstly, ResNet50’s deep architecture enables it to learn intricate features from CT scan images, enhancing the model’s ability to differentiate accurately between types of lung cancer. Secondly, the residual learning approach in ResNet50 helps mitigate the degradation problem associated with training deep networks, leading to improved model performance. Lastly, ResNet50’s proven success in various image classification tasks makes it a robust choice for enhancing lung cancer classification accuracy through CT scan analysis (McNeely-White et al., 2020).
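To make the residual formulation H(x) = F(x) + x from Equation 3.7 concrete, the following minimal sketch applies a skip connection to a plain Python list. The names `residual_block`, `zero_residual` and `scaled_residual` are hypothetical and are not part of the study's PyTorch code; real ResNet50 blocks use convolutions, batch normalization and ReLU rather than these toy mappings.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(x: Vector, residual_fn: Callable[[Vector], Vector]) -> Vector:
    """Return H(x) = F(x) + x, where F is the residual mapping."""
    fx = residual_fn(x)
    if len(fx) != len(x):
        raise ValueError("F(x) must have the same shape as x for the skip connection")
    return [f + xi for f, xi in zip(fx, x)]


def zero_residual(x: Vector) -> Vector:
    """Toy F(x) = 0: the block then reduces to the identity mapping."""
    return [0.0 for _ in x]


def scaled_residual(x: Vector) -> Vector:
    """Toy F(x) = 0.5 * x, standing in for a learned correction."""
    return [0.5 * v for v in x]


if __name__ == "__main__":
    features = [1.0, -2.0, 4.0]
    print(residual_block(features, zero_residual))
    print(residual_block(features, scaled_residual))
```

When F(x) is zero the block passes its input through unchanged, which is the intuition behind residual blocks making very deep networks easier to train.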
3.6 Model Training

The primary objective of this research study is to develop a Convolutional Neural Network model that effectively classifies the types of lung cancer from CT scan images by fine-tuning the pre-trained ResNet50 CNN model. The CNN model was trained using PyTorch on a GPU machine. PyTorch, an open-source machine learning library built upon the Torch framework, is widely used for its versatility and efficiency. It finds extensive applications in computer vision and natural language processing due to its ability to handle complex neural network designs and process data efficiently on GPU hardware (Imambi et al., 2021). PyTorch’s robust ecosystem has been instrumental in similar research endeavors, where its application in developing image classification models has consistently led to strong outcomes (Ayyadevara and Reddy, 2020).

PyTorch has advanced support for tensors, which are multidimensional arrays that served as the foundational data structure for handling the CT scan images (Imambi et al., 2021). These images were converted into tensors to leverage PyTorch’s efficient computation and dynamic computational graph capabilities, facilitating more intuitive and flexible model development and debugging. Compared to other frameworks, PyTorch’s dynamic nature allows for adjustments and optimizations on the fly, a feature particularly beneficial for the iterative experimentation required in this study (Stevens et al., 2020). This, combined with its user-friendly interface and extensive library support, made PyTorch an optimal choice for implementing the advanced neural network models needed for accurate lung cancer classification.

Transfer learning is central to this research: the fine-tuned model extracts intricate patterns from the CT scan images and predicts their class, namely adenocarcinoma, squamous cell carcinoma, large cell carcinoma or normal cells.
The dataset for this study was split in a 70-15-15 ratio into training, validation, and testing sets to allow for effective model training, validation of its performance, and evaluation of its generalizability to new, unseen data.

3.6.1 Loss Function

The loss is the value of the loss function computed over the training data after each epoch. The model was trained by minimizing the cross-entropy loss. The average loss over the dataset is expressed as follows:

\[ \text{AverageLoss} = \frac{1}{N} \sum_{i=1}^{N} l(\theta; y_i, o_i) \tag{3.8} \]

where:
• N is the total number of samples in the dataset.
• i indexes the individual samples in the dataset.
• $l(\theta; y_i, o_i)$ represents the loss function for the ith sample, calculated with respect to the model parameters θ, the actual label $y_i$, and the model’s predicted output $o_i$.

Cross-Entropy Loss, also known as Log Loss, quantifies the difference between two probability distributions: the predicted probability distribution output by the model and the actual distribution represented by the labels in the training data (Mao et al., 2023). An evaluation by Hui and Belkin (2020) indicates that cross-entropy loss is effective at penalizing incorrect classifications. Cross-entropy is expressed in the form:

\[ H(y, p) = -\sum_{c=1}^{M} y_{o,c} \log(p_{o,c}) \tag{3.9} \]

where H(y, p) represents the cross-entropy loss between the true labels y and the predicted probabilities p, and M is the number of classes. The symbol $y_{o,c}$ is a binary indicator (0 or 1) of whether class label c is the correct classification for observation o, and $p_{o,c}$ is the predicted probability that observation o is of class c.

3.6.2 Loss Optimization

The Adam optimizer (short for Adaptive Moment Estimation), known for its adaptive learning rate capabilities, was used. It is an advanced optimization algorithm used in machine learning to iteratively update network weights based on training data.
It is recognized for effectively handling sparse gradients and for its robustness to hyperparameter choices. Adam integrates the benefits of two other optimization techniques: the Adaptive Gradient Algorithm (AdaGrad), which adapts the learning rate for each parameter, and Root Mean Square Propagation (RMSProp), which normalizes the gradient using a moving average of squared gradients, akin to momentum (Bera and Shrivastava, 2020). Early stopping was adopted to reduce the risk of overfitting. This technique monitors the validation loss and halts training when there is no significant improvement, helping to preserve the model’s ability to generalize.

Similar research studies have found the Adam optimizer to be among the most successful techniques in image classification tasks. A comparative study by Yaqub et al. (2020) concludes that the Adam optimizer performs better than other first-order optimizers. The Adam update rule can be represented as follows:

\[ \theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{\hat{v}_t} + \varepsilon} \cdot \hat{m}_t \tag{3.10} \]

where $\theta_{t+1}$ is the updated parameter at time t + 1, $\theta_t$ is the parameter at time t, η is the learning rate, $\hat{m}_t$ is the bias-corrected exponentially decayed average of past gradients, $\hat{v}_t$ is the bias-corrected exponentially decayed average of past squared gradients, and ε is a small constant (typically $10^{-8}$) to prevent division by zero.

3.7 Evaluation of Model Performance

The performance of the model was evaluated using metrics such as precision, recall, accuracy and F1 score. These metrics provide thorough insight into how accurately the model classifies lung cancer types.

Accuracy is a measure that quantifies the overall correctness of the model’s predictions. It is calculated as the ratio of correctly predicted instances (both true positives and true negatives) to the total number of instances in the dataset (Vujović et al., 2021).
Mathematically, accuracy is defined as:

\[ \text{Accuracy (\%)} = \left( \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}} \right) \times 100 \tag{3.11} \]

Precision measures the proportion of true positive predictions (correctly identified instances of a particular class) to the total number of instances predicted as that class, including both true positives and false positives (Zhou et al., 2021). Mathematically, precision is defined as:

\[ \text{Precision} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Positives (FP)}} \tag{3.12} \]

Recall, also known as sensitivity, measures the proportion of actual positives that are correctly identified by the model (Erickson and Kitamura, 2021). Mathematically, recall is defined as:

\[ \text{Recall} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Negatives (FN)}} \tag{3.13} \]

The F1 score is the harmonic mean of precision and recall, providing a single metric that balances the concerns of false positives and false negatives (Erickson and Kitamura, 2021).

\[ F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3.14} \]

Precision, recall, accuracy and F1 score are crucial performance metrics for this research study of lung cancer classification due to their ability to provide a clear understanding of the model’s predictive capabilities. Similar studies, such as those by Sharma et al. (2022) on dermatologist-level classification of skin cancer using deep neural networks, and Zubair (2020) on CheXNet, which can diagnose pneumonia from chest X-rays at a level exceeding practicing radiologists, have also underscored the importance of these metrics. They highlight how precision, recall, and accuracy can collectively ensure that AI-driven diagnostic models are both reliable and effective in real-world medical settings.

3.8 Model Deployment

After successfully developing and testing the Convolutional Neural Network model for lung cancer classification, the next important step is model deployment.
In this study, Vue.js was selected to develop the frontend due to its ability to facilitate the creation of dynamic user experiences with an efficient and predictable update mechanism. Vue.js is a progressive JavaScript framework used to build user interfaces and single-page applications. It is known for its simplicity and flexibility, which makes it a popular choice among developers for creating dynamic and intuitive user experiences (Bielak et al., 2022). The main interface includes a prominent image upload section where practitioners can easily drag and drop CT scan images or browse files from their system.

The Flask framework was used as the backbone for integrating the ResNet50-based machine learning model into the system’s architecture. Flask is a lightweight and flexible Python web framework designed for quick development of web applications by providing the necessary tools, libraries, and technologies. Its simplicity and scalability make it an excellent choice for building RESTful web services, such as Application Programming Interfaces for machine learning models (Ghimire, 2020). Similar studies, such as those by Yaganteeswarudu (2020) on multi-disease prediction and Kumar (2023) on improving cardiovascular conditions, show that the Flask framework performs well when deploying machine learning models into production.

The Flask application exposes a RESTful API endpoint that facilitates the communication between the frontend and the machine learning model. When medical practitioners upload CT scan images via the Vue.js frontend, these images are sent to the Flask backend as HTTP requests. The Flask application then processes these requests, extracting the image data and preparing it for analysis. It interfaces with the ResNet50 model, feeding the images as input and retrieving the classification result, thus ensuring a smooth, efficient transmission of data and enabling real-time analysis and feedback.
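One way the backend could turn the model's raw class scores into a classification result is sketched below. This is an illustration under stated assumptions, not the study's implementation: the class order in `CLASS_NAMES`, the use of softmax to obtain a confidence value, and the response keys `label` and `confidence` are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Union

# Assumed class order; the order used by the trained model is not stated here.
CLASS_NAMES = (
    "adenocarcinoma",
    "large cell carcinoma",
    "normal",
    "squamous cell carcinoma",
)


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def build_prediction(
    scores: Sequence[float],
    class_names: Sequence[str] = CLASS_NAMES,
) -> Dict[str, Union[str, float]]:
    """Pick the most probable class and report its probability."""
    if len(scores) != len(class_names):
        raise ValueError("expected one score per class")
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return {"label": class_names[best], "confidence": probabilities[best]}


if __name__ == "__main__":
    # Hypothetical raw scores for a single image.
    print(build_prediction([4.2, 0.3, -1.0, 0.8]))
```

A dictionary of this shape can be serialized to JSON and returned to the frontend for display.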
Chapter 4 of this research study provides an in-depth exploration of the system design and architecture, detailing the frameworks and methodologies used in developing and deploying the lung cancer ResNet50 classification model, while Chapter 5 complements this by presenting practical screenshots from the system implementation and testing phases.

Chapter 4

System Design and Architecture

4.1 Introduction

This chapter offers a comprehensive overview of the system design and architecture utilized in developing the image classification model, anchored by the ResNet50 neural network. The architecture is structured into two principal segments: Frontend and Backend development. Additionally, this chapter outlines the model API and system requirements, which are essential for integration and communication between the model’s components.

4.2 System Requirements

4.2.1 Model API Requirements

Below are the dependencies for the Model API’s functionality, listing the specific Python packages and versions required for integration and operation within the system’s architecture:

a. Python==3.9
b. Flask==3.0.2
c. joblib==1.3.2
d. matplotlib==3.8.3
e. numpy==1.26.4
f. pandas==2.2.1
g. PyYAML==6.0.1
h. requests==2.31.0
i. safetensors==0.4.2
j. scikit-learn==1.4.1.post1
k. scipy==1.12.0
l. seaborn==0.13.2
m. timm==0.9.16
n. torch==2.2.1
o. torchvision==0.17.1
p. Werkzeug==3.0.1
q. huggingface-hub==0.21.4

4.3 Overview of System Architecture

The system architecture for lung cancer image classification, shown in Figure 4.1, illustrates the workflow of the system. It is designed for efficiency and user-friendliness, starting with a Vue.js-based user interface where medical practitioners upload CT scans for analysis.

Figure 4.1: Overview of System Architecture

Initially, a medical practitioner uploads a CT scan image via the web application. This image is then encoded into base64 format directly within the web application, converting the binary data into a text string.
The base64 encoding was employed to facilitate the safe and efficient transmission of CT scan images from the client-side web application to the server-side Flask API. By encoding the images into base64, we could send the image data as part of a JSON object, which is a widely supported format for data interchange. This ensured that the image data remained intact during transit and was compatible with web technologies that typically handle text better than binary.

After encoding, the web application sends the base64-encoded image to the backend server using an HTTP request. This request is received by the Flask API, chosen for its simplicity and efficiency in handling web requests. The Flask API decodes the base64 string back into a binary image that can be processed by the machine learning model.

The server then inputs this binary image into the pre-trained ResNet50 model, which has already been trained on a diverse dataset of labeled lung CT images. The ResNet50 model uses its learned weights to analyze the image and predict the presence and type of lung cancer. This deep learning model, known for its depth and accuracy, is particularly adept at recognizing complex patterns in medical imagery.

Once the ResNet50 model has made its prediction, the results, which include the classification of the cancer type and the confidence level of the prediction, are sent back from the server to the web application via the Flask API. The API formats these results into a response that the web application can interpret.

The web application then displays these results to the medical practitioner. The displayed information typically includes the type of lung cancer identified by the model and a confidence score that indicates how certain the model is of its prediction. This allows the practitioner to quickly and accurately understand the model’s diagnostic output.
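The encode-and-decode round trip described above can be sketched with Python's standard library alone. The function names and the JSON keys `filename` and `image` are assumptions made for illustration; the actual payload format used by the web application and the Flask API is not shown in this chapter.

```python
import base64
import json


def encode_image_payload(image_bytes: bytes, filename: str) -> str:
    """Client side: wrap raw image bytes in a JSON string."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"filename": filename, "image": encoded})


def decode_image_payload(payload: str) -> bytes:
    """Server side: recover the original bytes from the JSON string."""
    body = json.loads(payload)
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(body["image"], validate=True)


if __name__ == "__main__":
    original = bytes(range(256))  # stand-in for the bytes of a CT scan image
    payload = encode_image_payload(original, "scan.png")
    assert decode_image_payload(payload) == original
    encoded_length = len(json.loads(payload)["image"])
    print(f"{len(original)} raw bytes -> {encoded_length} base64 characters")
```

Base64 represents every 3 bytes as 4 text characters, so the encoded payload is roughly one third larger than the original image.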
The detailed description of the workflow of the frontend and backend architecture is given in sub-sections 4.4.1 and 4.5.1.

4.4 Frontend Development

4.4.1 User Interface Design

The web application was developed using Vue.js. Vue.js is a progressive JavaScript framework used to build user interfaces and single-page applications. It is known for its simplicity and flexibility, which makes it a popular choice among developers for creating dynamic and intuitive user experiences (Bielak et al., 2022). In this study, Vue.js was selected to develop the frontend due to its ability to facilitate the creation of dynamic user experiences with an efficient and predictable update mechanism.

The user interface (UI) designed for medical practitioners prioritizes clarity and ease of use. It features a clean layout with intuitive navigation, designed to minimize the cognitive load on users. The main interface includes a prominent image upload section where practitioners can easily drag and drop CT scan images or browse files from their system. Uploads are restricted to image formats for consistency and processing reliability. Once uploaded, images are encoded to base64, a step that facilitates secure and efficient image data handling.

The UI also incorporates a pop-up that displays the results of the image analysis, including the classification outcomes. This is designed with accessibility in mind, ensuring that results are comprehensible even to practitioners who may not be familiar with machine learning terminology. Interactive elements, like tooltips and modals, provide on-demand guidance throughout the platform.

4.4.2 Image Upload Functionality

The image upload functionality is an important aspect of the user interface design, enabling medical practitioners to submit CT scan images effortlessly for classification.
This functionality enables medical practitioners to upload CT scan images, which are then passed to the backend for initial pre-processing. To ensure consistency and reliability in the analysis, the system is configured to accept only specific image formats, primarily JPEG and PNG. These formats are chosen due to their widespread use and compatibility with medical imaging devices and software. JPEG is preferred for its balance between image quality and file size, making it suitable for efficiently transmitting high-resolution CT scan images without significant loss of detail. PNG is included for its lossless compression, ensuring that no image data is lost during compression, which is crucial for maintaining the integrity of medical images (Yang et al., 2021).

4.4.3 Error Handling and User Feedback

Effective error handling and user feedback mechanisms are essential components of the frontend development process, contributing to a positive user experience and mitigating frustration. In the context of this study, Vue.js enables the implementation of robust error handling mechanisms within the user interface design. The user interface employs validation techniques to ensure that only valid image files are submitted for classification, preventing errors and inconsistencies in the analysis process and ensuring data consistency and processing accuracy.

4.5 Backend Development

4.5.1 API Integration for Machine Learning Model

Flask serves as the backbone for integrating the ResNet50-based machine learning model into the system’s architecture. The Flask application exposes a RESTful API endpoint that facilitates the communication between the frontend and the machine learning model (Lathkar, 2021). When medical practitioners upload CT scan images via the Vue.js frontend, these images are sent to the Flask backend as HTTP requests.
There, they are decoded from base64 to their original format for further processing. The Flask application then processes these requests, extracting the image data and preparing it for analysis. It interfaces with the ResNet50 model, feeding the images as input and retrieving the classification result, thus ensuring a smooth, efficient transmission of data and enabling real-time analysis and feedback.

4.5.2 Handling Image Processing Requests

Upon receiving image data from the frontend, the Flask backend initiates a series of image processing steps crucial for preparing the data for the machine learning model. This includes resizing the images to the required dimensions for the ResNet50 model, normalizing the pixel values, and converting the images to the appropriate tensor format. Once the images are processed and analysed, Flask collates the classification results and sends them back to the frontend, where they are presented to the users. This workflow, facilitated by the Flask framework, provides rapid classification of lung cancer from CT scans, giving medical practitioners actionable insights derived from machine learning analysis.

Chapter 5

System Implementation and Testing

5.1 Introduction

This chapter covers the practical side of turning the theoretical design into a working system, focusing on the implementation and testing of the system developed for this research study on lung cancer classification. The chapter includes screenshots to illustrate the user interface (UI) design, showcasing the layout and features tailored for medical practitioners to upload and analyze lung CT scan images. Additionally, it presents the system’s outputs, detailing how the results of image classification are displayed to the user.

5.2 User Interface Design

The user interface of the lung cancer classification and prediction model is designed specifically for medical practitioners, hence its user-friendly layout.
Figure 5.1 shows the user interface developed in this research study.

Figure 5.1: User Interface Design

The user interface (UI) of the Lung Cancer Classification Model is designed to be efficient, ensuring that medical practitioners can navigate it with ease. The Abstract section provides a summary of the research and the model’s functionality. The interface features a text box for practitioners to input their name, which personalizes the interaction and potentially tailors the analysis to individual user needs. The Image Upload section has a Choose File button that facilitates the easy selection and upload of CT scan images, ensuring that the system is immediately ready for analysis with minimal navigation. Once the image is uploaded, the user clicks the Predict button, which initiates the classification process. The classification process then analyses the uploaded CT scan image and displays the results in the Prediction Results section.

5.3 Image Upload Process

The image upload process for the lung cancer classification model is designed to be simple and user-friendly, as shown in Figure 5.2. Medical practitioners can enter their name for a personalized experience, select a CT scan image file, and then click ’Predict’ to initiate the analysis. The system quickly processes the image and displays the results, including the type of lung cancer detected and the confidence level of the diagnosis, alongside the actual CT scan image for visual verification.

Figure 5.2: Image Upload Process

5.4 Image Upload Validation

To ensure integrity and consistency, the model allows only CT scan images to be uploaded. The code section in Figure 5.3 shows that only image files are accepted into the model.

Figure 5.3: Validating Image File Formats for Upload

By allowing only the image formats typically associated with medical imaging, the system ensures that the input data aligns with the model’s pre-trained parameters and the expected diagnostic outcomes.
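The code in Figure 5.3 is not reproduced in the text, so the following is only a hypothetical Python sketch of the same idea: accept a file when its extension is on an allow-list and its leading bytes match the JPEG or PNG signature. The function names and the allow-list are assumptions based on the formats named in Section 4.4.2, not the validation code used in this study.

```python
from pathlib import Path

# Assumed allow-list, based on the JPEG and PNG formats named in Section 4.4.2.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}

# Leading bytes ("magic numbers") that identify the two formats.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def has_allowed_extension(filename: str) -> bool:
    """Check the file extension, ignoring case."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


def has_image_signature(data: bytes) -> bool:
    """Check that the content starts like a PNG or JPEG file."""
    return data.startswith(PNG_SIGNATURE) or data.startswith(JPEG_SIGNATURE)


def is_valid_upload(filename: str, data: bytes) -> bool:
    """Accept the upload only if both the name and the content look like an image."""
    return has_allowed_extension(filename) and has_image_signature(data)


if __name__ == "__main__":
    print(is_valid_upload("scan.PNG", PNG_SIGNATURE + b"..."))
    print(is_valid_upload("report.pdf", b"%PDF-1.7"))
```

Checking the content as well as the name guards against a renamed non-image file, although neither check can tell whether an image is actually a CT scan.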
5.5 Classification of Results

Once a medical practitioner uploads an image and initiates the analysis by clicking the Predict button, the model processes the image and the results are displayed in the Prediction Results area of the interface, as shown in Figure 5.2. The results are conveyed clearly, showing both the type of lung cancer detected (adenocarcinoma in the example shown in Figure 5.2) and the probability associated with the prediction, indicated as 99.93%. Additionally, the results section includes a brief description of the identified cancer type, offering essential information on the particular sub-type of non-small cell lung cancer (NSCLC) diagnosed. A timestamp is also provided, which can be crucial for record-keeping and auditing processes.

Chapter 6

Discussion of Results

6.1 Introduction

The objective of this research was to develop a Convolutional Neural Network model for lung cancer classification through CT scan analysis by fine-tuning the ResNet50 pre-trained model. This chapter provides a detailed discussion of the results obtained from the study, which includes examination of the findings, interpretation of outcomes, and a comprehensive analysis of the data. Training and validation accuracy are quite close throughout the training process, suggesting that the model’s predictions should be reliable when used in real-world scenarios, such as helping doctors to diagnose lung cancer types from CT scans.

6.2 Image Pre-processing

6.2.1 Normalization

The CT scan images were normalized. Normalization ensures that the input image has a similar distribution to the images that the network was originally trained on, which helps the model to converge faster and perform better. A research study by Li et al. (2021) highlights feature normalization and data augmentation techniques for image classification tasks. It concludes that normalization is commonly applied when working with models that have been pre-trained.
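The normalization constants used in this study are not stated in this section. As a minimal sketch, the example below assumes the per-channel ImageNet mean and standard deviation that are conventionally paired with pre-trained ResNet models, and applies them to a single RGB pixel; `normalize_pixel` is a hypothetical helper, not the study's pre-processing code.

```python
from typing import Sequence, Tuple

# Assumed constants: the ImageNet per-channel statistics commonly used with
# pre-trained ResNet models. The values used in this study are not stated.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(
    rgb: Sequence[int],
    mean: Sequence[float] = IMAGENET_MEAN,
    std: Sequence[float] = IMAGENET_STD,
) -> Tuple[float, ...]:
    """Scale 8-bit channel values to [0, 1], then standardize each channel."""
    if len(rgb) != len(mean) or len(rgb) != len(std):
        raise ValueError("rgb, mean and std must have the same number of channels")
    return tuple((value / 255.0 - m) / s for value, m, s in zip(rgb, mean, std))


if __name__ == "__main__":
    print(normalize_pixel((0, 0, 0)))        # darkest possible pixel
    print(normalize_pixel((255, 255, 255)))  # brightest possible pixel
```

Applied to every pixel, this maps the image intensities onto the scale the pre-trained weights were fitted to, which is the purpose of normalization described above.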
6.2.2 Resizing

Resizing images to a consistent size of 224x224 pixels is essential for uniformity, ensuring that each input fed into the Convolutional Neural Network (CNN) is standardized. It also ensures compatibility with the pre-trained ResNet50 model (Koonce and Koonce, 2021), which is optimized for images of this dimension, allowing for the effective application of the model’s pre-learned weights and biases to the image dataset used in this research study.

6.2.3 CT-Scan Images Exploration

The dataset of CT scan images consisted of four distinct classes: normal lung, adenocarcinoma, squamous cell carcinoma, and large cell carcinoma. The dataset was partitioned into three subsets: 70% for training, 15% for testing, and 15% for validation. The images were converted to tensors. Tensors serve as the standard input format for deep learning models, enabling the efficient handling and manipulation of multi-dimensional data arrays within the neural network framework. Research studies such as that of Panagakis et al. (2021) highlight the scalability benefits and techniques of converting images to tensors in computer vision tasks. The output of the data pre-processing is illustrated in Figure 6.1.

Figure 6.1: CT scan images of four distinct classes: normal lung, adenocarcinoma, squamous cell carcinoma, and large cell carcinoma types.

6.3 Model Analysis and Performance Metrics

6.3.1 Summary of Model Performance

As a result of this fine-tuning process, the model demonstrated a notable performance, achieving a train loss of 0.001232% and an accuracy of 98.86%, with the peak validation accuracy reaching 88.4% at epoch 171. The entire training process took 1016 minutes and 35 seconds, reflecting the extensive learning and adaptation the model underwent to achieve these results.
The summary of the performance metrics of the lung cancer classification model is shown in Table 6.1 for training performance and Table 6.2 for validation performance below:

Table 6.1: Training Performance Metrics

Metric                  Value
Epoch                   117
Epoch Time (seconds)    8.830657
Accuracy (%)            95.7586
Loss                    0.1292
Recall (%)              88.9346
Precision (%)           90.7869
F1 Score (%)            89.6074

Table 6.2: Test Performance Metrics

Metric                  Value
Accuracy (%)            87.5000
Loss                    1.1573
Recall (%)              78.2991
Precision (%)           80.9746
F1 Score (%)            77.4778

6.3.2 Training and Validation Loss

Figure 6.2 illustrates the training and validation loss of the model across 200 epochs.

Figure 6.2: Training and Validation Loss

The training loss shows a sharp initial decrease and then levels off to a stable, low value, which suggests that the model is effectively learning from the training dataset without overfitting. The validation loss also decreases and remains low, indicating that the model generalizes well to new, unseen data.

6.3.3 Accuracy

Figure 6.3 depicts the accuracy of the lung cancer classification model over 200 epochs.

Figure 6.3: Accuracy

The accuracy on the training data (the images the model learned from) increases sharply at first and then levels off, remaining high. Validation accuracy also rises quickly, then fluctuates somewhat but generally maintains an upward trend.

6.3.4 Precision

In the precision curves shown in Figure 6.4, both the training and validation precision start at similar levels and increase over time, with the training precision consistently higher than the validation precision. The curves suggest the model is reliable in identifying true cases of lung cancer without many false positives. Precision represents the model’s ability to correctly label an image as cancerous, and the high values imply a high trustworthiness of positive predictions by the model.

Figure 6.4: Precision

6.3.5 Recall

For recall, shown in Figure 6.5, the results show an increase in performance over time, eventually leveling off.
The recall measures how well the model identifies all actual cases of lung cancer. The graph indicates that as the model is trained, it becomes better at detecting the majority of lung cance