SU+ @ Strathmore 

University Library 
 

Electronic Theses and Dissertations 

 
This work is availed for free and open access by Strathmore University Library.  

It has been accepted for digital distribution by an authorized administrator of SU+ @Strathmore University. 

For more information, please contact library@strathmore.edu 

 
2024 

 
Utilizing Convolution Neural Networks for 

enhanced lung cancer classification 

through CT scan analysis. 
 

Korir, Phylis Jepchumba
Strathmore Institute of Mathematical Sciences 
Strathmore University 

 
Recommended Citation 

Korir, P. J. (2024). Utilizing Convolution Neural Networks for enhanced lung cancer classification through CT 

scan analysis [Strathmore University]. http://hdl.handle.net/11071/15648 

 


Utilizing Convolution Neural Networks for Enhanced Lung

Cancer Classification Through CT Scan Analysis

Korir, Phylis Jepchumba

Submitted in partial fulfilment of the requirements for the degree of

Master of Science in Data Science and Data Analytics

Strathmore Institute of Mathematical Sciences

Strathmore University

Nairobi, Kenya

June 2024

This thesis is available for Library use through open access on the understanding that it is copyright
material and that no quotation from the thesis may be published without proper acknowledgement.


Declaration
I declare that this work has not been previously submitted and approved for award of a degree

by this or any other University. To the best of my knowledge and belief, the thesis contains

no material previously published or written by another person except where due reference is

made in the thesis itself.

© No part of this thesis may be reproduced without the permission of the author and

Strathmore University.

Name: Korir Phylis Jepchumba

Signature:

Date: April 4, 2024

Approval

The thesis of Korir Phylis Jepchumba was reviewed and approved by the Supervisors.

Prof. Dr. Javier Serrano

Autonomous University of Barcelona.

Signature:

Date: April 4, 2024

Prof. Samuel M. Mwalili

Strathmore University.

Signature:

Date: April 4, 2024



Abstract
Lung cancer is the leading cause of cancer mortality, and its accurate and timely diagnosis
remains a significant challenge, especially in resource-constrained regions like Kenya. The
traditional method of diagnosing lung cancer through Computed Tomography (CT) scans
often involves manual interpretation, leading to potential delays and inaccuracies. This
research aims to harness the power of Artificial Intelligence (AI) to improve the diagnostic
process. The study developed a Convolution Neural Network (CNN) model for enhanced
classification of lung cancer from CT scan images by fine-tuning the pre-trained ResNet50
architecture. Using PyTorch, a leading deep learning framework for computer vision, the
model was trained on a curated dataset from the public Lung Image Database Consortium
(LIDC), a medical imaging database for the development, training, and evaluation of
computer-assisted diagnostic (CAD) methods for lung cancer detection and diagnosis.
The collected CT scan images cover several types of lung cancer, such as adenocarcinoma,
squamous cell carcinoma, and large cell carcinoma, as well as normal tissue. Data
pre-processing techniques such as resizing, normalization, conversion, and data augmentation
were used to ensure compatibility with the pre-trained model. The model’s performance was
evaluated with a range of metrics, demonstrating an accuracy of 87.5%, precision of 80.97%,
and an F1 score of 77.4%. These results indicate a promising capability for the model
to accurately classify types of lung cancer, supporting its potential use in clinical settings.
The trained model was then integrated into a web-based application using the Flask
framework, with a frontend designed with Vue.js to provide an intuitive user experience for
image upload. The Flask API facilitates communication between the frontend and the
ResNet50-based machine learning model. When a CT scan image is uploaded, it is sent to
the Flask backend as an HTTP request. The Flask application processes the request,
extracts the image data, and prepares it for analysis by the ResNet50 model, which
classifies the image and returns the result.



Table of contents

List of figures
List of tables
List of abbreviations
Acknowledgement
Dedication

1 Introduction
1.1 Background to the Study
1.1.1 Introduction to Lung cancer
1.1.2 Medical Imaging in lung Cancer Diagnosis
1.1.3 Challenges in CT scan Analysis
1.1.4 Role of Artificial Intelligence in healthcare
1.2 Statement of the Problem
1.3 Research Objectives
1.3.1 General Objective
1.3.2 Specific Objectives
1.4 Research Questions
1.5 Significance of the Study
1.6 Scope of the Study
1.7 Limitations of the Study

2 Literature Review
2.1 Introduction
2.2 Theoretical Review
2.2.1 Theory of Convolutional Neural Networks (CNNs)
2.2.2 Theory of Pattern Recognition and Feature Extraction
2.2.3 Transfer Learning and Fine-Tuning Theory
2.3 Empirical review
2.3.1 Related works
2.4 Research Gaps

3 Methodology
3.1 Introduction
3.2 Data Understanding
3.3 Image Pre-processing
3.4 Convolution Neural Network Architecture
3.4.1 Convolution layers
3.4.2 Filters/Kernels
3.4.3 Stride and Padding
3.4.4 Activation Function
3.4.5 Pooling layers
3.4.6 Fully connected layer
3.5 Proposed Convolutional Neural Network: ResNet50
3.6 Model Training
3.6.1 Loss Function
3.6.2 Loss Optimization
3.7 Evaluation of Model Performance
3.8 Model Deployment

4 System Design and Architecture
4.1 Introduction
4.2 System Requirements
4.2.1 Model API Requirements
4.3 Overview of System Architecture
4.4 Frontend Development
4.4.1 User Interface Design
4.4.2 Image Upload Functionality
4.4.3 Error Handling and User Feedback
4.5 Backend Development
4.5.1 API Integration for Machine Learning Model
4.5.2 Handling Image Processing Requests

5 System Implementation and Testing
5.1 Introduction
5.2 User Interface Design
5.3 Image Upload Process
5.4 Image Upload Validation
5.5 Classification of Results

6 Discussion of Results
6.1 Introduction
6.2 Image Pre-processing
6.2.1 Normalization
6.2.2 Resizing
6.2.3 CT-Scan images Exploration
6.3 Model Analysis and Performance Metrics
6.3.1 Summary of Model Performance
6.3.2 Training and Validation Loss
6.3.3 Accuracy
6.3.4 Precision
6.3.5 Recall
6.3.6 F1 Score
6.4 System Design and Deployment

7 Conclusions, Recommendations and Future Work
7.1 Conclusion
7.2 Recommendations
7.3 Future Works

References

Appendix A Ethical approval
Appendix B Similarity Index
Appendix C Python code
C.1 Model Preparation Code
C.2 API Code
C.3 User Interface/Frontend Code


List of figures

Figure 3.1: Dataset Images
Figure 3.2: Overview of CNN Architecture
Figure 3.3: ReLu Activation Function
Figure 4.1: Overview of System Architecture
Figure 5.1: User Interface Design
Figure 5.2: Image Upload Process
Figure 5.3: Validating Image File Formats for Upload
Figure 6.1: CT scan images of four distinct classes: normal lung, adenocarcinoma, squamous cell carcinoma, and large cell carcinoma types
Figure 6.2: Training and Validation Loss
Figure 6.3: Accuracy
Figure 6.4: Precision
Figure 6.5: Recall
Figure 6.6: F1 Score


List of tables

Table 3.1: Parameters of the ResNet50 Model for Lung Cancer Classification
Table 6.1: Training Performance Metrics
Table 6.2: Test Performance Metrics


List of abbreviations
AI      Artificial Intelligence
API     Application Programming Interface
CT      Computed Tomography
CNN     Convolution Neural Network
DL      Deep Learning
PET     Positron Emission Tomography
MRI     Magnetic Resonance Imaging
ResNet  Residual Network
ReLU    Rectified Linear Unit
LIDC    Lung Image Database Consortium
CAD     Computer-Assisted Diagnostic


Acknowledgement
First and foremost, I offer my sincerest gratitude to God for His unwavering guidance and

strength throughout this journey. It is through His grace that I have been able to complete

this project.

I am deeply grateful to my esteemed supervisor and lecturer, Dr. John Olukuru, for his
guidance, valuable advice, and oversight throughout

this journey. I would also like to convey my heartfelt appreciation to my supervisor, Prof.

Dr. Javier Serrano, for his important support and supervision throughout this research. His

experience and insights have helped shape the direction and success of my research.

My gratitude extends to Strathmore University, particularly the Institute of Mathematical

Sciences (SIMS), and the @iLabAfrica research centre. Their commitment to delivering

world-class education and imparting practical and technical knowledge has been essential in

enriching my academic and professional growth.



Dedication
This thesis is dedicated to my parents, who were instrumental in molding my educational

path. Their steadfast support and motivation throughout my early years were the pillars of

my academic success. I would like to give heartfelt thanks to my mother, whose limitless

love, support, and prayers were the driving forces behind my persistence. Additionally, I am

profoundly grateful to my supervisor, Prof. Dr. Javier Serrano, for his invaluable guidance,

and to my cousin Peter Kosgei, who has been a relentless source of inspiration and

encouragement during this academic journey.



Chapter 1

Introduction

1.1 Background to the Study

1.1.1 Introduction to Lung cancer

Lung cancer is one of the most prevalent and deadly forms of cancer worldwide (Thandra

et al., 2021), primarily caused by the uncontrolled growth of abnormal cells in lung tissues.

This malignancy arises due to several risk factors, with cigarette smoking being the most prominent

contributor, accounting for a significant majority of cases (Bade and Cruz, 2020). Exposure to

secondhand smoke, occupational carcinogens like asbestos and radon, genetic susceptibility,

and preexisting lung conditions also contribute to its onset. Emerging primarily in the cells of

the lungs, lung cancer is the second most commonly diagnosed cancer globally. The disease

encompasses two main types: non-small cell lung cancer (NSCLC) and small cell lung cancer

(SCLC). NSCLC, including subtypes like adenocarcinoma, squamous cell carcinoma, and

large cell carcinoma, constitutes the majority of cases (Yin et al., 2022).

From a global perspective, lung cancer’s impact on public health is substantial. According to

the World Health Organization (WHO), lung cancer accounted for 11.4% of all new cancer
cases globally in 2020, and it was responsible for 18.0% of all cancer-related deaths, making

it the leading cause of cancer mortality. Leveraging advanced technologies like Convolution

Neural Networks (CNNs) for CT scan analysis could potentially lead to higher early detection

rates and reduced lung cancer-related mortality (Sung et al., 2021). As medical imaging data

availability and computational capabilities grow, integrating artificial intelligence techniques

becomes increasingly pertinent in refining lung cancer diagnosis.



In Kenya, as in many other low- and middle-income countries, lung cancer poses specific

challenges due to limited resources and access to advanced medical facilities. The Kenya

National Cancer Control Strategy 2017-2022 highlights the escalating concern of cancer

in the country, with lung cancer being a significant contributor to cancer-related mortality

(Schaefers et al., 2022). Implementing innovative technologies such as Convolutional Neural
Networks in CT scan analysis offers the potential to enhance diagnostic accuracy, especially

in regions with constrained resources. By automating the analysis process, the CNN model

could aid medical professionals in identifying potential lung cancer cases earlier, potentially

leading to improved survival rates.

1.1.2 Medical Imaging in lung Cancer Diagnosis

A computed tomography (CT) scan is a medical imaging procedure that uses X-rays and

powerful computer technology to produce detailed cross-sectional images of the

body’s internal components (Cui et al., 2023). These images, often referred to as slices or

sections, offer a comprehensive view of the anatomical features within the scanned region.

CT scans play an essential role in the detection, diagnosis, and monitoring of various medical

conditions, including lung cancer (Al-Sharify et al., 2020).

In the context of lung cancer, the primary functions of CT scans are early detection, detailed

visualization of the lungs, and the identification of suspicious nodules or masses that may

indicate the presence of cancer, even at its earliest stages. Additionally, once lung cancer

is suspected or confirmed, CT scans are instrumental in staging the disease, precisely

determining the extent of cancer spread within the lungs and its potential involvement

with nearby lymph nodes or other organs (Farsad, 2020).

Radiologists, as highly trained medical professionals with expertise in interpreting medical

images, carefully review CT scan images of the chest for any abnormalities, particularly in

the lung parenchyma (lung tissue). They look for pulmonary nodules, which are small, round,

or oval-shaped abnormalities in the lung tissue, varying in size and appearance. They also
compare current CT scans with previous ones, if available, to detect changes in the size or
appearance of nodules or the development of new ones. Radiologists generate detailed reports
based on their findings, including the location and characteristics of any nodules or lesions,
which guide informed decisions regarding patient care.

1.1.3 Challenges in CT scan Analysis

The reliance on human expertise for CT scan analysis in lung cancer diagnosis poses several

significant issues. Radiological evaluations are inherently subjective, as they depend on

the expertise and experience of the radiologist. This subjectivity can lead to variations in

diagnoses, potentially impacting patient outcomes. Different radiologists may interpret

the same CT scan differently, introducing variability in the assessment of lung nodules

and cancerous lesions. Furthermore, manual interpretation of CT scans is susceptible to

human error, which can have dire consequences in cancer diagnosis. Fatigue, distractions, or

cognitive biases may influence the radiologist’s judgment. Errors in judgment can result in

missed or delayed cancer diagnoses, leading to late-stage detection and reduced treatment

efficacy. The manual analysis of CT scans also involves sending scans to centralized facilities

or external experts for interpretation, incurring additional time and cost burdens. The need

for expert radiologists and the associated costs can strain healthcare budgets and limit

accessibility to quality diagnostics. Consistency and precision are paramount in lung cancer

diagnosis, and ensuring consistent and accurate results across various healthcare facilities is

a significant challenge. To address these challenges effectively, there is a compelling need
for innovative solutions such as Convolutional Neural Networks. These neural networks excel in
image recognition and classification tasks, making them highly suitable for CT scan analysis.

1.1.4 Role of Artificial Intelligence in healthcare

Artificial intelligence (AI) is playing an increasingly significant role in transforming
healthcare across the globe (Secinaro et al., 2021). In the context of Kenya and other developing
countries, AI is making significant contributions to various aspects of healthcare delivery.
AI-powered solutions help healthcare professionals in diagnosing diseases, managing patient
data, optimizing treatment plans, and improving healthcare outcomes. This technology is
particularly relevant in resource-constrained settings like Kenya, where it has the potential to
bridge gaps in healthcare access and enhance the quality of medical services.

The use of AI in medical imaging is on the rise, revolutionizing the field of radiology and

diagnostics in Kenya and beyond (Mwaniki et al., 2023). AI algorithms can evaluate medical
images with high accuracy, identifying anomalies, lesions, and patterns. AI-driven image analysis

provides real-time assistance to radiologists and clinicians, enabling faster and more precise

diagnoses. Therefore, integrating this technology into Kenyan healthcare facilities can

improve the accessibility of expert diagnostics.

The integration of AI into medical diagnostics offers several potential benefits, including

enhanced diagnostic accuracy by recognizing intricate patterns and subtle anomalies in

medical images, improved efficiency by processing vast amounts of medical data, timely

diagnosis and treatment planning, and the ability to optimize healthcare resource allocation.

Ultimately, AI can reduce costs by minimizing unnecessary tests and streamlining workflows,

making it a valuable asset for healthcare systems, particularly in resource-limited settings

(Panayides et al., 2020). CNNs have found extensive applications in medical image analysis,

offering significant advancements in diagnosis and treatment. They are commonly used

for tasks such as segmentation, classification, and detection in various medical imaging

modalities.

1.2 Statement of the Problem

The timely and accurate diagnosis of cancer is a pivotal factor in shaping effective treatment

strategies and enhancing patient survival rates. In this context, medical imaging techniques,

particularly Computed Tomography (CT) scans, have emerged as indispensable tools for

detecting, diagnosing and accurately staging various forms of cancer. However, the interpretation
of CT scans demands a high degree of expertise, often leading to delays in diagnosis and
subsequent treatment. This challenge is particularly acute in resource-constrained settings, such as
many hospitals in Kenya, where a shortage of proficient radiologists and oncologists prevails.



Inadequate access to specialized medical personnel and the manual nature of CT scan analysis

contribute to significant diagnostic delays and potential errors. The existing workflow often

involves sending scans to centralized facilities or external experts for interpretation, causing

additional time and cost burdens; this underscores the need for an automated, precise and

efficient method to diagnose lung cancer using CT scans. Several studies have demonstrated

the effectiveness of automated detection techniques utilizing artificial intelligence (AI) and

machine learning methods in identifying lung nodules and detecting lung cancer (Asuntha

and Srinivasan, 2020). However, significant challenges persist, particularly in the adoption

and implementation of such cutting-edge technologies within the Kenyan healthcare system.

Addressing these challenges is crucial to enable the timely and accurate diagnosis of lung

cancer, thereby improving patient outcomes and mitigating the burden posed by this deadly

disease.

Despite significant progress in the field of lung cancer classification and detection through

deep learning techniques, there has been a notable gap in translating these advancements into

practical applications within the healthcare industry. This study seeks to bridge this gap by

not only developing a state-of-the-art Convolutional Neural Network (CNN) model through

the fine-tuning of the pre-trained ResNet50 architecture but also by integrating this model

into a user-friendly web framework. This will create a robust, accessible tool that can be

directly incorporated into healthcare systems, thereby enhancing the diagnostic process with

the precision and efficiency offered by advanced machine learning technologies.

1.3 Research Objectives

1.3.1 General Objective

To develop a Convolutional Neural Network (CNN) model by fine-tuning the pre-trained
ResNet50 architecture to effectively recognize and classify various types of lung cancer by
analyzing Computed Tomography (CT) scans.



1.3.2 Specific Objectives

i. To conduct a comprehensive review of existing deep learning models applied to medical

image analysis, specifically focusing on their application with CT scans.

ii. To design and deploy a Convolutional Neural Network model for lung cancer classification
using CT scans.

iii. To evaluate and validate the performance of the developed CNN-based lung cancer
classification model.

1.4 Research Questions

i. What are the key strengths and limitations associated with various existing deep

learning architectures when applied to medical image analysis?

ii. What architectural modifications and feature extraction techniques are necessary to

ensure that the CNN model can accurately distinguish between healthy lung tissues

and varying stages of lung cancer?

iii. How does the CNN-based model compare to existing methods in terms of its diagnostic

accuracy and potential clinical relevance for lung cancer classification using CT scans?

1.5 Significance of the Study

The study holds paramount significance for health practitioners by offering a transformative

advancement in their diagnostic capabilities. This technology has the potential to revolutionize
how practitioners interpret and analyze medical images, providing them with an

intelligent tool that aids in the early-stage detection and accurate classification of lung cancer

types.



The research makes a substantial contribution to the United Nations Sustainable Development

Goal (SDG) of Good Health and Well-being (SDG 3). By utilizing cutting-edge AI-driven

technologies to enhance the effectiveness of diagnosing lung cancer, this study directly

addresses the need for accessible and affordable healthcare solutions, particularly in resource-

limited settings.

Additionally, the research contributes to the broader advancement of artificial intelligence in

healthcare. As AI technologies continue to evolve, their integration into medical practice

can optimize resource allocation and alleviate the burden on healthcare professionals. By

introducing a successful CNN-based model, this study could inspire further research and

development in AI-assisted medical imaging, shaping the future of healthcare by combining

human expertise with computational power.

1.6 Scope of the Study

This study focuses on fine-tuning the ResNet50 Convolutional Neural Network model for the
classification of various lung cancer types, utilizing CT scan image data sourced from the
Lung Image Database Consortium (LIDC), a medical imaging database, and the Mendeley
database. The study primarily encompasses four categories: normal lung tissue, squamous cell
carcinoma, large cell carcinoma, and adenocarcinoma. The analysis covers key machine
learning steps such as data pre-processing, which ensures uniformity; the model was then
trained, validated, and evaluated for accuracy, sensitivity, specificity, and F1-score. The
model was deployed as a Flask web application that serves as a user interface and allows
healthcare professionals and other users to interact with the AI-based tool seamlessly. This
research aims to contribute to more precise lung cancer diagnosis and improved patient
outcomes by providing a reliable AI-based tool for lung cancer classification.
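To illustrate the evaluation measures named in this scope, the following minimal Python sketch shows how accuracy, sensitivity, specificity, and F1-score could be derived from a multi-class confusion matrix using a one-vs-rest view of each class. The function names and the toy confusion matrix are assumptions made purely for illustration; they are not the code or the results of this study.

```python
from typing import Dict, List

Matrix = List[List[int]]


def accuracy(cm: Matrix) -> float:
    """Fraction of all images whose predicted class equals the true class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total


def per_class_metrics(cm: Matrix) -> List[Dict[str, float]]:
    """One-vs-rest metrics; cm[i][j] counts images of true class i predicted as j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]                              # class k predicted as k
        fn = sum(cm[k]) - tp                       # class k predicted as something else
        fp = sum(cm[i][k] for i in range(n)) - tp  # other classes predicted as k
        tn = total - tp - fn - fp                  # everything not involving class k
        precision = tp / (tp + fp) if tp + fp else 0.0
        sensitivity = tp / (tp + fn) if tp + fn else 0.0  # also called recall
        specificity = tn / (tn + fp) if tn + fp else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        results.append({
            "precision": precision,
            "sensitivity": sensitivity,
            "specificity": specificity,
            "f1": f1,
        })
    return results


def macro_average(metrics: List[Dict[str, float]], key: str) -> float:
    """Unweighted mean of one metric over all classes."""
    return sum(m[key] for m in metrics) / len(metrics)


# Hypothetical counts for four classes (rows: true class, columns: predicted class).
# These numbers are invented for the example and are NOT results from this study.
TOY_CM = [
    [20, 2, 1, 2],
    [3, 15, 0, 2],
    [0, 1, 24, 0],
    [2, 2, 0, 16],
]

if __name__ == "__main__":
    metrics = per_class_metrics(TOY_CM)
    print("accuracy:", round(accuracy(TOY_CM), 3))
    for name in ("precision", "sensitivity", "specificity", "f1"):
        print("macro", name + ":", round(macro_average(metrics, name), 3))
```

Macro-averaging is used in this sketch so that each class contributes equally regardless of how many images it contains; a weighted average would be an equally reasonable choice when class sizes differ considerably.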



1.7 Limitations of the Study

While the proposed study offers promising insights into enhancing lung cancer diagnosis, it
is essential to recognize key limitations that could influence its scope, generalizability,
and depth of impact. The effectiveness of the CNN model relies on the diversity and richness
of the dataset used for training and validation. A constrained dataset, representing a limited
range of lung cancer stages, subtypes, and demographic profiles, can restrict the model’s
ability to capture the complex variations in real-world cases, leading to reduced accuracy
and limited applicability.

Another critical consideration pertains to the limitations posed by available resources, both

in terms of computational capabilities and access to high-quality medical data. Training

deep learning models, such as CNNs, requires substantial computational power and memory,

particularly when dealing with large-scale medical image datasets. Limited access to high-

performance computing resources might hinder the study’s progress and limit the scale of

experimentation.



Chapter 2

Literature Review

2.1 Introduction

This chapter discusses the foundational literature on utilizing CNNs for enhanced lung cancer
classification through CT scan analysis. It also discusses the key theories that support this
study, an empirical review of related works, and the research gaps.

2.2 Theoretical Review

This section discusses the key theories that support this study. The anchor theory for this

study is the Theory of Convolutional Neural Networks (CNNs). The other supporting theories

discussed in this section include the Theory of Pattern Recognition and Feature Extraction

and the Transfer Learning and Fine-Tuning Theory.

2.2.1 Theory of Convolutional Neural Networks (CNNs)

The Theory of Convolutional Neural Networks (CNNs) was initially proposed by Kunihiko

Fukushima in 1980, with his creation of the "neocognitron," a hierarchical, multilayered

artificial neural network (Fukushima, 2020). However, the modern form of CNNs, especially

their application in deep learning, was significantly advanced by Yann LeCun in the late 1980s

and early 1990s. His work (Bengio et al., 2021), particularly on the application of backpropagation

algorithms to neural networks, laid the groundwork for the development of CNNs as they are

known today.

The theory behind CNNs involves understanding their unique structure and functionality.

CNNs are designed to automatically and efficiently extract and learn features from images.

This is accomplished by three types of layers: convolutional layers for filtering

and generating feature maps, pooling layers for dimensionality reduction of feature maps and

fully connected layers for classification. The key principle is that these networks can identify

intricate patterns in image data, allowing for the effective recognition and classification of

images (Yamashita et al., 2020). The primary assumption of CNNs is that spatial hierarchies

of features can be learned from image data. This means that the network assumes that certain

patterns or characteristics (like textures, edges, and shapes) can be detected and used to

understand more complex structures within the image (Yamashita et al., 2020).

Criticism of CNNs often revolves around their "black box" nature, where the decision-

making process can be opaque and difficult to interpret. This is a significant issue in medical

applications, where understanding the rationale behind a diagnosis or classification is crucial

(Hassija et al., 2024). Another criticism is their dependency on substantial quantities of

labeled data for training, which is a challenge in specialized fields like medical imaging

(Castiglioni et al., 2021). This theory is highly relevant for this study because CNNs’

ability to discern intricate patterns in image data makes them well-suited for identifying the

often subtle and complex signs of lung cancer in CT scans. They can differentiate benign

and malignant lesions, which is a critical task in lung cancer diagnosis. The hierarchical

feature learning of CNNs allows them to recognize cancerous patterns that might not

be discernible to human vision or conventional image processing methods. However, the

challenges highlighted by the criticisms need to be addressed, particularly in ensuring the

interpretability of the CNN models and their decisions, given the critical nature of medical

diagnostics.

2.2.2 Theory of Pattern Recognition and Feature Extraction

This theory is a multidisciplinary concept from the fields of computer vision, machine

learning, and neural networks (Paolanti and Frontoni, 2020). While no single individual

can be credited with its inception, notable contributions have been made by researchers like

Yann LeCun, Geoffrey Hinton, and Yoshua Bengio (Bengio et al., 2021). Their work in the

late 20th and early 21st centuries, particularly on deep learning and neural networks, has

significantly shaped our understanding of pattern recognition and feature extraction (Fieguth,

2022).

The theory posits that CNNs are capable of recognizing patterns and extracting features from

complex datasets, such as medical images. This is achieved through the network’s ability to

automatically learn hierarchical feature representations from data. For instance, in the context

of lung CT scans, CNNs can learn to identify features ranging from basic textures and shapes

to more complex patterns indicative of pathological changes. The primary assumption of this

theory is that meaningful and detectable patterns exist in the data, and these can be learned

and recognized by the CNN. It assumes that these patterns are representative of underlying

phenomena (like the presence of cancerous nodules) and that they can be distinguished from

irrelevant variations in the image data (Abdar et al., 2021).

One significant criticism of this approach is its reliance on the quality and quantity of the

training dataset. If the training dataset is not representative, the CNN model may learn incorrect

or misleading patterns. Additionally, there is the challenge of interpretability; understanding

what specific features the CNN is identifying and why it makes certain classifications can

be difficult, which is a crucial consideration in medical applications where decision-making

needs to be transparent and justifiable (Alzubaidi et al., 2021).

For lung cancer classification through CT scans, this theory is highly relevant. The ability

of CNNs to discern and learn from complex patterns in medical images makes them well-

suited for identifying and differentiating between cancerous and non-cancerous tissues. This

capability is crucial in detecting lung cancer, where the early identification of malignant

nodules can significantly impact patient outcomes. However, the challenges of ensuring

that the model is trained on comprehensive and representative data, as well as making the

model’s decision-making process understandable and reliable, are essential considerations

for applying this theory in this research.

2.2.3 Transfer Learning and Fine-Tuning Theory

Transfer Learning and Fine-Tuning Theory was developed from the work of Krizhevsky

et al. (2023), particularly with the development of AlexNet in 2012. They demonstrated the

effectiveness of deep learning and, by extension, the potential of transfer learning in complex

tasks. The concept of Transfer Learning and Fine-Tuning in the context of Convolutional Neural

Networks (CNNs) represents a significant advancement in the field of machine learning and

has important implications for medical imaging, including lung cancer classification.

The central idea of transfer learning is that a model developed for one task can be repurposed

as the starting point for another model in a related task. It entails fine-tuning a CNN on

a more specialized dataset after it has been pre-trained on a larger, more generic dataset

(such as ImageNet). The theory is grounded in the belief that the features learned by the

network on the first task can be applicable and beneficial for performance on the second task

(Tripuraneni et al., 2020).

A key assumption in transfer learning is that the high-level features learned in the initial task

are generalizable and can be effectively applied to different but related problems. In the case

of medical imaging, this assumes that features learned from general images are relevant and

useful for analyzing medical scans.

Criticism of transfer learning centers around its effectiveness when there is a significant

discrepancy between the source (initial) and target (new) tasks. Some scholars, such as Bengio

et al. (2011), have pointed out that the effectiveness of transfer learning can diminish when

the new task is too divergent from the original task. Additionally, there can be issues related

to overfitting, particularly if the fine-tuning dataset is small or not diverse enough.

Transfer learning is particularly well-suited for lung cancer classification through CT scans.

Given the limited availability of large, annotated medical imaging datasets, being able to

leverage a pre-trained ResNet50 CNN allows for a more robust and nuanced understanding

of the complex patterns present in medical scans. Fine-tuning the pre-trained network on

specific lung CT scan datasets can lead to more accurate and effective classification of lung

cancer signs. However, care must be taken to ensure the pre-training and fine-tuning phases

are well-aligned in terms of the features relevant for lung cancer detection, addressing the

concerns raised in criticisms of transfer learning (Salehi et al., 2023).

2.3 Empirical review

This section comprehensively explores related studies, methodologies employed, key findings,

and existing gaps in the field of classifying lung cancer types through CT scan analysis.

2.3.1 Related works

Fallahpoor et al. (2023) undertook a comprehensive review to explore the application of

deep learning (DL) techniques in positron emission tomography and computed tomography

imaging. The study was motivated by the growing use of positron emission tomography

and computed tomography in fields like oncology and neurology, and the challenges in

manual image interpretation due to its time-consuming nature and requirement for extensive

disease-specific knowledge. The review focuses exclusively on the use of AI, specifically

DL, in analysing positron emission tomography and computed tomography images, filling

a gap in the existing literature, which covered multiple imaging modalities and various artificial intelligence

applications broadly. The researchers analysed 99 studies published between 2017 and

2022, identifying effective DL models and pre-processing algorithms for positron emission

tomography and computed tomography imaging, while also pointing out current constraints

such as insufficient annotated datasets and challenges in model explainability. This study

emphasizes the potential of DL in improving lesion detection, tumour segmentation, and

disease classification in PET/CT imaging, but it does not delve into the direct application

of these techniques in lung cancer classification through CT scan analysis, which is a more

specific and distinct application area.

Thanoon et al. (2023) conducted an extensive review of deep learning techniques for lung

cancer screening and diagnosis using CT images. Lung cancer, one of the deadliest diseases

worldwide, demands early identification for improving patient survival rates, with

CT imaging being a key modality for screening and diagnosis. The review looked into an

array of deep learning (DL) techniques developed for this purpose, focusing on two primary

methodologies: classification and segmentation. It critically examines the advantages and

limitations of current deep learning models in this context. The study demonstrates that deep

learning methods hold significant potential for precise and effective lung cancer screening

and diagnosis using CT scans. The review also provides insights into potential future

enhancements in the application of deep learning, aiming to advance computer-assisted lung

cancer diagnosis systems. This comprehensive analysis serves as a valuable resource for

understanding the current state and future prospects of DL in lung cancer detection through

CT imagery.

Wang (2022) explored the use of deep learning in diagnosing lung cancer, highlighting recent

advancements, challenges, and future directions in the field. It emphasizes the critical role

of medical imaging tools like CT, MRI, PET, and X-ray in early lung cancer detection, but

also notes their limitations such as high false positives and inability to automatically classify

cancer images. Wang’s research underscores the potential of deep learning to enhance lung

nodule detection and classification accuracy. The study reviews existing medical imaging

methods and deep learning-based techniques, focusing on image pre-processing, dataset

analysis, lung image segmentation, nodule detection, and classification. However, it also

identifies significant challenges, such as the need for extensive annotated image datasets and

the interpretability of deep learning models. Looking forward, the paper suggests directions

for future research, including the development of new network structures, incorporation of

clinical knowledge into model training, and enhancement of model robustness. The study

concludes that deep learning could potentially outperform radiologists in specific tasks, but

also acknowledges the need to address legal, privacy, and clinical verification challenges for

widespread clinical application.

The study by Nageswaran et al. (2022) explored the classification and prediction of lung

cancer using machine learning and image processing. It identifies the urgent need for

effective lung nodule identification in chest CT scans for early lung cancer detection. The

research utilizes 83 CT scans from 70 patients, applying image processing techniques like

noise reduction and feature extraction, and the k-means algorithm for image segmentation.

Key machine learning methods used for classification include Artificial Neural Networks

(ANN), K-Nearest Neighbors (KNN), and Random Forest (RF), with the ANN model

showing superior accuracy in lung cancer prediction. The study’s methodology involves

using a geometric mean filter for image pre-processing, enhancing image quality, and then

segmenting the images using k-means for identifying regions of interest. Performance

comparison of different machine learning predictors is based on accuracy, sensitivity, and

specificity. The research concludes that lung cancer, being a deadly disease, necessitates

the use of Computer-Aided Diagnosis (CAD) systems for its early detection. This study’s

findings underscore the potential of machine learning and image processing technologies in

improving the accuracy of lung cancer detection systems, thereby contributing to advanced

imaging techniques for practical implementation in medical diagnostics.

The study by Mohamed et al. (2023) focused on an innovative approach for automatic

detection and classification of lung cancer in CT scans, employing a hybrid model that

integrates deep learning with the Ebola Optimization Search Algorithm (EOSA). This

research addresses the critical challenge of accurately diagnosing lung cancer, a leading cause

of cancer-related deaths globally. The study’s cornerstone is the EOSA-CNN model, which

optimizes the selection of weights and biases essential for the classification process within a

Convolutional Neural Network (CNN). Tested on a comprehensive lung cancer dataset from

the Iraq-Oncology Teaching Hospital National Center for Cancer Diseases, the EOSA-CNN

model outperformed traditional methods, achieving a high classification accuracy of 93.21%.

The methodology encompassed advanced image pre-processing techniques and the strategic

application of EOSA to refine the CNN architecture, significantly enhancing its diagnostic

accuracy. This breakthrough demonstrates the powerful synergy of machine learning and

metaheuristic algorithms, offering a promising new avenue for early lung cancer detection

and more informed treatment strategies.

The study by Sasikala et al. (2018) looked into a

Convolutional Neural Network (CNN) based approach for the detection and classification of

lung cancer from chest CT images. Recognizing lung cancer as one of the deadliest diseases

in developing countries, the paper emphasizes the challenge of early cancer detection. The

methodology involves extracting lung regions from CT images, segmenting these regions to

identify tumors, and then training the CNN architecture with this data. The key objective is

to categorize lung tumors as malignant or benign. Utilizing a dataset from the Lung Image

Database Consortium and Image Database Resource Initiative, which includes 1000 CT

scans, the study applies median filtering for image preprocessing. The CNN architecture,

incorporating layers such as convolutional, ReLU, pooling, and fully connected layers,

achieves an impressive 96% accuracy in classifying lung cancer, outperforming traditional

neural network systems. This advancement suggests significant potential for deep learning

in enhancing early lung cancer detection and treatment.

The research conducted by Deepa et al. (2023) proposed a novel method for lung cancer

classification using Convolutional

Neural Networks (CNNs). Addressing the growing prevalence of lung cancer, which is

attributed to harmful consumption habits and environmental factors, this study focuses on

improving the accuracy and efficiency of lung cancer detection. The novel strategy uses

the multi-view aspect model and various digital image processing techniques, as well as the

power of CNNs, to categorize various forms of lung cancer. This strategy aims to improve

diagnostic accuracy for lung cancer. The study evaluates the model’s effectiveness

using a variety of criteria, including the Matthews correlation coefficient, Cohen’s Kappa score,

and log loss. These metrics give a comprehensive assessment of the model by measuring

the correlation between projected and actual classifications, determining agreement beyond

chance, and evaluating the model’s probability estimations.

Mulrenan et al. (2022) present

a comprehensive review on the utilization of Artificial Intelligence (AI) in diagnosing

COVID-19 using Computed Tomography scans and chest X-rays. The study investigates

the application of AI in differentiating COVID-19 from other respiratory infections, with a

focus on research articles sourced from databases like ArXiv, MedRxiv, PubMed, and Google

Scholar. In total, twenty studies met the inclusion criteria: eleven centered on chest X-ray and

twelve on CT, involving datasets ranging from 239 to 19,250 images. The reviewed studies

reported a wide range of sensitivities, specificities, and Area Under the Curve (AUC) values

from 0.789 to 1.00. Despite AI showing exceptional diagnostic capabilities in detecting

COVID-19 manifestations in imaging, the review highlights significant challenges in its

broader application, such as the absence of relevant comparators in studies, the need for

larger datasets, and independent testing. This review emphasizes the potential and existing

limitations of AI in enhancing COVID-19 diagnosis via imaging, contributing critical insights

to the field of medical diagnostics during the pandemic.

The research by Darmawan et al. (2022) looked into the classification of Non-Small Cell

Lung Cancer (NSCLC) using Convolutional Neural Networks (CNN). Lung cancer,

categorized into Small Cell Lung Cancer (SCLC) and NSCLC, is predominantly found in

heavy smokers, with NSCLC constituting about eighty to eighty-five percent of all lung

cancer cases, impacting both men and women. This study specifically focuses on classifying

NSCLC into subtypes: squamous cell carcinoma, adenocarcinoma, large cell carcinoma,

and normal lung tissue. For this purpose, the researchers utilized a dataset comprising 1000

CT scan images for each class. The classification process involved three critical stages:

pre-processing, classification, and validation. During pre-processing, all input images were

resized and converted to grayscale to ensure uniformity, which is crucial for accurate analysis.

The study evaluated two different CNN architectures, VGG19 and ResNet50, to determine

which one performs better in classifying NSCLC. Particularly noteworthy was the model’s

performance in the normal class during validation, where it achieved perfect scores in

precision, sensitivity, F1-score, and specificity, all reaching 100%. These findings underscore the

effectiveness of the ResNet50 architecture in accurately classifying NSCLC from CT scan

images, highlighting its potential as a reliable tool for lung cancer diagnosis and aiding in the

determination of appropriate treatment strategies based on the specific cancer subtype.

In a study by Mwaniki et al. (2023), the knowledge, attitudes, and practices

of radiologists in Kenya towards Artificial Intelligence in the field of Diagnostic Radiology

(DR) were assessed. This cross-sectional descriptive study, which utilized a web-based

questionnaire shared with members of the Kenya Association of Radiologists, revealed that a

significant majority of participants had basic AI knowledge, primarily gained from

presentations, but less than half were familiar with machine learning, artificial neural networks, and

deep learning concepts. The most recognized application of AI in radiology was detection,

with other uses like segmentation and workflow management being less known. Notably,

AI application in daily radiology practice was limited, with only 12.6% of respondents

using AI tools. Despite this, a positive attitude towards AI/ML in radiology was observed,

with two-thirds of participants expressing willingness to be involved in the development

and training of ML algorithms. However, the current knowledge of AI applications did not

significantly influence their decision to pursue a career in radiology. The study concluded that

while there is basic AI knowledge among radiologists and residents, deeper understanding

of related concepts is lacking, and the actual use of AI in practice is minimal. The authors

recommend introducing AI/ML courses in radiology residency programs and providing

continuous medical education to bridge this knowledge gap.

2.4 Research Gaps

The existing body of research on the application of deep learning and Convolutional Neural

Networks (CNNs) in the field of lung cancer detection and classification through CT scan

analysis reveals several conceptual, contextual, and methodological gaps which this study

aims to address.

Conceptually, while studies like those by Fallahpoor et al. (2023), Thanoon et al. (2023), and

Wang (2022) acknowledge how deep learning can enhance lung cancer diagnostics, they

predominantly focus on general applications of AI in medical imaging or deep learning for

broader oncological purposes. There is a lack of in-depth exploration of CNNs specifically

tailored for lung cancer classification in CT scans, as this study proposes. Moreover, the

study by Nageswaran et al. (2022), while concentrating on lung cancer classification using

machine learning, does not delve into the nuances of CNNs, which are crucial for capturing

the intricate details in CT images.

Contextually, most of these studies do not address the unique challenges faced in specific

regions like Kenya. For instance, the study by Mwaniki et al. (2023) highlights the limited

AI/ML knowledge among Kenyan radiologists, indicating a gap in localizing AI applications

to the specific needs and constraints of the region. This study intends to fill this gap by

developing a CNN model that considers the local demographic and healthcare context, which

is critical for accurate lung cancer detection in a Kenyan setting.

Methodologically, while studies like those by Sasikala et al. (2018) and Deepa et al. (2023)

successfully apply CNNs for lung cancer detection, there is a lack of emphasis on creating

localized models that address specific regional challenges such as the shortage of skilled

radiologists and advanced medical facilities in Kenya. Moreover, these studies do not fully

explore the potential of CNNs in processing and interpreting CT scan data in a way that is

tailored for the Kenyan population. By focusing on these aspects, this study aims to not only

enhance the accuracy of lung cancer classification through CT scans but also make it more

applicable and accessible for healthcare professionals in Kenya.

Despite these gaps, the field has seen considerable progress in utilizing deep learning for

lung cancer detection. However, translating these technological advancements into practical

tools for the healthcare industry remains a challenge. This study aims to bridge this gap by

developing a cutting-edge CNN model, fine-tuned from the pre-trained ResNet50 architecture,

and integrating this model into a user-friendly web framework to augment the diagnostic

process.

Chapter 3

Methodology

3.1 Introduction

This chapter provides a comprehensive discussion of the methodology employed in this

study. It covers the dataset utilized, its collection process, data pre-processing techniques

applied, and the machine learning models employed, which include a Convolutional Neural

Network (CNN) and ResNet50. Additionally, it outlines the evaluation metrics utilized and

the deployment approach adopted.

3.2 Data Understanding

The data were collected from two public sources: the Lung Image Database Consortium and

Mendeley Data. The Lung Image Database Consortium image collection (LIDC-IDRI)

is a web-accessible international resource for development, training, and evaluation of

computer-assisted diagnostic (CAD) methods for lung cancer detection and diagnosis (Vendt

and Nolan, 2023). Lung Image Database Consortium has a unique two-phase data collection

process that involves expert thoracic radiologists analyzing and annotating each CT scan to

provide spatial truth information about the presence or absence of lung nodules and their

spatial extent when present. The CT scan images collected contained 2500 images labelled

for adenocarcinoma, squamous cell carcinoma, large cell carcinoma and normal cells. Figure

3.1 illustrates the composition of CT scan images.

Figure 3.1: Dataset Images

3.3 Image Pre-processing

The CT scan images collected were pre-processed to enhance the quality and accuracy of

the model. To standardize the input data, the DICOM images were converted to a common

image format. Fedorov et al. (2016) similarly standardized CT scan images

by converting them to common image formats that can be used in machine learning modelling.

The images were then converted to tensors to ensure efficient manipulation and processing

by leveraging PyTorch’s optimized operations. Tensors also

support automatic differentiation, which is important for training neural networks using

gradient-based optimization algorithms (Kim, 2024). Data augmentation methods, including

flipping, rotating, and zooming, were applied to enhance the diversity of the study dataset,

thereby boosting the model’s ability to generalize. The images were uniformly resized to

dimensions of 224 by 224 pixels, a resolution that has been consistently effective in similar

research focused on developing image classification models, particularly when leveraging

pre-trained models. This specific resolution is widely recognized for its compatibility with

various pre-trained architectures and its efficiency in maintaining essential image features for

accurate classification (Talebi and Milanfar, 2021).
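
The resizing and augmentation steps described above can be illustrated with a minimal, library-free sketch. It is not the pipeline used in this study, which relied on PyTorch; the function names, the nearest-neighbour interpolation and the tiny 2 by 2 input are assumptions chosen only to make the operations explicit.

```python
# Illustrative sketch only: a grayscale image is stored as a list of rows.
# All function names below are hypothetical and not taken from the study.

def resize_nearest(image, out_h, out_w):
    """Resize by nearest-neighbour sampling (an assumed interpolation mode)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]

# A toy 2 x 2 "scan" stands in for a real CT slice.
scan = [[0, 64], [128, 255]]

# Resize to the 224 x 224 input size, then build augmented variants.
resized = resize_nearest(scan, 224, 224)
augmented = [resized, flip_horizontal(resized), rotate_90_clockwise(resized)]
```

Each augmented variant keeps the original label, so the training set grows without any new annotation effort.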

3.4 Convolution Neural Network Architecture

A CNN is a class of deep learning model that analyzes grid-patterned data, such as

images. It is modeled after the structure of the animal visual cortex and is intended

to automatically and adaptively learn spatial feature hierarchies, ranging from low-level

patterns to high-level patterns (Yamashita et al., 2020). CNN consists of three types of layers:

convolution, pooling, and fully connected layers. The convolution layer extracts features, the

pooling layer reduces processing by downsampling the image, and the third, a fully connected

layer, maps the extracted features to the final output, such as a classification (Alzubaidi et al.,

2021). Figure 3.2 illustrates the architecture of the convolutional neural network model and how

it is used to classify the lung cancer types.

Figure 3.2: Overview of CNN Architecture

The input layer takes the lung images, which then pass through convolutional layers where

various kernels extract features by performing a convolution operation. These features are

patterns or specific aspects in the images such as edges, textures, or shapes relevant to

identifying cancerous cells (Raza et al., 2023).

3.4.1 Convolution layers

Convolutional layers are computational structures that enable a computer to derive meaningful

insights from input data through the representation of various abstraction levels (Wang et al.,

2020). This layer will automatically extract and learn the most relevant features from the CT

scan images, which include specific shapes, textures, or patterns associated with lung cancer

signatures. A CNN comprises an input layer x, accepting CT scan images, an output layer

y, and numerous hidden layers h, with each layer hosting a variety of neurons. Within the

convolutional layers, every hidden unit modifies the received input from preceding layers

through a process outlined by a nonlinear equation:

\[ h_j = F\left( b_j + \sum_i w_{i,j} x_i \right) \tag{3.1} \]

where $w_{i,j}$ is the weight associated with the connection between the $i$th input $x_i$ and the

$j$th hidden unit, and $b_j$ is the bias of the $j$th hidden unit. The bias provides additional flexibility to the

model by allowing each neuron to shift the non-linear activation function $F(\cdot)$ to the left or

right. The resulting activation $h_j$ is passed on to subsequent layers and ultimately contributes to

the classification output that can be used to aid diagnosis.
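
As a minimal sketch of Equation 3.1, the snippet below computes the output of a single hidden unit in plain Python. The input values, weights and biases are made up for illustration, and ReLU is assumed as the activation F; the function name `hidden_unit` is hypothetical.

```python
# Illustrative sketch of one hidden unit: h_j = F(b_j + sum_i w_ij * x_i).
# The numbers below are arbitrary and are not taken from the study.

def relu(z):
    """Assumed activation F: returns z for positive inputs, zero otherwise."""
    return max(0.0, z)

def hidden_unit(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    weighted_sum = bias + sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum)

x = [0.5, -1.0, 2.0]   # inputs x_i
w = [0.4, 0.3, -0.2]   # weights w_ij for a single unit j

# The same inputs and weights with two different biases: the bias shifts the
# point at which the unit starts to respond.
h_small_bias = hidden_unit(x, w, bias=0.1, activation=relu)
h_large_bias = hidden_unit(x, w, bias=1.0, activation=relu)
```

With the smaller bias the weighted sum is negative and the unit stays silent, while the larger bias pushes it above zero and the unit fires.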

3.4.2 Filters/Kernels

Filters are small matrices used to detect specific features in the input images, such as edges,

corners, textures, or more complex patterns in higher layers. Each filter is designed to

respond to a specific type of feature at a given spatial scale of the image. Multiple filters can

be applied to the input, each generating a separate feature map. These feature maps form the

output of the layer, with the depth equal to the number of filters used. This can be expressed

using the following mathematical expression:

\[ y = X * f \tag{3.2} \]

where the symbol ∗ denotes the convolution operator, y represents the convolved results, also

known as the feature map, X denotes the input image, and f stands for the filter or kernel,

which is a small matrix of weights that the network learns during the training process.

The convolutional filter is important in extracting critical diagnostic features from the scans.

It methodically applies element-wise multiplications across different segments of the CT

image, summing these interactions to produce distinct values for each output pixel. This

operation is replicated across the entire scan, ensuring that regions with significant diagnostic

features such as the subtle edges, texture variations, specific patterns, and characteristic

shapes associated with lung pathologies are accentuated in the resulting feature maps.
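
The sliding-window operation described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the study's implementation: it uses no padding and a stride of one, the kernel is a hand-written vertical-edge detector rather than a learned filter, and, as is conventional in deep learning libraries, the kernel is applied without flipping.

```python
# Illustrative sketch: element-wise multiply-and-sum of a kernel over an image.
# `convolve2d` is a hypothetical helper; no padding, stride 1.

def convolve2d(image, kernel):
    """Slide the kernel over the image and return the resulting feature map."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Multiply each kernel weight by the pixel beneath it and sum.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_h)
                for j in range(k_w)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map

# A toy patch with a dark region on the left and a bright region on the right.
patch = [
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
]

# Hand-written kernel that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

edges = convolve2d(patch, vertical_edge)
```

The feature map is zero over the uniform dark region and large where the window overlaps the intensity change, which is how an edge is accentuated.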

3.4.3 Stride and Padding

Stride and padding are crucial to the network’s ability to extract salient

features, preserve spatial information, and balance computational efficiency.

Stride defines the size of the steps taken by the convolutional filters as they move across the

image. A greater stride reduces the spatial dimensions of the output feature maps, which can

help minimize computing load and focus on more pronounced aspects in the images without

being bogged down in unnecessary information.

Padding is used to add layers of pixels around the edge of the input images. This ensures that

when the filter is applied, it can properly include the information at the edges and corners,

which might otherwise be ignored or underrepresented. Effective padding ensures that the

size of the output feature maps can be maintained, and important diagnostic features that

may be located at the periphery of the scans are not missed.

For padding $p$, filter size $f \times f$, input image size $n \times n$ and stride $s$, the output image

dimension is given by:

\[ \text{Output dimension} = \left( \left\lfloor \frac{n + 2p - f}{s} \right\rfloor + 1 \right) \times \left( \left\lfloor \frac{n + 2p - f}{s} \right\rfloor + 1 \right) \tag{3.3} \]
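
A short worked example may help make the effect of stride and padding concrete. The sketch below assumes the standard convention in which the output side length is the floor of (n + 2p - f) / s, plus one; the helper name `conv_output_size` and the example settings are illustrative and are not a description of the layers used in this study.

```python
# Illustrative sketch: output side length of a convolution on an n x n input
# with an f x f filter, padding p and stride s, using the floor convention.

def conv_output_size(n, f, p=0, s=1):
    """Return floor((n + 2p - f) / s) + 1 for one spatial dimension."""
    if s < 1:
        raise ValueError("stride must be a positive integer")
    if n + 2 * p < f:
        raise ValueError("filter is larger than the padded input")
    return (n + 2 * p - f) // s + 1

# A 3 x 3 filter with one pixel of padding and stride 1 preserves the size.
size_preserved = conv_output_size(n=224, f=3, p=1, s=1)

# A 7 x 7 filter with padding 3 and stride 2 halves the side length.
size_halved = conv_output_size(n=224, f=7, p=3, s=2)

# Without padding, the same 3 x 3 filter shrinks the map by two pixels.
size_no_padding = conv_output_size(n=224, f=3, p=0, s=1)
```

For a 224 by 224 input these three settings give side lengths of 224, 112 and 222 respectively.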

3.4.4 Activation Function

The activation function introduces non-linearity into the CNN architecture, allowing the

model to capture complex, non-linear relationships between the CT scan images and the

output of the model. It is applied to the weighted sum of each neuron’s

inputs and produces an output which can be passed to the next layer. The

ReLU activation function was used in this research study due to its proven track record in

enhancing accuracy and also its ability to learn more complex representations of the data.

The ReLU activation function is easy to compute and has a mathematical expression of the

form:

\[ f(z) = \max(0, z) \tag{3.4} \]

where z represents the input to the ReLU layer and f(z) represents the output of the ReLU

activation function. For any given input value z, if z is greater than zero, the ReLU function

outputs z itself; otherwise, it outputs zero.
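The behaviour of ReLU can be sketched in a few lines. This is a plain-Python illustration of equation 3.4 applied element by element; the input values are arbitrary examples.

```python
def relu(z: float) -> float:
    """Rectified Linear Unit: returns z for positive inputs and zero otherwise."""
    return max(0.0, z)


# Arbitrary example inputs covering negative, zero and positive values.
inputs = [-2.0, -0.5, 0.0, 1.5]
outputs = [relu(z) for z in inputs]
print(outputs)
```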

Figure 3.3 illustrates the Rectified Linear Unit (ReLU) function:

Figure 3.3: ReLU Activation Function

The ReLU (Rectified Linear Unit) activation function was chosen due to its advantages over

traditional sigmoid functions, particularly in terms of feature activation and computational

performance. Unlike sigmoid functions that can cause vanishing gradient problems due to

their asymptotic nature, ReLU maintains a linear activation for positive inputs, which helps

in propagating gradients more effectively during backpropagation (Yu et al., 2020). This

characteristic of ReLU ensures that the network learns faster and more efficiently, avoiding

the saturation and slow learning problems often associated with sigmoid functions. Moreover,

ReLU’s simplicity—returning zero for negative inputs and performing no changes to positive

inputs—reduces computational complexity, making it an ideal choice for the ResNet50

Convolutional Neural network model used in classifying lung cancer types from CT scans in

this study.

3.4.5 Pooling layers

Pooling layers reduce the spatial size of the convolved features, simplifying the network

and reducing computation time, while retaining essential information by down-sampling

the input representation (Thanoon et al., 2023). A pooling layer

slides a window across the input feature map and applies an operation such as max

pooling or average pooling (Akhtar and Ragavendran, 2020). After multiple convolution

and pooling layers, the feature maps are flattened into a single vector, which is essential for

transitioning from a two-dimensional feature map to a one-dimensional feature vector that

can be fed into a fully connected layer.

The pooling operation can be mathematically represented as follows. Given an input feature

map of size (M, N, K), where M is the height, N is the width, and K is the number of

channels, the output feature map is obtained by applying the pooling function over local regions:

\[
y_{l,i,j,k} = \operatorname{pool}\left(a_{l,m,n,k}\right), \quad \forall (m,n) \in R_{i,j} \tag{3.5}
\]

where y_{l,i,j,k} represents the output of the pooling operation for feature map l at position (i, j)

in channel k; a_{l,m,n,k} is the input feature map value at position (m, n) in channel k for feature

map l; R_{i,j} is the local area centered at (i, j) over which the pooling operation is executed; and

pool(·) denotes the pooling function, which could be max pooling, min pooling, sum pooling,

or average pooling, depending on the specific operation chosen.
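The pooling operation can be illustrated on a single channel with a small sketch. The function below slides a square window over a 2D feature map and applies a chosen pooling function to each region; the window size, stride and example values are assumptions made for illustration only.

```python
def pool2d(feature_map, window=2, stride=2, pool=max):
    """Apply `pool` over each window x window region of a 2D feature map."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(0, rows - window + 1, stride):
        pooled_row = []
        for j in range(0, cols - window + 1, stride):
            # Collect the values of the local region R(i, j).
            region = [
                feature_map[m][n]
                for m in range(i, i + window)
                for n in range(j, j + window)
            ]
            pooled_row.append(pool(region))
        pooled.append(pooled_row)
    return pooled


# Arbitrary 4 x 4 example feature map.
example = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 1],
    [3, 4, 5, 8],
]
print(pool2d(example))            # max pooling
print(pool2d(example, pool=sum))  # sum pooling
```

Passing a different function as `pool` (for example `min` or `sum`) switches the operation without changing the sliding-window logic.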

In this research study, the inclusion of pooling layers within the Convolutional Neural Network

architecture plays an important role in enhancing model generalization and mitigating

overfitting. By effectively down-sampling the feature maps, pooling layers reduce the dimensionality

of the data, which in turn lessens the computational load and the complexity of the

model (Saeedan et al., 2018). This reduction is crucial for a dataset derived from lung CT

scans, where the high resolution of images can lead to an overwhelming number of features.

Pooling layers help the model to focus on the most salient features, promoting generalization

by preventing the model from learning noise and irrelevant details.

3.4.6 Fully connected layer

The fully connected layers, often called dense layers, function as classifiers that use the

extracted and compressed features to determine the image’s class by assigning probabilities

to the candidate categories (Anjum et al., 2023). The final output layer uses these

probabilities to classify the image into categories, which in the context of lung cancer, could

be various cancer types or normal cells. Each layer is crucial for accurately classifying lung

cancer from CT scan images, and their combined use ensures that the model learns complex

patterns and nuances in the data to make reliable predictions. In the fully connected layer,

the outputs from the convolution and pooling layers are transformed into a one-dimensional

array for further processing.

The equation for a fully connected layer can be expressed as:

y = f (Wx+b) (3.6)

Where:

• x represents the input vector to the fully connected layer.

• W is the weight matrix associated with the connections between the input and the

neurons in the layer.

• b is the bias vector.

• f denotes the activation function, which introduces non-linearity into the model,

allowing it to learn complex patterns.

• y represents the vector of values produced by the layer, which can either be passed to

subsequent layers in the network or, in the case of the final layer, used to produce the

final classification output.
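Equation 3.6 can be sketched directly in plain Python. The weights, bias and input below are arbitrary illustrative values, and ReLU is used as the activation f only as an example.

```python
def dense(x, W, b, f):
    """Fully connected layer: y = f(Wx + b), computed row by row."""
    return [
        f(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
        for row, b_i in zip(W, b)
    ]


def relu(z):
    return max(0.0, z)


# Arbitrary example: two inputs, two neurons.
x = [1.0, 2.0]
W = [[0.5, -1.0],
     [1.0, 1.0]]
b = [0.0, -1.0]
y = dense(x, W, b, relu)
print(y)
```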

3.5 Proposed Convolutional Neural Network: ResNet50

Convolutional neural networks have made substantial progress since the outstanding success

of earlier work such as AlexNet in 2012 (Singh and Sabrol, 2021). A modified

version of the ResNet50 architecture was the basis for the CNN architecture

selected for this study.

ResNet50 is a deep convolutional neural network architecture that consists of 50 layers,

including residual blocks. These residual blocks enable the network to learn residual functions,

making it easier to train deeper networks effectively (Koonce and Koonce, 2021). This

research utilized the knowledge gained from the pre-trained ResNet50 model to classify lung

CT scans, a task that would have been considerably more time-intensive if a Convolutional

Neural Network had to be developed from scratch, particularly given the limited number of

available CT images.

Residual learning in ResNet50 is mathematically expressed as:

H(x) = F(x)+ x (3.7)

where x is the input to a layer, F(x) represents the residual mapping to be learned, and H(x)

is the desired mapping. This formulation allows for the learning of residuals instead of

directly fitting the desired underlying mapping (Shafiq and Gu, 2022).
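The residual formulation in equation 3.7 can be illustrated with a minimal sketch. Here F is any function that maps a vector to a vector of the same length; the two example mappings are hypothetical stand-ins for the learned convolutional layers inside a real residual block.

```python
def residual_block(x, F):
    """Return H(x) = F(x) + x, adding the skip connection element-wise."""
    fx = F(x)
    if len(fx) != len(x):
        raise ValueError("F(x) must have the same length as x")
    return [f_i + x_i for f_i, x_i in zip(fx, x)]


# Hypothetical residual mappings, used only for illustration.
def zero_mapping(v):
    return [0.0 for _ in v]


def half_mapping(v):
    return [0.5 * v_i for v_i in v]


x = [1.0, 2.0]
print(residual_block(x, zero_mapping))  # F(x) = 0, so the block acts as the identity
print(residual_block(x, half_mapping))
```

The first call shows why residual learning eases training: when the best mapping is close to the identity, the block only needs to drive F(x) toward zero.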

ResNet50 was employed as a pre-trained feature extractor, meaning the ResNet50 architecture,

already trained on a comprehensive dataset, was used to derive features from the

input images without changing the weights of its convolutional layers (Wen et al., 2020). This

approach capitalizes on ResNet-50’s robust ability to identify and extract complex features

and patterns, applying the pre-acquired knowledge to a specialized task of classification, thus

optimizing for both efficiency and computational cost.

In this model, the parameters adopted directly from the ResNet50 architecture were

preserved. In the process of fine-tuning the ResNet50 model, we permitted the model to adjust

and refine its parameters, rather than keeping them fixed, starting from the baseline values

provided by the pre-trained model. This strategy allowed the model to better tailor its learning

to the specific characteristics of our dataset, thereby improving its proficiency in identifying

lung cancer from CT scan images. Table 3.1 shows the parameters of ResNet50 that were

used to achieve the objectives of this research study.

Table 3.1: Parameters of the ResNet50 Model for Lung Cancer Classification

Parameter           Value
Number of Epochs    200
Learning Rate       0.001%
Weight Decay        0.0001
Model Name          ResNet50
Use CUDA            True
Criterion           nn.CrossEntropyLoss
Optimizer           Adam Optimizer
Momentum            90%

The advantages of using ResNet50 for lung cancer classification in CT scan analysis are

significant. Firstly, ResNet50’s deep architecture enables it to learn intricate features from CT

scan images, enhancing the model’s ability to differentiate between different types of lung

cancer accurately. Secondly, the residual learning approach in ResNet50 aids in mitigating

the degradation problem associated with training deep networks, leading to improved model

performance. Lastly, ResNet50’s proven success in various image classification tasks makes

it a robust choice for enhancing lung cancer classification accuracy through CT scan analysis

(McNeely-White et al., 2020).

3.6 Model Training

The primary objective of this research study is to develop a Convolutional Neural Network

model that effectively classifies the types of lung cancer from CT scan

images by fine-tuning the pre-trained ResNet50 CNN model. The CNN model was trained

using PyTorch on a GPU machine. PyTorch, an open-source machine learning library

built upon the Torch framework, is widely used for its versatility and efficiency. It finds

extensive applications in computer vision and natural language processing due to its ability

to handle complex neural network designs and process data efficiently on GPU

hardware (Imambi et al., 2021). PyTorch’s robust ecosystem has been instrumental in similar

research endeavors, where its application in developing image classification models has

consistently led to superior outcomes (Ayyadevara and Reddy, 2020).

PyTorch has advanced support for tensors, which are multidimensional arrays that served

as the foundational data structure for handling the CT scan images (Imambi et al., 2021).

These images were converted into tensors to leverage PyTorch’s efficient computation and

dynamic computational graph capabilities, facilitating more intuitive and flexible model

development and debugging processes. Compared to other frameworks, PyTorch’s dynamic

nature allows for adjustments and optimizations on-the-fly, a feature particularly beneficial

for the iterative experimentation required in this study (Stevens et al., 2020). This, combined

with its user-friendly interface and extensive library support, made PyTorch an optimal

choice for implementing the advanced neural network models needed for accurate lung

cancer classification.

Transfer learning is crucial for this research: the fine-tuned model extracts intricate patterns and

predicts the class of each CT scan image, namely adenocarcinoma, squamous cell carcinoma, large

cell carcinoma or normal cells. The dataset for this study was split in a 70-15-15

ratio, into training, validation, and testing sets to allow for effective model training, validation

of its performance, and evaluation of its generalizability to new, unseen data.
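A 70-15-15 split of this kind can be sketched as follows. This is a minimal illustration using only the standard library; the function name, the fixed random seed and the use of integer item identifiers are assumptions, and the study's actual splitting code is not shown in the text.

```python
import random


def split_dataset(items, train_pct=70, val_pct=15, seed=0):
    """Shuffle items and split them into training, validation and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # the remainder forms the test set
    return train, val, test


# Hypothetical dataset of 100 image identifiers.
train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

In practice a stratified split, which keeps the class proportions equal across the three subsets, is often preferable for imbalanced medical datasets; the sketch above does not do this.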

3.6.1 Loss Function

The loss is the value of the loss function computed over the training data after each epoch. The

cross-entropy loss function was used as the training criterion. The average loss over the

dataset is expressed as follows:

\[
\text{Average Loss} = \frac{1}{N} \sum_{i=1}^{N} l(\theta; y_i, o_i) \tag{3.8}
\]

where:

• N is the total number of samples in the dataset.

• i indexes the individual samples in the dataset.

• l(θ; y_i, o_i) represents the loss function for the ith sample, calculated with respect to

the model parameters θ, the actual label y_i, and the model’s predicted output o_i.

Cross-Entropy Loss, also known as Log Loss, is a measure that quantifies the

difference between two probability distributions: the predicted probability distribution output

by the model and the actual distribution represented by the labels in the training data (Mao

et al., 2023). An evaluation by Hui and Belkin (2020) shows that cross-entropy loss is

effective in penalizing incorrect classifications. Cross-entropy is expressed in the form:

\[
H(y, p) = -\sum_{c=1}^{M} y_{o,c} \log(p_{o,c}) \tag{3.9}
\]

where H(y, p) represents the cross-entropy loss between the true labels y and the predicted

probabilities p, and M is the number of classes. The symbol y_{o,c} is a binary indicator (0

or 1) of whether class label c is the correct classification for observation o, and p_{o,c} is the predicted

probability that observation o is of class c.
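Equations 3.8 and 3.9 can be checked numerically with a short sketch. The one-hot labels and predicted probabilities below are made-up values for a four-class problem, and the small `eps` guard against log(0) is an implementation assumption rather than part of the formula.

```python
import math


def cross_entropy(y_true, p_pred, eps=1e-12):
    """Cross-entropy between a one-hot label vector and predicted probabilities."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_true, p_pred))


def average_loss(labels, predictions):
    """Mean cross-entropy over all samples (equation 3.8)."""
    losses = [cross_entropy(y, p) for y, p in zip(labels, predictions)]
    return sum(losses) / len(losses)


# Made-up example: two observations, four classes.
labels = [[0, 1, 0, 0], [1, 0, 0, 0]]
predictions = [[0.1, 0.7, 0.1, 0.1], [0.4, 0.3, 0.2, 0.1]]
print(cross_entropy(labels[0], predictions[0]))
print(average_loss(labels, predictions))
```

Only the probability assigned to the correct class contributes to each sample's loss, so the confident correct prediction (0.7) is penalized less than the uncertain one (0.4).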

3.6.2 Loss Optimization

The Adam optimizer, short for Adaptive Moment Estimation and known for its adaptive learning rate

capabilities, was used. It is an optimization algorithm used in machine learning to

iteratively update network weights based on training data. It is recognized for effectively

handling sparse gradients and offering robustness in terms of hyperparameter choices. Adam

integrates the benefits of two other optimization techniques: the Adaptive

Gradient Algorithm (AdaGrad), which adapts the learning rate for each parameter, and

Root Mean Square Propagation (RMSProp), which normalizes the gradient using a moving

average of squared gradients (Bera and Shrivastava, 2020). It additionally keeps a moving

average of the gradients themselves, akin to momentum.

Early stopping was adopted to reduce the risk of overfitting. This technique monitors the

validation loss and halts training when there’s no significant improvement, ensuring the

model’s ability to generalize. Similar research studies have found the Adam optimizer to be

effective in image classification tasks. A comparative study by Yaqub

et al. (2020) concludes that the Adam optimizer performs better than other first-order

optimizers. The update rule of the Adam optimizer can

be represented as follows:

\[
\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{\hat{v}_t} + \epsilon} \cdot \hat{m}_t \tag{3.10}
\]

where θ_{t+1} is the updated parameter at time t + 1, θ_t is the parameter at time t, η is the

learning rate, m̂_t is the exponentially decayed average of past gradients, v̂_t is the exponentially

decayed average of past squared gradients, and ε is a small constant (typically 10^{-8}) to

prevent division by zero.
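A single Adam update for one scalar parameter can be sketched as below. The decay rates beta1 = 0.9 and beta2 = 0.999 are the commonly used defaults and are an assumption here, as are the function name and the example gradient; the bias-correction step that produces the hatted quantities is included explicitly.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad        # decayed average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # decayed average of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    theta = theta - lr / (math.sqrt(v_hat) + eps) * m_hat
    return theta, m, v


# Example: first update of a parameter at 1.0 with gradient 2.0.
theta, m, v = adam_step(theta=1.0, grad=2.0, m=0.0, v=0.0, t=1)
print(theta)
```

On the first step the bias correction makes the update size approximately equal to the learning rate regardless of the gradient's magnitude, which is one reason Adam is relatively insensitive to gradient scale.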

3.7 Evaluation of Model Performance

The performance of the model was evaluated using evaluation metrics such as precision,

recall, accuracy and F1 score. These metrics provide thorough insight into the accuracy of

the model in classifying lung cancer types. Accuracy is a measure that quantifies the overall

correctness of the model’s predictions. It is calculated as the ratio of correctly predicted

instances (both true positives and true negatives) to the total number of instances in the

dataset (Vujović et al., 2021). Mathematically, accuracy is defined as:

\[
\text{Accuracy (\%)} = \left( \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}} \right) \times 100 \tag{3.11}
\]

Precision measures the proportion of true positive predictions (correctly identified instances

of a particular class) to the total number of instances predicted as that class, including both

true positives and false positives (Zhou et al., 2021). Mathematically, precision is defined as:

\[
\text{Precision} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Positives (FP)}} \tag{3.12}
\]

Recall, also known as sensitivity, measures the proportion of actual positives (true positives)

that are correctly identified by the model (Erickson and Kitamura, 2021). Mathematically,

recall is defined as:

\[
\text{Recall} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Negatives (FN)}} \tag{3.13}
\]

The F1 score is the harmonic mean of precision and recall, providing a single metric that

balances the concerns of false positives and false negatives (Erickson and Kitamura,

2021).

\[
F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3.14}
\]
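The four metrics can be computed from raw counts as in the sketch below. The counts are hypothetical and do not come from the study's results; returning zero when a denominator is zero is a convention assumed here.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Accuracy as a percentage (equation 3.11)."""
    return 100.0 * correct / total


def precision_recall_f1(tp: int, fp: int, fn: int):
    """Precision, recall and F1 for one class (equations 3.12 to 3.14)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for a single class.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(precision, recall, f1)
print(accuracy_percent(correct=90, total=100))
```

For a four-class problem such as this one, these per-class values are typically averaged across classes (for example with a macro average) to obtain a single figure per metric.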

Precision, recall, accuracy and F1 score are crucial performance metrics for this research

study of lung cancer classification due to their ability to provide a clear understanding of the

model’s predictive capabilities. Similar studies, such as those by Sharma et al. (2022) on

dermatologist-level classification of skin cancer using deep neural networks, and Zubair

(2020) on CheXNet, which can diagnose pneumonia from chest X-rays at a level exceeding

practicing radiologists, have also underscored the importance of these metrics. They highlight

how precision, recall, and accuracy can collectively ensure that AI-driven diagnostic models

are both reliable and effective in real-world medical settings.

3.8 Model Deployment

After successfully developing and testing the Convolutional Neural Network model for lung

cancer classification, the next important step is model deployment. In this study, Vue.js was

selected to develop the frontend due to its ability to facilitate the creation of dynamic user

experiences with an efficient and predictable update mechanism. Vue.js is a progressive

JavaScript framework used to build user interfaces and single-page applications. It is known

for its simplicity and flexibility, which makes it a popular choice among developers for

creating dynamic and intuitive user experiences (Bielak et al., 2022). The main interface

includes a prominent image upload section where practitioners can easily drag and drop CT

scan images or browse files from their system.

The Flask framework was used as the backbone for integrating the ResNet50-based machine

learning model into the system’s architecture. Flask is a lightweight and flexible

Python web framework designed for quick development of web applications by providing the

necessary tools and libraries. Its simplicity and scalability make it an excellent

choice for building RESTful web services, such as Application Programming Interfaces for

machine learning models (Ghimire, 2020). Similar studies, such as those by Yaganteeswarudu

(2020) on multi-disease prediction and Kumar (2023) on improving cardiovascular

conditions, report that the Flask framework performs well in deploying machine

learning models into production.

The Flask application exposes a RESTful API endpoint that facilitates the communication

between the frontend and the machine learning model. When medical practitioners upload

CT scan images via the Vue.js frontend, these images are sent to the Flask backend as HTTP

requests. The Flask application then processes these requests, extracting the image data and

preparing it for analysis. It interfaces with the ResNet 50 model, feeding the images as input

and retrieving the classification result, thus ensuring a smooth, efficient transmission of data,

enabling real-time analysis and feedback.

Chapter 4 of this research study provides an in-depth exploration of the system design and

architecture, detailing the frameworks and methodologies used in developing and deploying

the lung cancer ResNet50 classification model, while Chapter 5 complements this by

presenting practical screenshots from the system implementation and testing phases.

Chapter 4

System Design and Architecture

4.1 Introduction

This section offers a comprehensive overview of the system design and architecture utilized

in developing the image classification model, anchored by the ResNet50 neural network. The

architecture is structured into two principal segments: Frontend and Backend development.

Additionally, this section outlines the model API and system requirements, essential for

integration and communication between the model’s components.

4.2 System Requirements

4.2.1 Model API Requirements

Below are the dependencies for the Model API’s functionality, outlining the specific Python

packages and their versions required for integration and operation within the system’s

architecture:

a. Python==3.9

b. Flask==3.0.2

c. joblib==1.3.2

d. matplotlib==3.8.3

e. numpy==1.26.4

f. pandas==2.2.1

g. PyYAML==6.0.1

h. requests==2.31.0

i. safetensors==0.4.2

j. scikit-learn==1.4.1.post1

k. scipy==1.12.0

l. seaborn==0.13.2

m. timm==0.9.16

n. torch==2.2.1

o. torchvision==0.17.1

p. Werkzeug==3.0.1

q. huggingface-hub==0.21.4

4.3 Overview of System Architecture

Figure 4.1 shows the system architecture and workflow for lung cancer image classification.

The system is designed for efficiency and user-friendliness, starting

with a Vue.js-based user interface where medical practitioners upload CT scans for analysis.

Figure 4.1: Overview of System Architecture

Initially, a medical practitioner uploads a CT scan image via the web application. This image

is then encoded into base64 format directly within the web application, converting the binary

data into a text string. The base64 encoding was employed to facilitate the safe and efficient

transmission of CT scan images from the client-side web application to the server-side Flask

API. By encoding the images into base64, we could send the image data as part of a JSON

object, which is a widely supported format for data interchange. This ensured that the image

data remained intact during transit and was compatible with web technologies that typically

handle text better than binary.

After encoding, the web application sends the base64-encoded image to the backend server

using an HTTP request. This request is received by the Flask API, chosen for its simplicity

and efficiency in handling web requests. The Flask API decodes the base64 string back into

a binary image that can be processed by the machine learning model.
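The encode-transmit-decode round trip described above can be sketched with the Python standard library. The JSON field name `image` and both function names are hypothetical; the actual payload format used by the system is not given in the text, and in the real system the encoding step runs in the browser rather than in Python.

```python
import base64
import json


def encode_image_payload(image_bytes: bytes) -> str:
    """Client side: wrap raw image bytes in a JSON string as base64 text."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded})  # "image" is a hypothetical field name


def decode_image_payload(payload: str) -> bytes:
    """Server side: recover the original image bytes from the JSON string."""
    data = json.loads(payload)
    return base64.b64decode(data["image"])


# Stand-in for the contents of an uploaded image file.
raw = bytes(range(256))
payload = encode_image_payload(raw)
restored = decode_image_payload(payload)
print(restored == raw)
```

Base64 represents every 3 bytes as 4 text characters, so the payload is roughly one third larger than the original file; the benefit is that the image travels safely inside a text-based JSON body.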

The server then inputs this binary image into the pre-trained ResNet50 model, which has

already been trained on a diverse dataset of labeled lung CT images. The ResNet50 model

uses its learned weights to analyze the image and predict the presence and type of lung

cancer. This deep learning model, known for its depth and accuracy, is particularly adept at

recognizing complex patterns in medical imagery.

Once the ResNet50 model has made its prediction, the results, which include the classification

of the cancer type and the confidence level of the prediction, are sent back from the server to

the web application via the Flask API. The API formats these results into a response that the

web application can interpret.

The web application then displays these results to the medical practitioner. The displayed

information typically includes the type of lung cancer identified by the model and a confidence

score that indicates how certain the model is of its prediction. This allows the practitioner to

quickly and accurately understand the model’s diagnostic output.

The detailed description of the workflow of the frontend and backend architecture is discussed

in sub-sections 4.4.1 and 4.5.1.

4.4 Frontend Development

4.4.1 User Interface Design

The web application was developed using Vue.js. Vue.js is a progressive JavaScript framework

used to build user interfaces and single-page applications. It is known for its simplicity

and flexibility, which makes it a popular choice among developers for creating dynamic and

intuitive user experiences (Bielak et al., 2022). In this study, Vue.js was selected to develop

the frontend due to its ability to facilitate the creation of dynamic user experiences with an

efficient and predictable update mechanism.

The user interface (UI) designed for medical practitioners prioritizes clarity and ease of use.

It features a clean layout with intuitive navigation, designed to minimize the cognitive load

on users. The main interface includes a prominent image upload section where practitioners

can easily drag and drop CT scan images or browse files from their system. Uploads are

restricted to image formats for consistency and

processing reliability. Once uploaded, images are encoded to base64, a step that facilitates

secure and efficient image data handling.

The UI also incorporates a pop-up that displays the results of the image analysis, including

the classification outcomes. This is designed with accessibility in mind, ensuring that results

are comprehensible even to practitioners who may not be familiar with machine learning

terminology. Interactive elements, like tooltips and modals, provide on-demand guidance

throughout the platform.

4.4.2 Image Upload Functionality

The image upload functionality is an important aspect of the user interface design, enabling

medical practitioners to submit CT scan images effortlessly for classification. Uploaded

images are passed to the backend, where they undergo initial pre-processing.

Only image formats can be uploaded.

To ensure consistency and reliability in the analysis, the system is configured to accept only

specific image formats, primarily JPEG and PNG. These formats are chosen due to their

widespread use and compatibility with medical imaging devices and software. JPEG is

preferred for its balance between image quality and file size, making it suitable for efficiently

transmitting high-resolution CT scan images without significant loss of detail. PNG is

included for its lossless compression feature, ensuring that no image data is lost during the

compression process, which is crucial for maintaining the integrity of medical images (Yang

et al., 2021).

4.4.3 Error Handling and User Feedback

Effective error handling and user feedback mechanisms are essential components of the

frontend development process, contributing to a positive user experience and mitigating

frustration. In the context of this study, Vue.js enables the implementation of robust error

handling mechanisms within the user interface design. The user interface employs validation

techniques to ensure that only valid image files are submitted for classification, preventing

errors and inconsistencies in the analysis process. Only image formats are accepted, ensuring

data consistency and processing accuracy.
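A file-format check of this kind can be sketched as follows. The study performs its validation in the Vue.js frontend, so this Python version is only an illustration of the same rule, for example as a second check on the server; the function name and the exact extension list are assumptions based on the JPEG and PNG formats named in Section 4.4.2.

```python
from pathlib import Path

# Assumed allow-list, based on the JPEG and PNG formats named in the text.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_allowed_image(filename: str) -> bool:
    """Return True when the file name ends in an accepted image extension."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


# Hypothetical file names.
for name in ("scan_001.PNG", "scan_002.jpeg", "report.pdf", "scan"):
    print(name, is_allowed_image(name))
```

An extension check only inspects the file name; a stricter implementation would also verify the file contents before passing them to the model.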

4.5 Backend Development

4.5.1 API Integration for Machine Learning Model

Flask serves as the backbone for integrating the ResNet50-based machine learning

model into the system’s architecture. The Flask application exposes a RESTful API endpoint

that facilitates the communication between the frontend and the machine learning

model (Lathkar, 2021). When medical practitioners upload CT scan images via the Vue.js

frontend, these images are sent to the Flask backend as HTTP requests, where they are then

decoded from base64 to their original format for further processing.

The Flask application then processes these requests, extracting the image data and preparing

it for analysis. It interfaces with the ResNet 50 model, feeding the images as input and

retrieving the classification result, thus ensuring a smooth, efficient transmission of data,

enabling real-time analysis and feedback.

4.5.2 Handling Image Processing Requests

Upon receiving image data from the frontend, the Flask backend initiates a series of image

processing steps crucial for preparing the data for the machine learning model. This includes

resizing the images to the required dimensions for the ResNet 50 model, normalizing the

pixel values, and converting the images to the appropriate tensor format. Once the images

are processed and analysed, Flask collates the classification results and sends them back to

the frontend, where they are presented to the users. This seamless workflow, facilitated by

the Flask framework, provides rapid, accurate classification of lung cancer from CT scans,

empowering medical practitioners with actionable insights derived from advanced machine

learning analysis.
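The pixel-normalization step can be illustrated in plain Python. The per-channel mean and standard deviation below are the widely used ImageNet statistics, which is an assumption: the text does not state which normalization constants the system applies. Resizing and tensor conversion are omitted from this sketch.

```python
# Commonly used ImageNet channel statistics (assumed, not stated in the text).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple(
        (channel / 255.0 - mean) / std
        for channel, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


def normalize_image(image):
    """Apply normalize_pixel to every pixel of a nested-list RGB image."""
    return [[normalize_pixel(pixel) for pixel in row] for row in image]


# Tiny 1 x 2 example image: one black and one white pixel.
example = [[(0, 0, 0), (255, 255, 255)]]
print(normalize_image(example))
```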

Chapter 5

System Implementation and Testing

5.1 Introduction

This chapter turns the theoretical design into practice,

focusing on the implementation and testing of the system developed for this research

study in lung cancer classification. The chapter includes screenshots to illustrate the user

interface (UI) design, showcasing the layout and features tailored for medical practitioners to

upload and analyze lung CT scan images. Additionally, it presents the system’s outputs,

detailing how the results of image classification are displayed to the user.

5.2 User Interface Design

The user interface of the lung cancer classification and prediction model is designed specifically

for medical practitioners, hence its user-friendly layout. Figure 5.1 shows the user interface of

this research study.

Figure 5.1: User Interface Design

The user interface (UI) of the Lung Cancer classification Model is designed to be efficient,

ensuring that medical practitioners can navigate it with ease. The Abstract section provides

a summary of the research and model’s functionality.

The interface features a Text box for practitioners to input their name, which personalizes the

interaction and potentially tailors the analysis to individual user needs.

The Image Upload section has a Choose File button that facilitates the easy selection and

upload of CT scan images, ensuring that the system is immediately ready for analysis with

minimal navigation. Once the image is uploaded, the user clicks the Predict button, which

initiates the classification process. The classification process then

analyses the uploaded CT scan image and displays the results in the Prediction Results

section.

5.3 Image Upload Process

The image upload process for the lung cancer classification model is designed to be simple

and user-friendly as shown in Figure 5.2. Medical practitioners can enter their name for a

personalized experience, select a CT scan image file, and then click ’Predict’ to initiate the

analysis. The system quickly processes the image and displays the results, including the type

of lung cancer detected and the confidence level of the diagnosis, alongside the actual CT

scan image for visual verification.

Figure 5.2: Image Upload Process

5.4 Image Upload Validation

To ensure integrity and consistency, the model allows only CT scan images to be uploaded.

The code section below in Figure 5.3 shows that only images are accepted into the model.

Figure 5.3: Validating Image File Formats for Upload

By allowing only the image formats typically associated with medical imaging, the system

ensures that the input data aligns with the model’s pre-trained parameters and the expected

diagnostic outcomes.

5.5 Classification of Results

Once a medical practitioner uploads an image and initiates the analysis by clicking the Predict

button, the model processes the image and the results are then displayed in a dedicated

area of the interface, as shown in Figure 5.2.

The results are conveyed clearly, showing both the type of lung cancer detected (adenocarcinoma

in the example shown in Figure 5.2) and the probability associated with the

prediction, indicated as 99.93%. Additionally, the results section includes a brief description

of the identified cancer type, offering essential information on the particular sub-type of

non-small cell lung cancer (NSCLC) diagnosed. A timestamp is also provided, which can be

crucial for record-keeping and auditing processes.

Chapter 6

Discussion of Results

6.1 Introduction

The objective of this research was to develop a Convolutional Neural Network model for lung

cancer classification through CT scan analysis by fine-tuning the ResNet50 pre-trained

model. This section provides a detailed discussion of the results obtained from the study

which includes examination of the findings, interpretation of outcomes, and a comprehensive

analysis of the data. Both training and validation accuracy are quite close throughout the

training process, suggesting that the model’s predictions should be reliable when used in

real-world scenarios, like helping doctors to diagnose lung cancer types from CT scans.

6.2 Image Pre-processing

6.2.1 Normalization

The CT scan images were normalized. Normalization ensures that the input image has a

similar distribution to the images that the network was originally trained on, which helps the

model to converge faster and perform better. A research study by Li et al. (2021) highlights

feature normalization and data augmentation techniques for image classification tasks.

It concludes that normalization is commonly applied when working with models that have

been pre-trained.

6.2.2 Resizing

Resizing images to a consistent size of 224x224 pixels is essential for uniformity, ensuring

that each input fed into the Convolutional Neural Network (CNN) is standardized. It also

ensures compatibility with the pre-trained ResNet50 model (Koonce and Koonce, 2021), which

is optimized for images of this specific dimension, allowing for the effective application of

the model’s pre-learned weights and biases to the image dataset used in this research study.
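The resizing step can be illustrated with a small, dependency-free sketch. The thesis does not say which interpolation method was used, so nearest-neighbour sampling is chosen here purely for brevity. Image libraries commonly default to smoother methods such as bilinear interpolation. `resize_nearest` is a hypothetical helper.

```python
def resize_nearest(image, out_height=224, out_width=224):
    """Resize a 2-D image (list of rows) with nearest-neighbour sampling."""
    in_height, in_width = len(image), len(image[0])
    return [
        [
            image[row * in_height // out_height][col * in_width // out_width]
            for col in range(out_width)
        ]
        for row in range(out_height)
    ]


# A toy 2x2 "scan" upsampled to the 224x224 input size used in this study.
toy_scan = [[0, 1], [2, 3]]
resized = resize_nearest(toy_scan)
```

Whatever the interpolation method, every image leaves this step with the same height and width, which is what allows images to be stacked into batches.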

6.2.3 CT Scan Image Exploration

The dataset for CT scan images consisted of four distinct classes: normal lung, adenocarci-

noma, squamous cell carcinoma, and large cell carcinoma types. The dataset was partitioned

into three subsets: 70% for training, 15% for testing, and 15% for validation purposes.
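A 70/15/15 partition of this kind can be sketched as follows. The file names, the fixed seed, and the `split_dataset` helper are illustrative assumptions. The thesis does not state whether the split was stratified by class, so this sketch uses a plain shuffled split.

```python
import random


def split_dataset(items, train_frac=0.70, test_frac=0.15, seed=42):
    """Shuffle items and split them into train, test, and validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remainder, about 15%
    return train, test, validation


# Hypothetical file names standing in for the CT scan images.
paths = [f"scan_{i:04d}.png" for i in range(1000)]
train, test, validation = split_dataset(paths)
```

Because the three lists are slices of one shuffled copy, no image can appear in more than one subset.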

The images were converted to tensors. Tensors serve as the standard input format for deep

learning models, enabling the efficient handling and manipulation of multi-dimensional data

arrays within the neural network framework. Studies such as Panagakis et al. (2021)

highlight the scalability benefits and techniques of converting images to tensors

in computer vision tasks.
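To make the tensor conversion concrete, the sketch below rearranges a height x width x channel image of 0-255 integers into a channel-first layout of floats in [0, 1]. This mirrors what common conversion utilities, such as the `ToTensor` transform in torchvision, do. The thesis does not name the utility it used, so that correspondence is an assumption, and nested lists stand in for a real tensor type.

```python
def to_channel_first(image_hwc):
    """Convert an H x W x C image of 0-255 integers to C x H x W floats in [0, 1]."""
    height, width = len(image_hwc), len(image_hwc[0])
    channels = len(image_hwc[0][0])
    return [
        [[image_hwc[r][c][ch] / 255.0 for c in range(width)] for r in range(height)]
        for ch in range(channels)
    ]


# A 1 x 2 RGB image: one pure red pixel and one white pixel.
tiny = [[(255, 0, 0), (255, 255, 255)]]
chw = to_channel_first(tiny)
```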

The output of the data pre-processing is illustrated in Figure 6.1.

Figure 6.1: CT scan images of four distinct classes: normal lung, adenocarcinoma, squamous
cell carcinoma, and large cell carcinoma types.



6.3 Model Analysis and Performance Metrics

6.3.1 Summary of Model Performance

As a result of this fine-tuning process, the model demonstrated notable performance,

achieving a training loss of 0.001232% and an accuracy of 98.86%, with the peak validation

accuracy reaching 88.4% at epoch 171. The entire training process took 1016 minutes and

35 seconds, reflecting the extensive learning and adaptation the model underwent to achieve

these results.

A summary of the performance metrics of the lung cancer classification model is shown in Table 6.1 for training

performance and Table 6.2 for validation performance below:

Table 6.1: Training Performance Metrics

Metric                  Value
Epoch                   117
Epoch Time (seconds)    8.830657
Accuracy (%)            95.7586
Loss                    0.1292
Recall (%)              88.9346
Precision (%)           90.7869
F1 Score (%)            89.6074

Table 6.2: Test Performance Metrics

Metric                  Value
Accuracy (%)            87.5000
Loss                    1.1573
Recall (%)              78.2991
Precision (%)           80.9746
F1 Score (%)            77.4778
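For a four-class problem, precision, recall, and F1 must be aggregated across classes. The thesis does not state the averaging scheme, so the sketch below assumes macro averaging (an unweighted mean of per-class scores). The `macro_metrics` helper and the confusion matrix are hypothetical and are not the study's data.

```python
def macro_metrics(confusion):
    """Return macro-averaged (precision, recall, F1) from a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column total
        actual_k = sum(confusion[k])                          # row total
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical 2-class confusion matrix, for illustration only.
example = [[8, 2],
           [1, 9]]
precision, recall, f1 = macro_metrics(example)
```

Under macro averaging, F1 is the mean of the per-class F1 scores, so it need not lie between the averaged precision and recall. That would be consistent with the F1 score in Table 6.2 falling below both the reported precision and recall, although this reading is an inference rather than something the thesis confirms.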

6.3.2 Training and Validation Loss

Figure 6.2 illustrates the training and validation loss of the model across

200 epochs.



Figure 6.2: Training and Validation Loss

The training loss shows a sharp initial decrease and then levels off to a stable, low value,

which suggests that the model is effectively learning from the training dataset without

overfitting. The validation loss also decreases and remains low, indicating that the model

generalizes well to new, unseen data.
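A visual reading of loss curves can be complemented by a simple numerical check. The sketch below reports the epoch with the lowest validation loss and the gap between validation and training loss at the final epoch. A gap that widens over training is a common sign of overfitting. The loss histories are hypothetical and are not the values behind Figure 6.2, and `summarize_losses` is an illustrative helper.

```python
def summarize_losses(train_loss, val_loss):
    """Return (best_epoch, final_gap): the 1-based epoch with the lowest
    validation loss and validation minus training loss at the final epoch."""
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
    final_gap = val_loss[-1] - train_loss[-1]
    return best_epoch, final_gap


# Hypothetical loss histories, for illustration only.
train_history = [1.20, 0.60, 0.30, 0.15, 0.10]
val_history = [1.30, 0.80, 0.55, 0.50, 0.58]
best_epoch, final_gap = summarize_losses(train_history, val_history)
```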

6.3.3 Accuracy

Figure 6.3 depicts the accuracy of the lung cancer classification model over 200 training

epochs.



Figure 6.3: Accuracy

The accuracy on the training data (the images the model learned from) increases sharply at

first and then levels off, remaining high. Validation accuracy also rises quickly, then fluctuates

somewhat but generally maintains an upward trend.

6.3.4 Precision

In the precision curves shown in Figure 6.4, both the training and validation precision start

at similar levels and increase over time, with the training precision consistently higher than

the validation precision. The curves suggest the model is reliable in identifying true cases of

lung cancer without many false positives. Precision represents the model’s ability to correctly

label an image as cancerous, and the high values imply a high trustworthiness of positive

predictions by the model.



Figure 6.4: Precision

6.3.5 Recall

For recall, shown in Figure 6.5, the results show an increase in performance over time,

eventually leveling off. The recall measures how well the model identifies all actual cases of

lung cancer. The graph indicates that as the model is trained, it becomes better at detecting

the majority of lung cance