LCIT School of Pharmacy, Bilaspur, Chhattisgarh
Lung cancer is a leading cause of cancer-related deaths worldwide. Screening of high-risk individuals with low-dose CT scans is now being implemented in the United States, and other countries are expected to follow soon. In CT lung cancer screening, many millions of CT scans need to be analyzed, which is an enormous burden for radiologists. There is therefore considerable interest in developing computer-aided systems to optimize screening. The goal of this work is to detect lung cancer at an early stage, which requires locating pulmonary nodules, the early manifestation of lung cancer.
Lung cancer is a leading cause of death worldwide. The National Lung Screening Trial (NLST), a randomized controlled trial in the U.S. that included more than 50,000 high-risk subjects, showed that lung cancer screening using annual low-dose computed tomography (CT) reduces lung cancer mortality by 20% in comparison with annual screening with chest radiography [1]. In 2013 the U.S. Preventive Services Task Force (USPSTF) gave low-dose CT screening a grade B recommendation for high-risk individuals [2], and in early 2015 the U.S. Centers for Medicare and Medicaid Services (CMS) approved CT lung cancer screening for Medicare recipients. As a result of these developments, lung screening programs using low-dose CT are being implemented in the United States and other countries. Computer-aided detection (CAD) of pulmonary nodules can play an important role when screening is implemented on a large scale.
To this end, different approaches have been used to classify cancerous nodules in CT scans. Typically they start by generating candidate nodules, which are then classified on the basis of predetermined features designed to differentiate between nodules and non-nodules. Feature extraction in these approaches focuses mainly on intensity distribution and nodule geometry. They require segmentation of the nodule region, which may be inaccurate because of the limitations of image processing techniques and the variability of nodule sizes and intensities. Although these approaches have achieved reasonable nodule detection results, there is still considerable room for improvement, given how sensitive detection must be and how serious the disease is.
It requires two steps to be done which are mentioned below:
Objective:
The objective of this project is to develop a system that detects and classifies lung nodules in low-dose CT images using a 3D convolutional neural network.
SWOT Analysis:
Strengths:
Threats:
Lung Cancer Detection on CT Scan Images:
Lung cancer (LC) is the second most common cancer in both men and women in Europe and in the United States and represents a major economic issue for health care systems, accounting for about 12.7% of all new cancer cases per year and 18.2% of cancer deaths. In particular, each year there are approximately 1,095,000 new cancer cases and 951,000 cancer-related deaths in men and 514,000 new cases and 427,000 deaths in women. Lung cancer is caused by the uncontrolled, irregular growth of cells in lung tissue. These lung tissue abnormalities are often called lung nodules. They are small, roughly spherical masses of tissue, usually about 5 to 30 millimetres in size. In general, they can be categorized into four groups: juxta-vascular, well-circumscribed, pleural tail, and juxtapleural. Figure 1 shows some examples of these categories. Pulmonary nodules characterize the early stage of lung cancer.
REVIEW OF EXISTING NODULE DETECTION METHODS
In the literature, several methods have been proposed for automated and semi-automated detection of pulmonary nodules [59]. All of these works involve four steps to detect a pulmonary nodule: pre-processing, extraction of nodule candidates, reduction of false positives, and classification. Figure 2 shows these steps in detail.
The next part focuses on the different studies involving these steps.
Computed tomography (CT) is considered one of the best methods for diagnosing pulmonary nodules. It uses x-rays to obtain structural and functional information about the human body. However, CT image quality depends strongly on the radiation dose. Image quality increases with the radiation dose, but so does the quantity of x-rays absorbed by the lungs. To limit the risk to patients, radiologists are obliged to reduce the radiation dose, which degrades image quality and introduces noise into lung CT images. The pre-processing step aims to reduce this noise. Different filtering techniques have been proposed in the literature for this purpose, such as median filtering, Wiener filtering, Gaussian filtering, bilateral filtering, and a specific high-pass filter. Many other works combine median filters with Laplacian filters in a differential technique, which subtracts a nodule-suppressed image (obtained through a median filter) from a signal-enhanced image (obtained through a Laplacian matched filter with a spherical profile). The resulting difference image, which contains the nodule-enhanced signal, is then used in the next stages.
Segmentation of the lung regions is the second stage of the processing scheme. It refers to the process of partitioning the pre-processed CT image into multiple regions to separate the pixels or voxels corresponding to lung tissue from the surrounding anatomy. Various approaches have been used for lung segmentation, and they can be categorized into two main groups: 2D approaches and 3D approaches.
In this section, we systematically review the state of the art in segmentation methods for lung CT images. Because of the large number of segmentation methods, we have categorized them into five intuitive groups for easier comprehension: thresholding-based, stochastic, region-based, contour-based, and learning-based methods, as shown in Figure 3.
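The filtering step described above can be illustrated with a minimal sketch. It is not taken from any of the cited works: the function names are hypothetical, the slice is a toy 5 x 5 grid, and the unfiltered slice stands in for the signal-enhanced image (the Laplacian matched filter is not implemented here).

```python
from statistics import median

def median_filter_2d(image, radius=1):
    """Return a median-filtered copy of a 2D list-of-lists image.

    Border pixels use only the neighbours that fall inside the image.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = median(window)
    return out

def difference_image(enhanced, suppressed):
    """Subtract the nodule-suppressed image from the enhanced image, pixel by pixel."""
    return [[e - s for e, s in zip(e_row, s_row)]
            for e_row, s_row in zip(enhanced, suppressed)]

# Toy slice: uniform background at -800 HU with a single bright pixel.
slice_hu = [[-800.0] * 5 for _ in range(5)]
slice_hu[2][2] = 40.0

suppressed = median_filter_2d(slice_hu)           # small bright structure removed
diff = difference_image(slice_hu, suppressed)     # assumption: original used as "enhanced"
```

In this toy case the median filter removes the isolated bright pixel, so the difference image is zero everywhere except at that pixel.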
Fig. 3. 2D-based segmentation methods for lung CT images
Several approaches to volumetric lung nodule segmentation exist in the literature. They can be classified into five categories: thresholding [96], mathematical morphology, region growing, deformable models, and dynamic programming, as shown in Figure 4. The thresholding approach was adopted by Zhao et al. [96] and Yankelevitz et al. [91][92], where the appropriate threshold values are derived either by applying K-means clustering [91][92] or by applying the average gradient magnitude algorithm [96]. According to Diciotti et al. [21], segmentation algorithms should be evaluated on large public databases with a well-defined ground truth for verification. Several of the existing studies used private databases, so performance comparison between methods is limited [59]. Usually, a nodule appears in several slices of a CT scan. In the 2D method, the slice in which the nodule appears largest is selected for analysis to differentiate between benign and malignant nodules. Compared with the 2D method, the extra dimension dramatically increases the operational complexity and computational cost of processing the entire 3D nodule volume. Thus, to reduce both the computational cost and the radiation dose, the study in this paper tries to distinguish between benign and malignant nodules by using a 2D approach on a single post-contrast CT scan [64].
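As an illustration of how a threshold can be derived by clustering, the sketch below runs a two-cluster K-means on one-dimensional intensity values and places the threshold midway between the two cluster means. It is a generic sketch, not the exact procedure of the cited authors; the function name and the sample intensities are hypothetical.

```python
def two_means_threshold(values, iterations=50):
    """Estimate an intensity threshold with 1D K-means (k = 2).

    The two centroids start at the minimum and maximum intensity. Each pass
    assigns values to the nearer centroid (equivalent to splitting at the
    midpoint) and recomputes the centroids as the cluster means.
    """
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        threshold = (lo + hi) / 2.0
        low = [v for v in values if v <= threshold]
        high = [v for v in values if v > threshold]
        if not low or not high:
            break  # all values identical: nothing to separate
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high)
        if new_lo == lo and new_hi == hi:
            break  # converged
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0

# Hypothetical voxel intensities (HU): aerated lung versus soft tissue.
intensities = [-900, -880, -860, -850, 20, 35, 50, 60]
threshold = two_means_threshold(intensities)
mask = [v > threshold for v in intensities]  # True marks candidate tissue voxels
```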
C. Nodule extraction and classification
Lung nodule detection aims to identify the location of nodules, if any exist. The most widely proposed approach is detection by classification and clustering. This approach comprises four categories: fuzzy and neural network methods, K-nearest neighbour, support vector machines, and linear discriminant analysis, as shown in Figure 5. The same approach was also adopted by Kostis et al. [52], Bong et al. [12], and Hosseini et al. [43]. In [12], Bong et al. proposed and applied a state-of-the-art fuzzy hybrid scatter search to segment lung computed tomography (CT) images for lung nodule detection. It used a fuzzy clustering method with evolutionary optimization of a population size. Later, in [43], the authors employed two fuzzy methods for the lung nodule CAD application: the Mamdani model and the Sugeno model of the fuzzy logic system. These methods were implemented, and the classification results were compared and evaluated through ROC curve analysis and root mean squared error.
Fig. 5. An overview of the Nodule classification methods
Recently, Akram et al. implemented a novel automated pulmonary nodule detection system using artificial neural networks based on hybrid features consisting of 2D and 3D geometric and intensity-based statistical features [2]. A nearest cluster method was used by Ezoe et al. [25] and Tanino et al. [81] to classify the detected nodule candidates. Zhao et al. [96] applied boosting of the KNN classifier to estimate the probability density function of the intensity values of the trained ground-glass opacity nodules. In [50], Kockelkorn et al. designed a user-interactive framework for lung segmentation with a k-nearest-neighbour (KNN) classifier. Subsequently, in [66], Mabrouk et al. selected a total of 22 image features from the enhanced CT image; a Fisher score ranking method was then used to select the best ten features, and a K-nearest neighbour classifier was used to perform classification.
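The combination of Fisher score ranking and K-nearest neighbour classification mentioned above can be sketched as follows. This is a generic two-class illustration, not the implementation of [66]: the toy data has three features instead of 22, only the single best feature is kept instead of ten, and all names are hypothetical.

```python
from collections import Counter
from math import dist
from statistics import mean, pvariance

def fisher_scores(samples, labels):
    """Two-class Fisher score per feature: (mu0 - mu1)^2 / (var0 + var1)."""
    scores = []
    for j in range(len(samples[0])):
        a = [s[j] for s, y in zip(samples, labels) if y == 0]
        b = [s[j] for s, y in zip(samples, labels) if y == 1]
        denom = pvariance(a) + pvariance(b)
        scores.append((mean(a) - mean(b)) ** 2 / denom if denom > 0 else 0.0)
    return scores

def top_features(scores, k):
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def project(sample, keep):
    """Keep only the selected feature columns of one sample."""
    return [sample[j] for j in keep]

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples nearest to the query."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: dist(p[0], query))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]

# Toy feature vectors: feature 0 separates the classes, the others barely do.
X = [[1.0, 5.0, 0.20], [1.2, 1.0, 0.10], [0.9, 3.0, 0.30],
     [3.0, 4.0, 0.25], [3.2, 2.0, 0.15], [2.9, 6.0, 0.35]]
y = [0, 0, 0, 1, 1, 1]

keep = top_features(fisher_scores(X, y), k=1)
train = [project(s, keep) for s in X]
prediction = knn_predict(train, y, project([3.1, 1.0, 0.10], keep))
```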
CONCLUSION
This review gives an overview of current detection techniques for CT images that may help researchers choose a method. Lung analysis techniques have certainly improved over the last decade. However, issues remain to be solved, such as developing new and better contrast enhancement techniques and selecting better criteria for performance evaluation.
RELATED CONCEPTS
What is a Nodule?
A nodule is a small round or oval-shaped growth in the lung. It may also be called a “spot on the lungs” or a “coin lesion”. Nodules are smaller than 3 centimeters in diameter. If the growth is larger than that, it is called a pulmonary mass and is more likely to represent cancer [3]. Nodules are of two types, malignant and benign; malignant nodules are cancerous. Unfortunately, no apparent symptoms are associated with their presence, and they can only be detected with computed tomography or traditional X-rays. A nodule can be seen in the figure shown below:
Classification of Nodules:
Lung nodules can be divided into solid nodules and subsolid nodules. Subsolid nodules can be further classified as nonsolid nodules and part-solid nodules. This classification is significant because different nodules require different approaches for their detection, measurement, and management [5].
Fig: 6 - Classification of Lung Nodules
SYSTEM ANALYSIS AND DESIGN
Use case Diagram:
The picture above shows a use case of how the proposed system is to be used by a radiologist. The user loads a CT image in which nodule detection is required. The system processes it and shows the results back to the radiologist. Processing is a chain of sub-processes that starts with preprocessing of the CT image, followed by extraction of the 3D patches needed for detection. The results are then computed, stored, and returned to the request initiator.
PROJECT IMPLEMENTATION
Patch Extraction:
Non-nodules: 3D patches of non-nodules are cut from CT scans using the provided annotations. There are more than 5 lakh (500,000) annotations for this class. The patch size is 32*32*32 voxels, i.e., 32 in each of the three dimensions.
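A minimal sketch of how a fixed-size patch could be cut around an annotated position is given below. It assumes the annotation has already been converted to integer voxel indices in (z, y, x) order and that the volume is a nested list; shifting the patch inward at the volume border is an assumption (padding would be an alternative), and the function names are hypothetical.

```python
PATCH_SIZE = 32  # voxels per side, as stated in the text

def patch_bounds(center, volume_shape, size=PATCH_SIZE):
    """Return (start, stop) per axis for a size^3 patch centred on `center`.

    The patch is shifted inward where it would cross the volume border
    (assumption: shifting rather than padding).
    """
    bounds = []
    for c, dim in zip(center, volume_shape):
        if dim < size:
            raise ValueError("volume is smaller than the patch along one axis")
        start = min(max(c - size // 2, 0), dim - size)
        bounds.append((start, start + size))
    return bounds

def extract_patch(volume, center, size=PATCH_SIZE):
    """Cut a cubic patch out of a volume stored as nested lists [z][y][x]."""
    shape = (len(volume), len(volume[0]), len(volume[0][0]))
    (z0, z1), (y0, y1), (x0, x1) = patch_bounds(center, shape, size)
    return [[row[x0:x1] for row in plane[y0:y1]] for plane in volume[z0:z1]]

# Hypothetical annotation close to the x border of a 200 x 512 x 512 scan.
bounds = patch_bounds((100, 256, 10), (200, 512, 512))
```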
Patch Normalization:
3D patches of both nodules and non-nodules are cut from different CT scans acquired on different CT machines and therefore have variable CT intensity values. To deal with this problem, all 3D patches are normalized as:
Min Value = -400
Max Value = 1000
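The text gives only the two bounds, not the formula itself. One common reading, sketched here as an assumption, is to clip each intensity to the range [-400, 1000] and rescale it linearly to [0, 1]. The function names are hypothetical.

```python
MIN_VALUE = -400.0   # lower bound from the text
MAX_VALUE = 1000.0   # upper bound from the text

def normalize_value(value, lo=MIN_VALUE, hi=MAX_VALUE):
    """Clip one intensity to [lo, hi] and rescale it to [0, 1].

    Assumption: min-max scaling with clipping; the source states only the bounds.
    """
    clipped = min(max(value, lo), hi)
    return (clipped - lo) / (hi - lo)

def normalize_patch(patch):
    """Apply the normalization to every voxel of a nested-list 3D patch."""
    return [[[normalize_value(v) for v in row] for row in plane] for plane in patch]

# Tiny 1 x 2 x 2 patch with values below, inside, and above the range.
example = normalize_patch([[[-1000.0, -400.0], [300.0, 2000.0]]])
```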
3D CNN Architecture:
The architecture proposed in paper [7] is simple but deep. A 32*32*32 input layer is used. It is followed by three convolutional layers with 32, 16, and 16 small 3*3*3 kernels, respectively. Each convolutional layer is followed by a max-pooling layer with overlapping 2*2*2 windows. Three fully connected layers with 64, 64, and 2 neurons, respectively, follow. Rectified linear units (ReLU) are used in each convolutional and fully connected layer. A dropout rate of 0.5 is applied after the first two fully connected layers.
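To make the layer sizes concrete, the sketch below traces the spatial size and the number of trainable parameters through the layers listed above. Several details are assumptions because the text does not state them: a single input channel, convolutions without padding and with stride 1, and "overlapping" pooling read as 2*2*2 windows with stride 1. The function names are hypothetical.

```python
def output_size(size, kernel, stride=1):
    """Spatial output size of an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1

def trace_network(input_size=32, conv_filters=(32, 16, 16), dense_units=(64, 64, 2)):
    """Return (layers, total_params) for the architecture described in the text.

    Assumptions: one input channel, 3*3*3 convolutions without padding,
    2*2*2 max-pooling with stride 1, biases in every layer.
    """
    size, channels, params = input_size, 1, 0
    layers = []
    for filters in conv_filters:
        params += (3 ** 3 * channels + 1) * filters   # weights plus one bias per filter
        size = output_size(size, kernel=3)
        layers.append(("conv3d", size, filters))
        size = output_size(size, kernel=2)            # overlapping pooling, stride 1
        layers.append(("maxpool3d", size, filters))
        channels = filters
    features = size ** 3 * channels                   # flattened feature vector
    for units in dense_units:
        params += (features + 1) * units
        layers.append(("dense", 1, units))
        features = units
    return layers, params

layers, total_params = trace_network()
spatial_sizes = [size for name, size, _ in layers if name != "dense"]
```

Under these assumptions the spatial size shrinks from 32 to 23 voxels per side, and almost all of the parameters sit in the first fully connected layer. A different padding or pooling stride would change these numbers.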
3D CNN Training & Testing:
In view of the input size and the available computational and memory capacity, the 3D CNN model is trained on 39 thousand samples split equally between nodule and non-nodule patches. The model was to be trained for 10 epochs and tested on 17 thousand samples split equally between both classes.
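Because non-nodule annotations far outnumber nodule annotations, an equally distributed training set implies some form of subsampling. The sketch below draws the same number of items from each class at random; the sampling strategy, the seed, and the function name are assumptions, since the text does not say how the balanced set was built.

```python
import random

def balanced_sample(nodules, non_nodules, total, seed=0):
    """Draw total // 2 items from each class and return shuffled (item, label) pairs.

    Label 1 marks nodules, label 0 marks non-nodules.
    """
    per_class = total // 2
    if per_class > min(len(nodules), len(non_nodules)):
        raise ValueError("not enough samples in the smaller class")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    chosen = [(item, 1) for item in rng.sample(nodules, per_class)]
    chosen += [(item, 0) for item in rng.sample(non_nodules, per_class)]
    rng.shuffle(chosen)
    return chosen

# Hypothetical patch identifiers: few nodules, many non-nodules.
nodule_ids = [f"nodule_{i}" for i in range(20)]
non_nodule_ids = [f"non_nodule_{i}" for i in range(500)]
training_set = balanced_sample(nodule_ids, non_nodule_ids, total=10)
```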
Baseline Results:
Because of the complexity and size of the problem, a system with large memory and high computational power was required, with uninterrupted power for approximately 40 days (4 days per epoch). To meet this need, an octa-core CPU capable of handling 16 threads at a time was used. The memory requirement was met with 64 GB of RAM installed in the system. For power backup, a shared UPS with approximately 1 hour of backup time was used. The system was kept at the university, and remote access rights were given to me so that it could be used from anywhere.
FUTURE RECOMMENDATIONS
Lung Segmentation:
Training the model after segmenting the lung region, rather than cutting patches from the original CT scans, can increase the accuracy of the 3D CNN model. In most cases, segmentation helps the classifier. The original CT scan is on the left, whereas the segmented lung is on the right. All nodules fall inside the lung region.
Parallel Computing Power Usage:
Parallel computing would be helpful and time-efficient for this complex problem. One can exploit the power of the thousands of cores in a GPU and finish the work much sooner.
On Disk Memory Usage:
Instead of keeping all data in RAM, one could use the hard drive as memory with the help of some Python packages. This will increase input/output operations, but it will free space in RAM for computation and for loading larger data.
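As one possible illustration, the sketch below stores patches in a flat binary file and reads a single patch back through a memory map, so that only the requested bytes are loaded. It uses the standard-library mmap and array modules rather than any specific package named in the text (none is named); the file layout, the small demo patch size, and the function names are assumptions.

```python
import mmap
import os
import tempfile
from array import array

PATCH_VOXELS = 4 * 4 * 4                       # small demo patch; 32**3 in the text
BYTES_PER_PATCH = PATCH_VOXELS * array("f").itemsize

def write_patches(path, patches):
    """Append each flattened patch to a binary file as 32-bit floats."""
    with open(path, "wb") as f:
        for patch in patches:
            array("f", patch).tofile(f)

def read_patch(path, index):
    """Read one patch by index through a read-only memory map."""
    start = index * BYTES_PER_PATCH
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        values = array("f")
        values.frombytes(mm[start:start + BYTES_PER_PATCH])
    return values.tolist()

with tempfile.TemporaryDirectory() as folder:
    store = os.path.join(folder, "patches.f32")  # hypothetical file name
    write_patches(store, [[float(i)] * PATCH_VOXELS for i in range(3)])
    second_patch = read_patch(store, 1)          # only this patch is read into RAM
```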
Saving Machine Stats with Third Party Tools:
To capture the state of the machine, third-party tools could be used that restart the system and run high-priority tasks on their own when the power returns.
It would be very handy if the results after every epoch were stored to a file. Then, if the electricity goes off, the weights could be loaded from the file and processing could continue instead of initializing everything from scratch. This would save a lot of time and processing.
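The idea of saving after every epoch and resuming after a power failure can be sketched as follows. The training step is a toy stand-in, the weights are a plain list stored as JSON, and all names are hypothetical; a real model would use its framework's own save and load routines.

```python
import json
import os
import tempfile

def save_checkpoint(path, epochs_done, weights):
    """Write the checkpoint to a temporary file, then swap it into place.

    os.replace is atomic, so a power cut cannot leave a half-written checkpoint.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"epochs_done": epochs_done, "weights": weights}, f)
    os.replace(tmp_path, path)

def load_checkpoint(path):
    """Return (epochs_done, weights), or (0, None) when no checkpoint exists."""
    if not os.path.exists(path):
        return 0, None
    with open(path) as f:
        state = json.load(f)
    return state["epochs_done"], state["weights"]

def train(path, total_epochs, train_one_epoch, initial_weights):
    """Train up to total_epochs, resuming from the checkpoint if there is one."""
    epochs_done, weights = load_checkpoint(path)
    if weights is None:
        weights = list(initial_weights)
    for epoch in range(epochs_done, total_epochs):
        weights = train_one_epoch(weights)
        save_checkpoint(path, epoch + 1, weights)
    return weights

def toy_epoch(weights):
    """Stand-in for one training epoch: nudges every weight by 1."""
    return [w + 1.0 for w in weights]

with tempfile.TemporaryDirectory() as folder:
    checkpoint = os.path.join(folder, "model.json")
    train(checkpoint, 3, toy_epoch, [0.0, 0.0])      # run that stops after 3 epochs
    resumed_from, _ = load_checkpoint(checkpoint)
    final_weights = train(checkpoint, 10, toy_epoch, [0.0, 0.0])  # continues at epoch 4
```

With epochs that take days, losing at most one epoch of work per interruption is the main benefit of this scheme.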
METHODOLOGY:
Subjects
We used chest CT examinations of 293 subjects participating in a lung cancer screening program that were obtained under an institutional review board-approved protocol (consent was obtained). The selected examinations were a subset from those ascertained as part of a larger study (Specialized Program of Research Excellence [SPORE] in Lung Cancer) designed to evaluate lung cancer screening with low-dose CT examinations. Examinations for this study were selected by the institutional principal investigator of the SPORE project, who assembled a limited data set enriched with examinations originally reported as depicting pulmonary nodules. The mean age of the subjects whose examinations were included in this study was 60.9 years (range, 50-80 years).
MDCT Data Acquisition
The CT examinations were performed using LightSpeed Plus 4-MDCT (n = 282) or LightSpeed Ultra 8-MDCT (n = 11) scanners (GE Healthcare). The helical CT scans were contiguous (non-overlapping) volume scans encompassing the entire lung area acquired with 2.5-mm section thickness in the axial plane. Images were reconstructed with 512 × 512 pixel matrices using the GE Healthcare lung reconstruction kernel. The low-dose CT acquisition protocol varied slightly depending on patient size: tube voltage range, 120-140 kVp; mean tube current, 29.7 ± 10.7 (SD) mAs; and range of pixel dimensions, 0.60-0.98 mm. The CT examinations were acquired with an end-inspiratory breath-holding protocol. A GE Healthcare Advantage Workstation running Advanced Lung Analysis 1 (ALA) software was used to review and rate the CT examinations. The workstation was placed in the main thoracic radiology reading room for convenience of the participating radiologists, and reviewers were notified by the project leader if they fell substantially behind in the planned interpretation schedule. The full functionality of the ALA software was available to the participating radiologists (e.g., window and level settings, zoom, cine mode, and maximum intensity projection [MIP]). Solid nodules were defined as any pulmonary (or pleural) lesion represented on a chest CT image (displayed on lung windows) as a sharply defined, discrete, and nearly circular soft-tissue-density opacity with a diameter measuring between 1.0 and 30.0 mm. Nonsolid nodules (e.g., ground-glass opacity) were defined in the same manner except that the density of the opacity was not solid or soft-tissue attenuation but less than soft tissue, allowing visualization of background structures (e.g., blood vessels). Partially solid nodules (i.e., mixed) were defined as a combination of solid and nonsolid nodules.
Reviewers were asked to mark the location of all three nodule types larger than 1.0 mm and to provide characterization information only for those larger than 3.0 mm. However, the question regarding calcification (calcified or noncalcified) was answered for all marked nodules regardless of size.
Data and Statistical Analysis
All pulmonary nodules detected by at least one of the three radiologists were tabulated and analyzed. This was done because there was no verified outcome for most of the cases (and nodules). A consensus score of the three reviewers was determined for nodule size, calcification, and clinical importance. Each reviewer's measured nodule size was defined as the maximum of the length and width on one individual section depicting the nodule. The consensus nodule size was computed as the average reported size as indicated by those reviewers detecting (marking) the nodule in question. If a reviewer did not score the length and width of an identified nodule (i.e., for nodules < 3 ...
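The size definitions above can be sketched in a few lines. The function names are hypothetical, and the sketch covers only the stated rules: a reviewer's size is the larger of length and width, and the consensus is the average over the reviewers who marked the nodule. How unscored sizes were handled is not recoverable from the text, so that case is not modelled.

```python
def reviewer_size(length_mm, width_mm):
    """A reviewer's nodule size: the larger of length and width on one section."""
    return max(length_mm, width_mm)

def consensus_size(measurements):
    """Average size over the reviewers who marked the nodule.

    `measurements` holds one (length, width) tuple per reviewer, or None
    for a reviewer who did not mark the nodule.
    """
    sizes = [reviewer_size(*m) for m in measurements if m is not None]
    if not sizes:
        return None
    return sum(sizes) / len(sizes)

# Hypothetical nodule marked by reviewers 1 and 3 only.
size_mm = consensus_size([(6.0, 4.0), None, (8.0, 9.0)])
```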
Relative reviewer agreement was evaluated for NCNs with clinical importance equal to or greater than 1 as follows: nodule-based, for individual nodules and negative examinations equally weighted; and examination-based, for positive examinations with one to six nodules and negative examinations with no marked nodules or those with more than six marked nodules. In the former, interobserver agreement was based on all nodules observed by any of the three observers; hence, a reviewer not involved in specific paired analyses could influence the measured agreement.
RESULTS AND DISCUSSION
The three reviewers identified a total of 1,317 pulmonary nodules in 293 CT examinations, with 16 examinations rated as negative by all three reviewers (Table 2).
Table 2: Nodules and Noncalcified Nodules (NCNs) with Some Suspicion of Malignancy (Clinical Importance ≥ 1)
The mean absolute percentages of differences (percentage of the mean size) between the reported sizes for the pairing of reviewers 1 and 2, 1 and 3, and 2 and 3 were 27.0% ± 23.2%, 16.3% ± 16.3%, and 30.0% ± 25.9%, respectively. Intraobserver agreement was poor in the 30 repeated examinations for the detection of individual nodules (highest κ = -0.035) but was good to excellent in the examination-based evaluation (i.e., all examinations with one or more detected nodules) (Table 3).
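A small sketch of the size comparison between two reviewers follows. It assumes that each difference is expressed as a percentage of the mean of the two reported sizes and that the value after the ± sign is a sample standard deviation; the function names and the example sizes are hypothetical.

```python
from statistics import mean, stdev

def percent_difference(size_a, size_b):
    """Absolute difference as a percentage of the mean of the two sizes."""
    return 100.0 * abs(size_a - size_b) / ((size_a + size_b) / 2.0)

def summarize_pair(paired_sizes):
    """Mean and sample standard deviation of the percentage differences."""
    diffs = [percent_difference(a, b) for a, b in paired_sizes]
    return mean(diffs), stdev(diffs)

# Hypothetical sizes (mm) reported by two reviewers for the same three nodules.
mean_pct, sd_pct = summarize_pair([(4.0, 6.0), (5.0, 5.0), (9.0, 11.0)])
```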
Reviewer 1 had the highest intraobserver agreement for the detection of individual nodules, but that reviewer also detected the lowest number of nodules. Conversely, reviewer 3 had the highest intraobserver agreement in the examination-based evaluation (κ = 0.889) and detected the highest number of nodules.
Table 3: Intraobserver Agreement for Marked Noncalcified Nodules (NCNs) with Some Suspicion of Malignancy (Clinical Importance ≥ 1) and Negative Examinations (N = 30) That Were Interpreted Twice
Interobserver agreement was poor for the detection of individual nodules and marginal for examination-based evaluation (i.e., for examinations with one or more detected nodules) (Table 4). The agreement between any pair of radiologists was less than 55% for the detection of individual NCNs with a clinical importance equal to or greater than 1 based on all nodules detected by the three reviewers. The interobserver agreement among the three reviewers was 18.9%. The agreement between reviewers 1 and 2 was the highest (κ = 0.120); those two reviewers detected the lowest number of nodules.
Table 4: Interobserver Agreement for Marked Noncalcified Nodules (NCNs) with Some Suspicion of Malignancy (Clinical Importance ≥ 1) and Negative Examinations
Table 5: Distribution of Missed (Not Marked) Noncalcified Nodules (NCNs) with Some Suspicion of Malignancy (Clinical Importance ≥ 1) by Rated Clinical Importance, Size, and Density
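For readers unfamiliar with chance-corrected agreement, the sketch below computes Cohen's kappa for two raters, assuming that this is the agreement statistic reported here. The ratings are invented for illustration and are not study data; the function name is hypothetical.

```python
def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(ratings_a)
    categories = set(ratings_a) | set(ratings_b)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    expected = sum(
        (ratings_a.count(c) / n) * (ratings_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)

# Invented example: 1 = nodule marked, 0 = not marked, for 50 candidate locations.
rater_1 = [1] * 20 + [0] * 15 + [1] * 5 + [0] * 10
rater_2 = [1] * 20 + [0] * 15 + [0] * 5 + [1] * 10
kappa = cohen_kappa(rater_1, rater_2)
```

In this invented example the raters agree on 70% of locations, but half of that agreement is expected by chance, which is why kappa is much lower than the raw agreement.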
DISCUSSION:
Detection of pulmonary nodules depicted on low-dose, thin-section CT examinations of the chest may become an important part of the daily tasks of many radiologists, both those who specialize in thoracic imaging and general radiologists. We observed a large intra- and interradiologist variability in the detection of individual pulmonary nodules among the three reviewers in terms of relative reviewer agreement. As expected, the more nodules detected by a reviewer, the greater the variability and the lower the agreement with oneself during repeated interpretations and with other radiologists.
CONCLUSION:
While seemingly a straightforward task, the effective and efficient detection of lung nodules on CT presents many challenges to the radiologist, which result in limited sensitivity and interobserver agreement for many nodule detection applications. Within the context of lung cancer screening, performance is impacted by the threshold size for actionability and the associated management protocol, which assures repeated exams to limit the time interval before another opportunity for detection arises. When serial CT exams are acquired to assess lesion growth and thus classify a finding as positive, performance is significantly better than for nodule detection in general. Nevertheless, in the interest of maximizing the effectiveness of interpretation, greater understanding of how radiologists detect lung nodules as well as how a spectrum of alternative visualization methods and CAD might be used systematically is needed. In light of the magnitude of lung cancer screening CT scans anticipated to be performed, these investigations should be prioritized.
REFERENCES:
Roshan Kumar, Rohit Kumar, Somprabha Madhukar, Shruti Rathore, Pulmonary Nodule Analysis For Lung Cancer Detection In Low Dose CT Image, Int. J. of Pharm. Sci., 2024, Vol 2, Issue 6, 228-240. https://doi.org/10.5281/zenodo.11482662