

DOI: 10.1117/12.2216307


Computer aided lung cancer diagnosis with deep learning algorithms

Wenqing Suna, Bin Zhengb, c, Wei Qiana, c

a) Medical Imaging and Informatics Laboratory, Department of Electrical & Computer Engineering,

University of Texas, El Paso, Texas, United States

b) College of Engineering, University of Oklahoma, Norman, Oklahoma, United States

c) Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, China

Abstract

Deep learning is considered a popular and powerful method for pattern recognition and classification. However, there are few deep structured applications in the medical imaging diagnosis area, because large datasets are not always available for medical images. In this study we tested the feasibility of using deep learning algorithms for lung cancer diagnosis with cases from the Lung Image Database Consortium (LIDC) database. The nodules on each computed tomography (CT) slice were segmented according to marks provided by the radiologists. After down-sampling and rotation we acquired 174,412 samples of 52 by 52 pixels each, with the corresponding truth files. Three deep learning algorithms were designed and implemented: the Convolutional Neural Network (CNN), Deep Belief Networks (DBNs), and the Stacked Denoising Autoencoder (SDAE). To compare the performance of the deep learning algorithms with a traditional computer-aided diagnosis (CADx) system, we designed a scheme with 28 image features and a support vector machine. The accuracies of CNN, DBNs, and SDAE are 0.7976, 0.8119, and 0.7929, respectively; the accuracy of our traditional CADx is 0.7940, which is slightly lower than CNN and DBNs. We also noticed that the mislabeled nodules using DBNs are 4% larger than those using the traditional CADx; this might result from the down-sampling process losing some size information of the nodules.

Key Words: lung cancer, deep learning, computed tomography, computer aided diagnosis

(CADx), Convolutional Neural Network (CNN), Deep Belief Networks (DBNs), Stacked

Denoising Autoencoder (SDAE)

1. Introduction:

Deep learning is a new subfield of machine learning pioneered by Hinton [1] and inspired by the architecture of the human brain. By learning deep, layered and hierarchical models of data, deep learning algorithms can outperform traditional machine learning models. Even ten years ago, most people still thought this kind of deep structured algorithm could only be used for simple image classification tasks such as handwritten digit recognition. With the development of deep learning algorithms, however, many research groups have successfully applied them to more complicated classification tasks. In the ImageNet LSVRC-2012 contest, the winning group used a deep learning algorithm to classify 1.2 million high-resolution images into 1000 different classes with an error rate of 15.3%, compared to 26.2% reported by the second-best group [2]. In other contests, deep learning algorithms won the MICCAI 2013 Grand Challenge and the ICPR 2012 Contest on Mitosis Detection [3]. In recent years, some researchers have used convolutional neural networks (CNNs) to detect clustered microcalcifications in digital breast tomosynthesis, with promising results [4] [5].

Medical Imaging 2016: Computer-Aided Diagnosis, edited by Georgia D. Tourassi, Samuel G. Armato III,

Proc. of SPIE Vol. 9785, 97850Z · © 2016 SPIE · CCC code: 1605-7422/16/$18 · doi: 10.1117/12.2216307

To the best of our knowledge, no literature has reported a purely data-driven approach to classifying lung cancer lesion images. In this study, we implemented three different deep learning algorithms and compared them with a traditional image-feature-based CAD system. All the algorithms were applied to the public Lung Image Database Consortium and Image Database Resource Initiative (LIDC/IDRI) database; the data description and algorithm designs are detailed below.

2. Materials and methods:

2.1 Data:

There are 1018 lung cases in the LIDC database, collected from seven academic centers and eight medical imaging companies. Four radiologists independently reviewed each CT scan and marked suspicious lesions with malignancy ratings. There are five malignancy levels, ranging from 1 to 5: levels 1 and 2 represent benign cases and levels 4 and 5 denote malignant cases.

For every nodule, we removed the top and bottom layers of each cube, because their sizes and shapes may differ significantly from the rest of the layers and they are not representative of the nodule. For the remaining layers, the nodule areas were segmented based on the union of the four radiologists' truth files. If the segmented region of interest (ROI) could fit into a 52 by 52 pixel rectangle, it was placed in the center of the box; ROIs exceeding this size were down-sampled to 52 by 52 pixels. Each ROI was then rotated to four different directions and converted to four vectors, each representing the ROI at one direction. All pixel values were down-sampled to 8 bits. From these 1018 cases we generated 174,412 vectors, each with 2,704 elements. We calculated the average of the four radiologists' malignancy ratings, and the final truth file was based on these averages. The distribution of malignancy levels is shown in Table 1. All intermediate cases (level 3) were eliminated, leaving 114,728 vectors for classification; among them, 54,880 cases were benign and 59,848 were malignant.

Table 1: The distribution of malignancy likelihood level of each nodule.

Malignancy level    1        2        3        4        5
Amount              20500    34380    59684    26316    33532
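The ROI preparation described above (center or down-sample each nodule ROI into a 52 by 52 box, then rotate it to four directions and flatten each view into a 2,704-element vector) can be sketched as follows. The paper does not state the interpolation method or the rotation angles, so nearest-neighbour resampling and 90-degree rotations are assumptions here:

```python
import numpy as np

def preprocess_roi(roi, target=52):
    """Fit an ROI into a target x target box, down-sampling if it is larger."""
    h, w = roi.shape
    if h > target or w > target:
        # Nearest-neighbour down-sampling (an assumption; not stated in the text).
        rows = (np.arange(target) * h / target).astype(int)
        cols = (np.arange(target) * w / target).astype(int)
        roi = roi[np.ix_(rows, cols)]
        h, w = roi.shape
    # Place the (possibly resampled) ROI in the center of the box.
    box = np.zeros((target, target), dtype=roi.dtype)
    top, left = (target - h) // 2, (target - w) // 2
    box[top:top + h, left:left + w] = roi
    return box

def augment(roi):
    """Rotate to four directions (assumed 90-degree steps) and flatten each
    view to a single 2704-element vector (52 * 52 = 2704)."""
    return [np.rot90(roi, k).ravel() for k in range(4)]

sample = preprocess_roi(np.ones((80, 60)))  # an oversized ROI gets down-sampled
vectors = augment(sample)
```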

2.2 Methods:

In this study, three deep learning models, the Convolutional Neural Network (CNN), Deep Belief Networks (DBNs), and the Stacked Denoising Autoencoder (SDAE), were implemented and compared on the same dataset. All the code and experiments were run on a machine with a 2.8 GHz Intel Core i7 processor and 16 GB of 1600 MHz DDR3 memory.

The architecture of our CNN contains 8 layers; apart from the input and output layers, every odd-numbered layer is a convolution layer and every even-numbered layer is a pooling and subsampling layer [6]. The three convolution layers use 12, 8 and 6 feature maps respectively, all connected to the pooling layers through 5 by 5 kernels. The batch size was set to 100 and the learning rate to 1 for 100 epochs. The details of each layer are shown in Figure 1.

Figure 1: The architecture of the CNN: input image → convolution (12 feature maps, kernel size 5) → pooling (scale 2) → convolution (8 feature maps, kernel size 5) → pooling (scale 2) → convolution (6 feature maps, kernel size 5) → pooling (scale 2) → output neuron.
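As a minimal sketch of this architecture (not the authors' implementation, which follows [6]), the shape of the data through the three convolution-pooling pairs can be traced with plain NumPy. Mean pooling and random weights are assumptions, and the fully connected output layer is omitted:

```python
import numpy as np

def conv_valid(x, kernels):
    """'Valid' 2-D convolution: x is (C_in, H, W), kernels is (C_out, C_in, 5, 5)."""
    c_out, c_in, k, _ = kernels.shape
    h, w = x.shape[1] - k + 1, x.shape[2] - k + 1
    out = np.zeros((c_out, h, w))
    for o in range(c_out):
        for i in range(h):
            for j in range(w):
                out[o, i, j] = np.sum(x[:, i:i + k, j:j + k] * kernels[o])
    return out

def pool2(x):
    """2x2 pooling at scale 2 (mean pooling is an assumption)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 52, 52))   # one 52 x 52 input ROI
for maps in (12, 8, 6):                # 12, 8 and 6 feature maps per layer
    k = rng.standard_normal((maps, x.shape[0], 5, 5)) * 0.1
    x = pool2(conv_valid(x, k))
# Shapes: 52 -> 48 -> 24 -> 20 -> 10 -> 6 -> 3, ending with 6 maps of 3 x 3.
```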

The second deep learning algorithm we tried was DBNs, obtained by training and stacking four layers of Restricted Boltzmann Machines (RBMs) in a greedy fashion. Each layer contains 100 units, and an RBM allows no interactions among the hidden units or among the visible units. The trained stack of RBMs was used to initialize a feed-forward neural network for classification. The output vector hk of layer k is computed from the output hk−1 of the previous layer k−1 following the formulation hk = tanh(bk + Wk hk−1), where bk is the vector of offsets and Wk is a matrix of weights [7].
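The layer-wise formulation above translates directly into code. This sketch shows only the forward pass through an already-initialized stack (the greedy RBM pre-training is not shown), with the input size and the four 100-unit layers taken from the description above and random weights standing in for trained ones:

```python
import numpy as np

def dbn_forward(v, weights, biases):
    """Forward pass through the stacked layers: h_k = tanh(b_k + W_k h_{k-1})."""
    h = v
    for W, b in zip(weights, biases):
        h = np.tanh(b + W @ h)
    return h

rng = np.random.default_rng(1)
sizes = [2704, 100, 100, 100, 100]   # input vector plus four 100-unit layers
weights = [rng.standard_normal((n, m)) * 0.01 for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]
h = dbn_forward(rng.standard_normal(2704), weights, biases)
```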

The third model we tested was a three-layer SDAE [8], in which each autoencoder is stacked on top of the previous one. The structure is similar to that of the DBNs mentioned above. There are 2000, 1000, and 400 hidden neurons in the three autoencoders, with a corruption level of 0.5. For both the DBNs and the SDAE the batch size was set to 100, and the learning rate was 0.01 for all 100 epochs.
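A single denoising-autoencoder step at corruption level 0.5 can be sketched as follows; masking noise, tied weights, and tanh units are assumptions here, since the paper does not specify them:

```python
import numpy as np

def corrupt(x, level, rng):
    """Masking noise: zero out a fraction `level` of the inputs."""
    return x * (rng.random(x.shape) >= level)

rng = np.random.default_rng(2)
x = rng.random(2704)                            # one flattened 52 x 52 ROI
W = rng.standard_normal((2000, 2704)) * 0.01    # first hidden layer: 2000 units
b, b_prime = np.zeros(2000), np.zeros(2704)
h = np.tanh(b + W @ corrupt(x, 0.5, rng))       # encode the corrupted input
x_hat = np.tanh(b_prime + W.T @ h)              # decode; trained to rebuild x
```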

To compare the performance of the deep learning algorithms with a traditional CAD scheme, we also tested the same dataset on our traditional CAD system. We extracted 35 features from each ROI, including 30 texture features and 5 morphological features; these features proved useful in our previous studies. The 30 texture features include 22 features from the Grey-Level Co-occurrence Matrix (uniformity, entropy, dissimilarity, inertia, inverse difference, correlation, homogeneity, autocorrelation, cluster shade, cluster prominence, maximum probability, sum of squares, sum average, sum variance, sum entropy, difference variance, difference entropy, information measures of correlation 1 and 2, maximal correlation coefficient, inverse difference normalized, and inverse difference moment normalized) and 8 wavelet features (mean and variance from the four combinations of high-pass and low-pass filters). The 5 morphological features are area, skewness, mean intensity, variance, and entropy. A kernel-based support vector machine (SVM) was then trained on the same training data.
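As an illustration of the texture-feature group, a grey-level co-occurrence matrix and three of the listed features can be computed with NumPy. The quantization to 8 grey levels and the single (0, 1) pixel offset are simplifying assumptions, not the authors' exact settings:

```python
import numpy as np

def glcm(img, levels=8, offset=(0, 1)):
    """Normalised grey-level co-occurrence matrix for one pixel offset."""
    q = (img.astype(float) / img.max() * (levels - 1)).astype(int)
    P = np.zeros((levels, levels))
    dr, dc = offset
    for r in range(img.shape[0] - dr):
        for c in range(img.shape[1] - dc):
            P[q[r, c], q[r + dr, c + dc]] += 1
    return P / P.sum()

def texture_features(P):
    """Three of the 22 GLCM features listed above."""
    i, j = np.indices(P.shape)
    nz = P[P > 0]
    return {
        "uniformity": float(np.sum(P ** 2)),
        "entropy": float(-np.sum(nz * np.log2(nz))),
        "homogeneity": float(np.sum(P / (1.0 + np.abs(i - j)))),
    }

rng = np.random.default_rng(3)
feats = texture_features(glcm(rng.integers(0, 256, (52, 52))))
```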

3. Results:

The CNN algorithm lets the computer learn its own features, instead of using human-designed features. There are 600, 400 and 300 feature maps in each layer; some examples of the features learned in layer 1 are shown in Figure 2. In the figure we can see different curves representing the characteristics of the lower left corners of the nodules. There are 12, 96, and 48 kernels in layers 1, 2, and 3 respectively, and Figures 3 and 4 are visualizations of the kernels in the first and last layers. The final mean squared error on the training data is 0.1347, and its change over iterations is shown in Figure 5.

Figure 4: The visualization of the 48 kernels in the third layer.

Figure 5: The change of mean squared error on the training data over iterations.

Figures 6 and 7 show visualizations of the weights in the first and second RBMs, a key module of the DBNs. Figure 8 shows a visualization of the weights of the neurons in the SDAE.

Figure 6: Visualization of the weights in the first layer RBM.

Figure 7: Visualization of 100 random weights in the second layer RBM.

Figure 8: Visualization of the weights of the neurons in the first layer of SDAE.

The comparison of algorithm accuracies is shown in Table 2. Among the three deep learning algorithms, we found that DBNs achieved the best performance in terms of accuracy on the testing data and mean squared error on the training data. For comparison, we also tested the traditional CADx system on the same dataset. Our feature set contains one group of texture features and one group of density features. When only the texture features are used, the accuracy is 0.7409 at the threshold of 0.6257, which maximizes the area of the largest rectangle under the ROC curve, and the AUC is 0.7914. When only the density features are used, the accuracy is 0.7814 and the AUC is 0.8342. When the features are combined, the accuracy is 0.7940 and the AUC is 0.8427.

To analyze the influence of nodule size on algorithm accuracy, we measured the nodule pixels of the mislabeled cases and report the mean and standard deviation in Table 3. Since the distribution of nodule areas does not follow a normal distribution, we conducted the Mann-Whitney test; the p-value for the test of any pair of the groups is less than 0.0001.
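The Mann-Whitney (rank-sum) statistic used above can be sketched as follows; in practice a statistics package would be used for the p-value, which this midrank-based sketch does not compute:

```python
import numpy as np

def mann_whitney_u(x, y):
    """U statistic for the first sample, with tied values given midranks."""
    combined = np.concatenate([x, y])
    order = combined.argsort(kind="mergesort")
    ranks = np.empty(len(combined))
    ranks[order] = np.arange(1, len(combined) + 1)
    # Average ranks over ties (midranks).
    for v in np.unique(combined):
        tie = combined == v
        ranks[tie] = ranks[tie].mean()
    return ranks[:len(x)].sum() - len(x) * (len(x) + 1) / 2

# When every value in x is smaller than every value in y, U is 0.
u = mann_whitney_u(np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0]))
```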

Table 2: Accuracy comparison of the three deep learning algorithms on training data and testing data.

Table 3: Mean and standard deviation of nodule size (in pixels).

Group                                   Mean of nodule size   Standard deviation of nodule size
Mislabeled cases in DBN                 200                   204
Mislabeled cases in traditional CADx    192                   196
All tested cases                        247                   245

4. Conclusions:

In this study, we tested the feasibility of using deep structured algorithms for lung cancer image diagnosis. We implemented and compared the performance of three different deep learning algorithms: CNN, DBNs, and SDAE; the highest accuracy we achieved was 0.8119, using DBNs. This accuracy is slightly higher than the 0.7940 obtained from the traditional CADx system. The comparison results demonstrate the great potential of deep structured algorithms and computer-learned features in the medical imaging area.

Defining the size of the ROI is a very important step in applying deep learning algorithms to lung image diagnosis. In many other image recognition tasks, such as the ImageNet classification challenge, object size does not have a significant impact on the classification results, so all the images can simply be down-sampled to the same size. For lung cancer image diagnosis, however, the size of the nodules and how the nodule areas are cropped are important, because absolute nodule size is one of the most important measurements of malignancy likelihood. We compared the nodule size in pixels of the mislabeled cases using DBNs and the traditional CADx, and the mislabeled nodules using DBNs are 4% larger. One possible explanation is that the larger nodules had to be down-sampled to fit into our selected ROIs, which might lose some of the shape information.

This is a preliminary study of using deep learning algorithms to diagnose lung cancer, and the results showed very promising performance. In the future, we will test more deep structured schemes for lung cancer diagnosis and look for more efficient ways to minimize the down-sampling effect.

References:

[1] Hinton, G. E., Osindero, S., & Teh, Y. W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18(7), 1527-1554.

[2] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep

convolutional neural networks. In Advances in neural information processing systems (pp. 1097-

1105).

[3] Cireşan, D. C., Giusti, A., Gambardella, L. M., & Schmidhuber, J. (2013). Mitosis detection

in breast cancer histology images with deep neural networks. In Medical Image Computing and

Computer-Assisted Intervention–MICCAI 2013 (pp. 411-418). Springer Berlin Heidelberg.

[4] Samala, R. K., Chan, H. P., Lu, Y., Hadjiiski, L. M., Wei, J., & Helvie, M. A. (2014). Digital

breast tomosynthesis: computer-aided detection of clustered microcalcifications on planar

projection images. Physics in medicine and biology, 59(23), 7457.

[5] Samala, R. K., Chan, H. P., Lu, Y., Hadjiiski, L. M., Wei, J., & Helvie, M. A. (2015).

Computer-aided detection system for clustered microcalcifications in digital breast

tomosynthesis using joint information from volumetric and planar projection images. Physics in

medicine and biology, 60(21), 8457.

[6] Palm, R. B. (2012). Prediction as a candidate for learning deep hierarchical models of data.

Technical University of Denmark.

[7] Bengio, Y. (2009). Learning deep architectures for AI. Foundations and trends® in Machine

Learning, 2(1), 1-127.

[8] Bengio, Y., Lamblin, P., Popovici, D., & Larochelle, H. (2007). Greedy layer-wise training of

deep networks. Advances in neural information processing systems, 19, 153.

