This thesis explores how established image processing and machine learning techniques
can be combined for computer-aided breast cancer detection in mammography images, with
the aim of finding a method that performs well and can assist the pathologist in making
a decision.
A concrete application is designed and implemented, comprising both preliminary image
processing and subsequent cancer detection using neural-network-based machine learning
algorithms. The application is evaluated on a set of mammograms, and the results are
presented and discussed in detail. The thesis is characterized by its use of a
combination of neural networks for breast cancer detection and by its use of real breast
cancer images obtained from the Digital Database for Screening Mammography (DDSM).
INTRODUCTION
Breast cancer is the most common form of cancer in women worldwide and affects an
average of 1.4 million people annually (Autier, et al., 2010); see Figure 1.1. According
to the International Agency for Research on Cancer (Ferlay, Shin, Bray, Forman, Mathers &
Parkin, 2010), it is also the most common cause of cancer death in women, accounting for
about one in eight cancer deaths. Each year, more than 150,000 women worldwide die from
breast cancer (Ferlay, Shin, Bray, Forman, Mathers & Parkin, 2010). Only about 1% of
breast cancers occur in men (Gunderman, 2006).
Survival rates and prognosis vary significantly with the stage and type of the cancer.
Treatment is more efficient and effective when the cancer is detected at an early stage,
before it advances to a more severe one.
Breast cancer mortality is very high compared with other types of cancer. Breast cancer
can be detected and analysed using imaging procedures such as diagnostic mammography
(X-ray imaging), ultrasound (sonography), magnetic resonance imaging and thermography.
Imaging-based cancer screening has been investigated for more than four decades.
Nevertheless, biopsy remains the most reliable way to determine whether cancer is
actually present. Among the biopsy methods, fine needle aspiration, vacuum-assisted
biopsy, core needle biopsy and surgical (open) biopsy (SOB) are the most common. All of
these techniques involve collecting cell or tissue samples, which are fixed on a glass
microscope slide for subsequent staining and microscopic examination.
Histopathological analysis is a highly time-consuming specialist task that depends on the
experience of the pathologist and is affected by factors such as fatigue and lapses of
attention. There is a pressing need for computer-aided diagnosis (CAD) to ease the
workload of pathologists by filtering out clearly benign regions, so that the specialists
can focus on the cases that are more difficult to diagnose. Considerable effort has
therefore been devoted to breast cancer (BC) histopathology image analysis, and
specifically to the automated classification of images as benign or malignant for
computer-aided diagnosis.
Pathology labs have begun to move towards a fully digital workflow, with digital slides
as the main component of this process. This was made possible by the introduction of
scanners for whole slide imaging (WSI), which enable cost-effective production of digital
representations of glass slides. Besides the many benefits in terms of storage and
browsing of the image data, one advantage of digital slides is that they enable the use
of image analysis techniques that aim to produce quantitative features to help
pathologists in their work. An automatic mitosis detection method with good performance
could reduce both the subjectivity and the tediousness of manual mitosis counting, for
instance by independently producing a mitotic activity score or by guiding the
pathologist to the region of the tissue with the highest mitotic activity. Automatic
feature selection is achieved using a novel feature weighting scheme. Feature weights
depend on the importance of a feature, and features with low weights are discarded. A new
generation of the forest (a new population of trees) is then created, which operates on
the reduced feature set. During the test phase, each tree of the trained forest votes
with its corresponding weight to perform the classification.
LITERATURE REVIEW
Figure 2.4 – Mass examples with diverse shapes and borders (from (Arnau, 2007)).
Depending on their morphology, masses have different probabilities of malignancy.
Ill-defined and spiculated boundaries carry a higher probability of malignancy (Arnau,
2007). A benign process is usually associated with circular or oval masses. However, the
great variability in the appearance of masses is an obstacle to correct mammographic
analysis (Mini & Thomas, 2003). Some masses can contain microcalcifications, as in
Figure 2.5, where a craniocaudal view of the right breast shows benign vascular
calcifications as well as two well-circumscribed masses containing “popcorn”
calcifications, classic for involuting fibroadenomas (Gunderman, 2006).
Figure 2.5 – A craniocaudal view of the right breast (from (Gunderman, 2006)).
When cancer spreads to other parts of the body through the blood and lymph circulation,
this is known as metastasis. When a ductal carcinoma invades the skin of the nipple, it
is called Paget's disease. Inflammatory breast cancer is an aggressive tumor that has
invaded the dermal lymphatics (Gunderman, 2006) and represents about 1 to 4% of breast
cancers. It usually presents with inflammation of the breast.
Medullary breast carcinoma arises from the stromal cells of the breast (Gunderman, 2006).
Mucinous carcinoma is associated with large volumes of cytoplasmic mucin (Gunderman,
2006). These last two types of cancer are generally less able to metastasize than ductal
and lobular carcinomas.
2.3.1 ULTRASONOGRAPHY
Ultrasound imaging, or 'sonography', tends to be used in breast cancer screening as a
'second look' or follow-up examination. The typical indications for breast ultrasound
are a suspicious finding on mammography or the further diagnostic assessment of a
palpable lesion felt during a clinical breast exam.
However, being sent for a follow-up sonogram is no reason for heightened anxiety about
breast cancer. Ultrasound is especially useful for distinguishing a solid mass from a
fluid-filled cyst, which is what the majority of breast lesions turn out to be.
Ultrasound is also useful for finding small lesions that are too small to be felt during
a clinical exam.
Fig. 2: The ultrasound image shows a dark area just beneath the skin whose nature is
uncertain. In this case, inflammation is more likely than cancer.
Ultrasound imaging uses high-frequency sound waves to form an image, called a
'sonogram'. The sound waves are harmless; they pass through the breast and bounce back,
or 'echo', from the various tissues to form a picture of the internal structures. An
unexpected echo indicates that there is a solid nodule of some kind within the tissue.
No radiation is involved in ultrasound imaging, which makes it a preferred diagnostic
imaging technique for pregnant women.
2.3.1.4 Cost and Practical Considerations of Ultrasound for Breast Cancer Screening
Ultrasound imaging is generally no more expensive than mammography, and in many respects
it is more convenient. The problem is that suspicious ultrasound findings are
inconclusive and end up being referred for biopsy anyway. This must be weighed against
the cost and effectiveness of mammographic screening as a whole, which tends to give
better evidence about the nature of a lesion with respect to the need for a biopsy.
2.3.1.6 What can an ultrasound reveal about a potential breast cancer lesion?
A sonogram gives a good indication of whether a lesion is fluid, solid, or a mixture of
both. Fluid masses (cysts) tend to be darker in shade and homogeneous. An experienced
radiologist develops a feel for what the different textures of a sonogram tend to
represent. The shape of a lesion and its margin (the characteristics of its edges) are
also clearly visible on sonograms. This helps to indicate whether a lesion is malignant
or benign (malignant lesions tend to have spiculated edges). Breast cancer lesions also
tend to be somewhat irregular in shape, though not always. Benign fibroadenomas are
usually round or oval. In any case, ultrasound is not a definitive test, and tissue
analysis through biopsy is normally required. Even when ultrasound suggests the presence
of a fibrous nodule or complex cyst, a biopsy is still warranted. Up to 15% of these
kinds of growths turn out to be malignant.
Any echo on the sonogram (a change in the sound on its way back compared with on its way
in) indicates that a solid nodule of some kind has obstructed the path of the sound
wave. Examination of the solid nodules on a breast sonogram requires considerable
expertise, and can give further clarity as to the benign or malignant nature of the
lesion (Steven Halls, June 22, 2018).
Figure 2.8: Breast cancer, indicated by the arrow, in an MRI scan
2.4.1 Masses
In mass detection, every mass ROI contains a single mass. Mass detection can be assessed
by a free-response ROC (FROC) analysis: for several thresholds on the base nesting
depth, one records the percentage of mass ROIs intersecting a detection (i.e. the
sensitivity) and the number of detections intersecting no mass ROI (i.e. the number of
false positives) per pathological mammogram (Quellec et al, 2016). In general, malignant
lesions have a greater radiographic density than an equal volume of fibroglandular
breast tissue. Lucent lesions are usually benign. Isolated solitary non-cystic densities
should be assessed carefully, and when larger than 8 mm in diameter they should be
considered for biopsy. The more irregular the shape of a lesion, the more likely the
mass is to be malignant. An irregular or spiculated margin is an important feature
indicating that a mass is malignant. Less invasive cancers may have only slightly
irregular or even well-circumscribed margins; papillary, medullary and colloid
carcinomas tend to remain well circumscribed. The margin of the mass is sharply defined
in a benign lesion such as a fibroadenoma or a cyst, yet a mass with this appearance may
be malignant in about 7% of cases. An intramammary lymph node is usually well
circumscribed, small, and usually found in the upper outer quadrant of the breast.
Asymmetric breast tissue can almost always be distinguished from a true mass by
mammographic evaluation. The distinction between a small stellate mass and an early
invasive breast cancer is often extremely subtle, so optimal technique and careful
interpretation are essential. Benign stellate masses, for instance post-biopsy scarring
and fat necrosis, often have a characteristic appearance. Most nonspecific circumscribed
masses should be followed up rather than biopsied, because they are commonly seen on
mammograms and have less than a 5% chance of being malignant. A typical example of a
probably benign lesion is a non-calcified mass or nodule with well-defined margins.
Markowitz reported a study of 593 non-calcified masses larger than 1.0 cm seen on
mammography and found that about 2% of them turned out to be malignant. More than half
of these masses were shown to be simple cysts on aspiration or on ultrasound evaluation.
The etiology of most non-calcified solid masses can be determined by palpation or by
ultrasound-guided biopsy.
2.4.2 Micro-Calcifications
Large, coarse calcifications of a benign, involuting fibroadenoma, when associated with a
lobulated mass, are diagnostic of a benign process; while the involution is still
developing, however, the lesion may be indistinguishable from a malignant tumor, and a
biopsy needs to be performed. Calcification may also occur in fat necrosis or in the
walls of a cyst. Punctate, pointed, irregular calcifications that are heterogeneous in
size within a mass, or fine, branching calcific deposits filling ducts, are strong
indicators of cancer. Half of malignant masses have calcifications that can be seen on
mammography. About 20-35% of radiographically identified clustered calcifications
without a mass will be malignant, and the vast majority of these will represent
non-invasive cancer. Mammography is highly sensitive in detecting breast calcifications;
however, its specificity in distinguishing benign from malignant calcifications is only
50-60% (McKenna, 1994). A cluster of a few calcifications should not be considered
clearly benign; rather, it should be considered at low risk of being malignant and
followed up at 4 to 6 month intervals. When more than a few calcifications are present
without an associated mass, the decision to perform a biopsy may be more difficult. The
smallest calcifications, as small as 0.2 mm, may be more suspicious than a 2-mm
calcification. Benign and malignant calcifications may coexist in the same breast.
Calcifications associated with fibrocystic changes in the breast often mimic those found
in malignancy, leading to unavoidable false positive reports. Skin and vascular
calcifications must be distinguished from mammary lesions, and biopsies do not need to be
performed on them. Experience in interpreting mammograms is necessary to minimize the
number of biopsies that have benign results.
2.4.5 Objectives
This thesis proposes a technique for detecting breast cancer in mammography images. The
technique consists of two main parts. In the first part, image processing techniques are
used to prepare the images for feature and pattern extraction. In the second part, the
extracted features are used as input to a neural network and to a logistic regression
machine learning algorithm. These are supervised machine learning algorithms that are
trained on the input images.
The core objectives of this work can be summarized as follows:
1- Implementing a new CAD system for breast cancer diagnosis.
2- Exploiting image processing techniques and supervised machine learning in the
proposed model.
3- Increasing the accuracy of breast cancer detection.
4- Decreasing the false positive rate in the breast cancer diagnosis process.
Kiyan et al. (2004) applied neural-network machine learning and signal processing to a
breast cancer database. Statistical neural networks were used in order to increase the
objectivity and accuracy of breast cancer diagnosis.
Nabil et al. (2008) implemented a genetic algorithm, an artificial immune system and a
hybrid of the two, and tested them on the Wisconsin breast cancer diagnosis (WBCD)
problem in order to generate a fuzzy rule system for breast cancer diagnosis. The hybrid
algorithm generated a fuzzy system that reached the maximum classification ratio earlier
than the other two.
Sheppard, Sok and Averdunk (2004) discussed a method for segmenting images of porous and
composite materials obtained from X-ray tomography. The technique involves a three-stage
approach: anisotropic diffusion, application of an unsharp mask sharpening filter, and
application of watershed and active contour methods. In the first stage, the structures
in the image are preserved.
Rejani and Selvi (2009) presented an algorithm for tumor detection in mammograms. The
proposed system focuses on solving two problems: how to detect tumors as suspicious
regions with very weak contrast to their background, and how to extract features that
characterize tumors. The tumor detection method follows the scheme of mammogram
enhancement, segmentation of the tumor area, extraction of features from the segmented
tumor area, and classification with an SVM (Support Vector Machine) classifier.
Regentova et al. (2006) explored the performance of statistical modeling of digital
mammograms by means of a wavelet-domain Hidden Markov Tree model (WHMT) for inclusion in
a computer-aided diagnostic prompting system for detecting microcalcification (MC)
clusters. Their system combines gross segmentation of mammograms to obtain the breast
region, elimination of pepper-type noise, block-wise wavelet transform of the breast
signal and likelihood calculation, image segmentation, and post-processing to retain MC
clusters. FROC curves were obtained for all MC-cluster-containing mammograms of the
mini-MIAS database; 100% of true positive cases were detected by the system at 2.9 false
positives per case. In summary, they designed a method for MC detection based on a
segmentation algorithm that uses WHMT modeling. By using the db3 filter, a specific tree
structure and weighting of the likelihood values in the MLE procedure, the capability of
the HMT model is greatly enhanced for this purpose. The method performs well within the
system they developed, and a fair evaluation of the whole system was performed with the
mini-MIAS database. The targeted accuracy for true positive cases is 100%, because the
system is intended for diagnostic prompting. False positive cases, which are kept at as
low a rate as possible, can be further discriminated by radiologists. Thus, the goal of
providing radiologists with all true MC
Microcalcifications are an early indication of breast cancer. They appear as isolated
bright spots in mammogram images and are hard to detect because of their tiny size. Yang
and Park (2004) presented a morphological bandpass filter (MBF) to detect
microcalcifications. It is implemented by opening the original image twice, with two
different structuring elements, and subtracting one opened image from the other, which
decomposes the image into the sub-band of interest in which microcalcifications tend to
show up. A series of MBFs is tuned for the detection task, and a binary image containing
the microcalcification regions of interest (ROI) is obtained. Experimental results show
that the proposed method with these bandpass filters can detect ROIs with a true
positive rate of 93.07% and a false positive rate of 4.34%. Compared with the well-known
discrete wavelet transform (DWT) method, this technique is more accurate in the positions
and sizes of the microcalcifications.
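The idea of a morphological bandpass filter can be illustrated in one dimension. The following is only a sketch of the general principle (difference of two openings), not the filter of Yang and Park: the flat structuring elements, their radii and the test signal are all assumptions chosen for illustration.

```python
def erode(signal, radius):
    """Flat grey-scale erosion: minimum over a window of 2*radius+1 samples."""
    n = len(signal)
    return [min(signal[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def dilate(signal, radius):
    """Flat grey-scale dilation: maximum over a window of 2*radius+1 samples."""
    n = len(signal)
    return [max(signal[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def opening(signal, radius):
    """Opening = erosion followed by dilation; removes bright details narrower than the window."""
    return dilate(erode(signal, radius), radius)


def morphological_bandpass(signal, small_radius, large_radius):
    """Keep bright structures that survive the small opening but not the large one."""
    small = opening(signal, small_radius)
    large = opening(signal, large_radius)
    return [s - l for s, l in zip(small, large)]


# Hypothetical test signal: a 1-sample spike, a 3-sample bump and a 9-sample bump.
signal = [0] * 24
signal[2] = 1
for k in range(6, 9):
    signal[k] = 1
for k in range(12, 21):
    signal[k] = 1

band = morphological_bandpass(signal, small_radius=1, large_radius=2)
```

With these assumed radii, the opening with the small window removes the single-sample spike, the opening with the large window additionally removes the 3-sample bump, and their difference keeps only the 3-sample bump, that is, the structures whose size lies between the two windows.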
Gayathri et al. (2013) used various machine learning approaches (supervised learning,
unsupervised learning, semi-supervised learning, transduction, and learning to learn)
and techniques to improve the accuracy of breast cancer prediction.
It is evident from previous studies that more research is needed to improve the accuracy
of early detection of breast cancer using a realistic dataset of mammograms.
CHAPTER 3
3.1 BACKGROUND
This chapter discusses and explains the image processing techniques and the machine
learning algorithms (artificial neural network and logistic regression) that were used
in carrying out this work.
Figure 4.1 Usual stages of CAD system operation, showing the significance of image
processing in the first stages (Saad, 2012).
Here F is the Fourier transform of an "ideal" version of a given image, G is the Fourier
transform of the observed (blurred) image, and H is the blurring function. In this case
H is a sinc function: if three pixels in a line contain information from the same point
of the scene, the digital image will appear to have been convolved with a three-point
boxcar in the spatial domain. Ideally, one could recover an estimate of F, denoted Fest,
if G and H are known. This technique is known as inverse filtering.
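The relationship described here can be written out explicitly. This is the standard frequency-domain degradation model; the frequency coordinates (u, v) and the reading of G as the transform of the observed image are assumptions, since the original equation is not reproduced in the text.

```latex
G(u,v) = H(u,v)\,F(u,v)
\qquad\Longrightarrow\qquad
F_{\mathrm{est}}(u,v) = \frac{G(u,v)}{H(u,v)}, \quad H(u,v) \neq 0
```

In practice the division is unstable wherever H(u, v) is close to zero, because any noise in G is strongly amplified at those frequencies.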
The discrete wavelet transform is evaluated at discrete scales and positions, which
leads to an efficient computation. Discrete wavelets are scaled and translated in
discrete steps, which according to Erickson (2005) is achieved by using integer rather
than real-valued scaling and translation parameters.
Again, f is the signal in the time domain, t is time, C is the wavelet coefficient, s is
the scale and τ is the translation. The function ψ is known as the mother wavelet
(Erickson, 2005).
Both the Fourier and the wavelet transform allow a temporal signal to be analysed for
its frequency content. The Fourier transform is a linear transform that represents a
function in a basis of sine and cosine functions. Similarly, the wavelet transform is a
linear transform that represents a function in a basis of wavelet functions. Finally,
for both the Fourier transform and the wavelet transform, an inverse transform returns
the original signal (Erickson, 2005).
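The invertibility of the wavelet transform can be checked with a small example. The sketch below uses one level of the orthonormal Haar wavelet, which is an assumption made for brevity; the text does not state which wavelet family is meant at this point, and the sample signal is hypothetical.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT of an even-length sequence."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the signal from both coefficient sets."""
    s = math.sqrt(2.0)
    restored = []
    for a, d in zip(approx, detail):
        restored.append((a + d) / s)
        restored.append((a - d) / s)
    return restored


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 2.0, 0.0]
approx, detail = haar_dwt(signal)
restored = haar_idwt(approx, detail)
```

The approximation coefficients carry the local averages and the detail coefficients carry the local differences; `restored` agrees with `signal` up to floating-point rounding.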
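$$G_i = \frac{\partial I(i,j)}{\partial i}, \qquad G_j = \frac{\partial I(i,j)}{\partial j}$$
where $G_i$ is the gradient in the i direction and $G_j$ is the gradient in the j
direction. The gradient magnitude is
$$|G| = \sqrt{G_i^2 + G_j^2}$$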
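As an illustration of the gradient magnitude, the sketch below approximates the two partial derivatives with central differences on a small synthetic image. The choice of central differences, the zero border and the test image are assumptions; operators such as Sobel or Prewitt are equally common.

```python
import math


def gradient_magnitude(image):
    """Central-difference gradient magnitude; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    magnitude = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            g_i = (image[i + 1][j] - image[i - 1][j]) / 2.0  # derivative along i (rows)
            g_j = (image[i][j + 1] - image[i][j - 1]) / 2.0  # derivative along j (columns)
            magnitude[i][j] = math.sqrt(g_i ** 2 + g_j ** 2)
    return magnitude


# Synthetic image with a vertical step edge between columns 2 and 3.
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
magnitude = gradient_magnitude(image)
```

In the interior rows the magnitude is non-zero only in the two columns adjacent to the step, which is where an edge detector based on a gradient threshold would respond.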
3.5.3.2 Second Order Derivative Based Edge Detection (Laplacian based Edge
Detection):
This method looks for zero crossings in the second derivative of the image in order to
find edges. An image edge has the one-dimensional shape of a ramp, and its location can
be highlighted by computing derivatives of the image. The first-derivative approach is
characteristic of the "Gradient Filter" family of edge detection filters: for a pixel
location to be declared an edge location, the value of its gradient must exceed a certain
threshold. As noted earlier, edges have higher pixel intensity values than their
surroundings. Therefore, once a threshold is set, the gradient value can be compared with
the threshold value, and an edge is detected wherever the threshold is exceeded. In
addition, where the first derivative is at a maximum, the second derivative is zero. As a
result, an alternative way of finding the location of an image edge is to locate the
zeros of the second derivative of the image.
(a) Ideal step edge, (b) first-order derivative and (c) second-order derivative.
This approach uses a zero-crossing operator, which acts by locating the zeros of the
second derivatives of the image I(i, j). The differential operator used in zero-crossing
edge detectors is the Laplacian,
$$\nabla^2 I = \frac{\partial^2 I}{\partial i^2} + \frac{\partial^2 I}{\partial j^2}$$
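The zero-crossing idea is easiest to see on a one-dimensional intensity profile. The following sketch uses a discrete second difference as a stand-in for the second derivative; the profile values are hypothetical and the strict sign-change test is one possible convention.

```python
def second_difference(profile):
    """Discrete second derivative: p[k-1] - 2*p[k] + p[k+1] for interior samples."""
    return [profile[k - 1] - 2 * profile[k] + profile[k + 1]
            for k in range(1, len(profile) - 1)]


def zero_crossings(values):
    """Indices k where values[k] and values[k+1] have strictly opposite signs."""
    return [k for k in range(len(values) - 1) if values[k] * values[k + 1] < 0]


# Hypothetical ramp edge: intensity rises from 0 to 6.
profile = [0, 0, 1, 4, 6, 6, 6]
d2 = second_difference(profile)      # d2[k] belongs to profile[k + 1]
crossings = zero_crossings(d2)
```

The second difference is positive where the ramp starts to rise and negative where it levels off; the sign change falls between profile samples 2 and 3, which is also where the first difference is largest.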
Thresholding assigns a range of pixel values to an object of interest. It works best
with greyscale images that use the whole greyscale range. For the image I(i, j), the
thresholded image g(i, j) is defined as
$$g(i,j) = \begin{cases} 1 & \text{if } I(i,j) \ge T \\ 0 & \text{if } I(i,j) < T \end{cases}$$
where T is the threshold value.
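A minimal sketch of this thresholding rule is given below. The comparison operators are not legible in the source, so the convention "1 when the pixel is greater than or equal to T" is an assumption, as are the sample pixel values.

```python
def threshold(image, t):
    """Binary threshold: 1 where the pixel value is >= t, 0 elsewhere."""
    return [[1 if pixel >= t else 0 for pixel in row] for row in image]


image = [[12, 200, 90],
         [130, 40, 255]]
binary = threshold(image, 128)
```

Here `binary` marks the three pixels whose grey level is at least 128.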
Operator | Advantages | Disadvantages
Laplacian of Gaussian (LoG) (Marr-Hildreth) | Finding the correct places of edges; testing a wider area around the pixel | Malfunctioning at corners, curves and where the gray level intensity function varies; not finding the orientation of the edge because of using the Laplacian filter
Gaussian (Canny, Shen-Castan) | Using probability for finding the error rate; localization and response; improving the signal-to-noise ratio; better detection, especially in noisy conditions | Complex computations; false zero crossings; time consuming
3.5.3.4 Laplacian of Gaussian
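The Laplacian of an image highlights regions of rapid intensity change and is therefore
often used for edge detection. It is often applied to an image that has first been
smoothed with a Gaussian filter in order to reduce its sensitivity to noise. The operator
normally takes a single gray-level image as input and produces another gray-level image
as output. The Laplacian L(x, y) of an image with intensity values I(x, y) is given by
$$L(x,y) = \frac{\partial^2 I}{\partial x^2} + \frac{\partial^2 I}{\partial y^2}$$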
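Because Gaussian smoothing and the Laplacian are both linear, they can be combined into a single Laplacian of Gaussian kernel. The sketch below samples the commonly used closed form LoG(x, y) = -1/(πσ⁴) · (1 - (x² + y²)/(2σ²)) · exp(-(x² + y²)/(2σ²)); the kernel size and σ are assumed values, and this is an illustration rather than the kernel used in this work.

```python
import math


def log_kernel(size, sigma):
    """Sample the Laplacian of Gaussian on a size x size grid centred at the origin."""
    if size % 2 == 0:
        raise ValueError("size must be odd so that the kernel has a centre")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r2 = (x * x + y * y) / (2.0 * sigma ** 2)
            row.append(-1.0 / (math.pi * sigma ** 4) * (1.0 - r2) * math.exp(-r2))
        kernel.append(row)
    return kernel


kernel = log_kernel(7, 1.0)
```

With this sign convention the kernel has a negative centre surrounded by a positive ring; convolving an image with it and locating the zero crossings of the result yields the edge positions.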
Summary Table
3.7 Machine Learning and Algorithm
In this work, supervised learning was used. Two different kinds of supervised machine
learning algorithms were employed: logistic regression and neural networks. We compared
the results of these two algorithms in our work, and the following sections introduce
both. Logistic regression and neural networks are also standard methods for clinical
classification problems.
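To make the notion of supervised learning concrete, the sketch below trains a one-feature logistic regression by gradient descent on a tiny labelled dataset. It is not the implementation used in this work: the data, learning rate and number of epochs are hypothetical, and a real model would take the extracted image features as input.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean_cross_entropy(w, b, xs, ys):
    """Average logistic loss of the model p = sigmoid(w*x + b)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)


def train(xs, ys, lr=0.5, epochs=200):
    """Batch gradient descent on the mean cross-entropy, starting from w = b = 0."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical one-dimensional feature with labels 0 and 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

initial_loss = mean_cross_entropy(0.0, 0.0, xs, ys)
w, b = train(xs, ys)
final_loss = mean_cross_entropy(w, b, xs, ys)
predictions = [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in xs]
```

The labels supervise the fit: each update moves the parameters in the direction that reduces the disagreement between the predicted probabilities and the known classes.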
The hidden layer may contain a large number of hidden processing nodes. A feed-forward
back-propagation network propagates the information from the input layer to the output
layer, compares the network output with the known target, and propagates the error term
from the output layer back towards the input layer, using a learning mechanism to adjust
the weights and biases (Rahman et al., 2013).
hiddenSizes Row vector of one or more hidden layer sizes (default = 10)
trainFcn Training function (default = 'trainscg')
performFcn Performance function (default = 'crossentropy')
hiddenSizes Row vector of one or more hidden layer sizes (default = 10)
trainFcn Training function (default = 'trainlm' |'trainbr' | 'trainbfg' | 'trainrp' | 'trainscg')
This chapter discusses the implemented model. It starts by explaining the two main layers
of the model: image pre-processing and machine learning. Subsequently, the chapter
presents the experiments and the analytical results that were obtained. Detecting breast
cancer in mammography images is a two-step procedure. In the first step, images are
filtered, cropped and mapped into values that can be used as input to the second step. In
the second step, this input data is used to train the system to predict cancer in future
images. Our model consists of these two steps, or layers. The following sections discuss
the processing of the images and the results.
Figure 4.1 steps of our model
4.1 METHODOLOGY
Dataset Overview
The dataset used in this research consists of three classes of breast mammograms: benign
(376 images), malignant (453 images) and normal (632 images).
Figure 4.1(a) is a mammogram image of a benign breast. Benign (non-cancerous) breast
conditions are unusual growths or other changes in the breast tissue that are not cancer.
Figure 4.1(b) Mammogram image of malignant breast
If a tumor is found to be malignant, as in Figure 4.1(b) above, the patient has breast
cancer or another form of cancer. Malignant tumors are aggressive and can spread to
surrounding tissues. When a tumor is identified, a doctor may recommend a biopsy to
determine how advanced and how severe the cancer is.
Figure 4.1(c) Mammogram image of normal breast without cancer
Figure 4.1(c) is a mammogram of a normal fatty breast that does not have a lot of dense
tissue. A mammogram searching for abnormal lesions, benign lumps, or breast cancer is
more accurate when performed on women with non-dense breasts such as these.
The gray areas correspond to normal fatty tissue, while the white areas are normal breast
tissue with ducts and lobes. While breast masses also appear white on a mammogram, their
color is typically more concentrated because they are denser than other features of a
normal breast, like those seen in Figure 4.1(c).
The method is applied to a real clinical database of 1461 mammogram images. These
mammograms are divided into two subsets: 90% are used to train our model and 10% are used
to test it.
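The 90%/10% division can be sketched as a shuffled index split. The shuffling, the fixed seed and the function name `split_indices` are assumptions introduced for illustration; the text does not say how the images were assigned to the two subsets.

```python
import random


def split_indices(n_samples, train_fraction=0.9, seed=0):
    """Shuffle the sample indices and cut them into a training and a test part."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(1461)
```

For 1461 images this gives 1315 training indices and 146 test indices, with no image appearing in both subsets.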
4.1.1 Image Resizing
The mammogram images obtained from the screening centers had a size of 2744 by 4728
pixels, which is too large and requires too much storage space. For this reason, the
images were resized to a square size of 256 by 256 pixels.
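The resizing step can be sketched with nearest-neighbour sampling. The interpolation method is not stated in the text, so nearest-neighbour is an assumption chosen for brevity, and a small 12 by 8 array stands in for a full mammogram.

```python
def resize_nearest(image, new_rows, new_cols):
    """Resize a 2-D list by picking the nearest source pixel for every target pixel."""
    rows, cols = len(image), len(image[0])
    return [[image[r * rows // new_rows][c * cols // new_cols]
             for c in range(new_cols)]
            for r in range(new_rows)]


# Small stand-in image: pixel value encodes its position (row * 8 + column).
image = [[r * 8 + c for c in range(8)] for r in range(12)]
small = resize_nearest(image, 4, 4)
```

Resizing a non-square image to a square one, as done here, changes the aspect ratio, so structures appear compressed along the longer axis.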
Figure 4.3 Resized image
4.1.2 Wiener filter
Noise was removed from the cropped image using a Wiener filter. Our dataset images show
fuzzy or blurred effects, which are treated as noise in our data. To remove this noise,
the Wiener filter is used.
In signal processing, the Wiener filter is a technique that estimates the target signal
by Linear Time Invariant (LTI) processing of a noisy signal. In Matlab, the Wiener filter
is categorized as a de-blurring filter. Figure 4.4 shows the image before and after
applying the Wiener filter. We can observe that the white lines are less blurred in
Figure 4.4(B) than in Figure 4.4(A).
Figure 4.4 (A) Image before applying the Wiener filter; (B) image after applying the Wiener filter
Figure 4.4 shows the noisy image and the result after the Wiener filter has been applied
to remove the constant-power additive noise (Gaussian white noise). A neighborhood size
of 5 by 5 was used to estimate the local image mean and standard deviation.
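The way an adaptive Wiener filter uses the local mean and variance can be sketched as follows. This is a simplified pixel-wise version written for illustration, not the Matlab routine used in this work: the border handling (clamping), the estimate of the noise power as the average local variance, and the function name `adaptive_wiener` are assumptions.

```python
def adaptive_wiener(image, window=5):
    """Pixel-wise adaptive Wiener filter based on local mean and variance."""
    rows, cols = len(image), len(image[0])
    half = window // 2

    def neighbourhood(i, j):
        # Clamp coordinates at the borders (an assumption; other paddings are possible).
        return [image[min(max(i + di, 0), rows - 1)][min(max(j + dj, 0), cols - 1)]
                for di in range(-half, half + 1)
                for dj in range(-half, half + 1)]

    means = [[0.0] * cols for _ in range(rows)]
    variances = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            values = neighbourhood(i, j)
            mu = sum(values) / len(values)
            means[i][j] = mu
            variances[i][j] = sum((v - mu) ** 2 for v in values) / len(values)

    # Noise power is estimated as the average of all local variances.
    noise = sum(sum(row) for row in variances) / (rows * cols)

    output = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            var = variances[i][j]
            denom = max(var, noise)
            gain = max(var - noise, 0.0) / denom if denom > 0 else 0.0
            # Smooth strongly where the local variance is low, weakly near edges.
            output[i][j] = means[i][j] + gain * (image[i][j] - means[i][j])
    return output


flat = [[7.0] * 6 for _ in range(6)]
filtered_flat = adaptive_wiener(flat)
```

Where the local variance is close to the noise estimate the output approaches the local mean, and where it is much larger (for example at edges) the pixel is left nearly unchanged; a perfectly flat image such as `flat` passes through unaltered.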
Image transformation
The coefficients of the other three matrices (approximation, CV and CH) have not been
used in our work. However, the fourth matrix, the CD (diagonal detail) matrix, has been
used as input data to our learning algorithm, as will be seen in the following sections.
Figure 4.5 shows the output of the DWT for one of the images in our dataset.
Figure 4.5 DWT image based on approximate image detail (LL), horizontal details (HL),
Figure 4.5 shows the discrete wavelet decomposition of the denoised image. Using the DWT,
the image is broken down into four parts: approximation image, horizontal detail,
vertical detail and diagonal detail. The high-frequency components of an image
correspond to places where the gray level varies greatly between adjacent pixels, which
is where edges occur. The low-frequency components correspond to smooth variations
between adjacent pixels, so no edges, or very few, are generated. The low-frequency part
retains most of the information of the original image and is displayed as the
approximation image.
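The four-part decomposition can be illustrated with one level of a two-dimensional Haar transform computed on 2 by 2 blocks. The Haar wavelet is an assumption, since the wavelet family is not stated here, and the naming of the horizontal and vertical sub-bands differs between toolboxes; the checkerboard test image is hypothetical.

```python
def haar_dwt2(image):
    """One level of a 2-D Haar decomposition of an image with even dimensions.

    Returns (approximation, horizontal, vertical, diagonal) coefficient matrices.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")
    approx, horizontal, vertical, diagonal = [], [], [], []
    for i in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for j in range(0, cols, 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            a_row.append((a + b + c + d) / 2.0)  # local average (low-pass both ways)
            h_row.append((a + b - c - d) / 2.0)  # change between rows (horizontal edges)
            v_row.append((a - b + c - d) / 2.0)  # change between columns (vertical edges)
            d_row.append((a - b - c + d) / 2.0)  # diagonal change
        approx.append(a_row)
        horizontal.append(h_row)
        vertical.append(v_row)
        diagonal.append(d_row)
    return approx, horizontal, vertical, diagonal


# A checkerboard varies only diagonally within each 2x2 block.
checkerboard = [[(r + c + 1) % 2 for c in range(4)] for r in range(4)]
approx, horizontal, vertical, diagonal = haar_dwt2(checkerboard)
```

For the checkerboard all of the detail energy ends up in the diagonal sub-band, which is the sub-band this work feeds to the learning algorithm.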
4.1.4 Zero crossing
The zero-crossing algorithm was applied to the output of the DWT, as seen in Figure 4.6.
As shown in Table 4.3 above, the train score achieved when applying the zero-crossing
algorithm was 82%, while the score achieved without it was 75%. Comparing the two results
shows that the zero-crossing algorithm improves the output, and it was therefore used in
our model.
Table 4.4 NN result on HH Diagonal of DWT and Zero crossing
For the purpose of this thesis, the HH (diagonal) sub-band of the DWT was used because it
produced a better result when tested in our NN model with the zero-crossing algorithm.
Table 4.4 shows the NN results for the various discrete wavelet sub-bands when combined
with the zero-crossing algorithm used in this thesis. As seen in the table, the average
train score and average test score of the HH diagonal sub-band are 89% and 65%
respectively, whereas the LL approximation, LH vertical and HL horizontal coefficients
have lower values. These results show that the HH diagonal sub-band, having the highest
values, produced the better result.
4.2 Experiment
488 patients cases have been collected from The Digital Database for Screening
Mammography (DDSM) (Michael Heath, Kevin Bowyer et el 2001), at the University of
South Florida (K. Bowyer), and Sandia National Laboratories (P. Kegelmeyer), where
1461 images where extracted from these cases. These images are used to train and test our
model. 90% and 10% are the percentages that have been utilized for training, and testing.
Each one of these images has a resolution of 4696x3024pixels. From these images, 632 are
normal images; benign has 376 images, malignant 453 images. We arranged them and
created our result vector for trainings.
The result vector was used in our neural network model. Each image was resized to
256 by 256 pixels. The output matrices were fed as input to the Wiener filter to
de-blur them. Subsequently, the filter output was used as the input to the wavelet
transform. Finally, zero crossing and data normalization complete the preparation of our
images.
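The last three preparation steps can be sketched on a tiny matrix. The following Python example is only an illustration under stated assumptions: it uses a single-level Haar wavelet (the wavelet family is an assumption), defines a zero crossing as a sign change between horizontally adjacent coefficients (also an assumption), and uses min-max scaling for normalization. Resizing and Wiener filtering are omitted, and the 4x4 input values are made up for the example.

```python
def haar_hh(image):
    """Diagonal (HH) detail coefficients of a single-level 2D Haar transform.

    `image` is a list of equal-length rows with even height and width.
    """
    hh = []
    for r in range(0, len(image), 2):
        row = []
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            row.append((a - b - d + e) / 2.0)
        hh.append(row)
    return hh


def zero_crossings(coeffs):
    """Mark positions where the sign changes between horizontal neighbours."""
    return [
        [1 if row[i] * row[i + 1] < 0 else 0 for i in range(len(row) - 1)]
        for row in coeffs
    ]


def min_max_normalize(values):
    """Scale a flat list of numbers into [0, 1]; constant input maps to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Tiny 4x4 stand-in for a resized, filtered mammogram (illustrative values only).
image = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [2, 2, 5, 1],
    [2, 2, 1, 5],
]
hh = haar_hh(image)
crossings = zero_crossings(hh)
normalized = min_max_normalize([v for row in hh for v in row])

print(hh)         # [[1.0, -1.0], [0.0, 4.0]]
print(crossings)  # [[1], [0]]
```

Flat blocks such as the lower-left 2x2 region give an HH coefficient of zero, while blocks with diagonal structure give large values, which is why the diagonal subband responds to fine texture.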
The output matrices of the preprocessing step, together with the training vector that we
prepared, are used to train our machine learning models. Figure 4.8 shows the average
train score of the neural network model after image processing; the average value
is 89%. Moreover, Figure 4.9 shows the best validation value, which is
0.26%. Finally, Figure 4.12 shows the gradient performance of the neural network.
4.3 Result
Results obtained after the following processes were performed on the pgm-format image dataset
Columns: (A) result without image processing; (B) result after image processing;
(C) result after applying only the Wiener filter; (D) result after applying only DWT
(HH Diagonal); (E) result after applying only zero crossing.
Setting                                     (A)    (B)    (C)    (D)    (E)
HiddenLayerSize                             10     10     10     10     10
Three classes of breast mammography (all columns): Benign = 376, Malignant = 453, Normal = 632
Total number of images (dataset)            1461   1461   1461   1461   1461
Total number of trained images (dataset)    1315   1315   1315   1315   1315
Total number of test images (dataset)       146    146    146    146    146
Numiter = number of iterations              10     10     10     10     10
Network (all columns): net = patternnet(hiddenLayerSize,trainFcn)
To obtain the average train error and the average train score, we have to run the
Matlab program for each case: the result after image processing, the result without
image processing, the result after applying only the Wiener filter, the result after
applying only zero crossing, the result after applying only DWT, and the result after
applying DWT and zero crossing. In this process, we also have to consider the following:
the hidden layer size, the three classes of breast mammography, the total number of
images (1461), the total number of trained images (1315), the total number of test
images (146) and Numiter, the number of iterations.
The average test error was obtained by adding the average test result and the performance
test result from the neural network, which gave the percentages shown in Table 4.5.
The average test score was obtained by dividing the average test result by the number of
iterations (Numiter, which is 10 in our case), which gave the percentages shown in the
figure above. The differences between the average train error, average train score,
average test error and average test score in the table above are a result of the different
outputs obtained from the different preprocessing techniques applied to the dataset that
was used as the input to the neural network when running the Matlab program.
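The averaging over Numiter runs described above can be sketched as follows. This Python example is a minimal illustration rather than the Matlab program used in the thesis: `average_over_runs` and `fake_run` are hypothetical names, and `fake_run` returns fixed toy scores in place of training a real network.

```python
def average_over_runs(run_once, numiter):
    """Call run_once(i) numiter times and average the (train, test) scores."""
    train_total = 0.0
    test_total = 0.0
    for i in range(numiter):
        train_score, test_score = run_once(i)
        train_total += train_score
        test_total += test_score
    return train_total / numiter, test_total / numiter


def fake_run(i):
    """Hypothetical stand-in for one training run; returns toy scores."""
    toy_scores = [(1.0, 0.5), (0.5, 1.0)]
    return toy_scores[i % len(toy_scores)]


avg_train, avg_test = average_over_runs(fake_run, numiter=10)
print(avg_train, avg_test)  # 0.75 0.75
```

Averaging over several runs reduces the effect of the random weight initialization and data division of any single run on the reported score.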
Figure 4.7 shows the confusion matrix obtained from our experiment. Over all confusion
matrix partitions, we obtain 11.0 – 89.0% using our model, as compared to the result of
(A.M. Abdel-Zaher, A.M. Eldeib 2016), which varied over 0.5 – 99.5% for the best
classifier accuracy of the deep belief network (DBN-NN).
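For readers unfamiliar with how summary percentages are read from a confusion matrix, the following Python sketch computes the overall accuracy and the per-class recall for a three-class matrix. The counts are toy values chosen for illustration and are not results from this thesis; the convention that rows are true classes and columns are predicted classes is an assumption of the sketch.

```python
def overall_accuracy(confusion):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def per_class_recall(confusion):
    """Recall for each true class: diagonal count divided by the row total."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]


# Toy counts only: rows = true class, columns = predicted class.
toy_confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]

accuracy = overall_accuracy(toy_confusion)
error = 1.0 - accuracy
recall = per_class_recall(toy_confusion)
print(recall)  # [0.8, 0.6, 0.9]
```

The overall accuracy and its complement, the overall error rate, always sum to 100%.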
Figure 4.9 above shows the receiver operating characteristic (ROC) of the NN. It plots
the true positive rate against the false positive rate for the training ROC, the
validation ROC, the test ROC and all ROC put together.
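How the points of an ROC curve are obtained can be shown with a small Python sketch. The scores and labels below are toy values for a single positive-versus-negative decision, not outputs of the thesis network, and `roc_points` is a name introduced here for illustration.

```python
def roc_points(scores, labels, thresholds):
    """Return (false positive rate, true positive rate) for each threshold.

    A sample is predicted positive when its score is >= the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Toy classifier scores and true labels (1 = positive, 0 = negative).
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]

points = roc_points(scores, labels, thresholds=[1.0, 0.5, 0.0])
print(points)  # [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]
```

A curve that rises towards the top-left corner indicates a classifier that reaches a high true positive rate at a low false positive rate.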
CHAPTER 5
Conclusions
Breast cancer is the most commonly diagnosed type of cancer in women. Although the
death rate is the second highest among women with cancer, early detection of the disease
greatly improves the chance of survival. Therefore, it is important to develop new and
improved methods for breast cancer screening.
This dissertation explored the potential benefits of a new proposed method for automated
detection of breast cancer using mammogram images.
The main contributions to existing knowledge are two: first, an overview of existing
image processing techniques currently used for CAD systems that can help diagnose breast
cancer; second, a new method for automated detection of breast cancer using
mammography images, image processing techniques and the machine learning algorithms.
The dissertation described in detail the new method proposed, its implementation in
Matlab and its evaluation on a dataset of 1461 breast mammogram images. The focus was
on exploring how the method performs in various conditions and not on providing an
overall accuracy result for the method.
The overarching goal of this thesis was to improve breast cancer screening by using a
neural network to assist radiologists in the classification of breast lesions.
Additionally, another aim of this thesis was to use data acquired from the Digital
Database for Screening Mammography (DDSM) (Michael Heath, Kevin Bowyer et al.
2001) at the University of South Florida (K. Bowyer) and Sandia National Laboratories
(P. Kegelmeyer), from which 1461 images were extracted. Using real data
helped us to evaluate the algorithms that were used.
The main task of the thesis was to find features in the data that would distinguish normal
samples from those containing tumours. Wavelet techniques were used to extract features
and filters were used to reduce noise and blur; both are well defined in Matlab.
Future Work
Considering the initial, exploratory nature of the work done for this dissertation, the results
are also informative with respect to potential directions for future work that are likely to
yield valuable results. For instance, a first future step would be to evaluate the method
more thoroughly by using another programming environment, because Matlab did not allow
us to use a large number of features with the ANN.
The tests focused solely on the accuracy of tumour detection. However, additional tests on
other suitably annotated data sets can reveal the accuracy of the method in detecting each
type of tissue. In turn, this could be very helpful for practitioners and even to further
improve the diagnosis accuracy of the method, since it is known that two types of breast
tissue (the denser ones) can more easily hide signs of cancer, so that they are often missed
at scans and not visible until later. Thus, reliable information on the distribution of such
tissue and perhaps even a technique to further investigate such tissue more thoroughly
could offer additional useful diagnosis help.
Another direction for future work follows from our focus on classifying each image as
either cancer or normal. It is possible to bring in samples where the cancer is
classified as malignant or benign and to use a method that distinguishes between these
two types of cancer.
REFERENCES
“A Versatile Edge Preserving Image Enhancement Approach For Medical Images Using
Guided Filter.” 2016 IEEE International Conference on Systems, Man, and
Cybernetics (SMC 2016), October 9-12, 2016, Budapest, Hungary.
Acr, 2013 ACR BI-RADS Atlas: Breast Imaging Reporting and Data System. American
College of Radiology, 2013.
Al-Shamlan, Hala, and Ali El-Zaart. “Feature extraction values for breast cancer
mammography images.” In Bioinformatics and Biomedical Technology (ICBBT),
2010 International Conference on, pp. 335-340. IEEE, 2010.
Altrichter M., Ludanyi, Z., Horvath, G., “Joint analysis of multiple mammographic
views in cad systems for breast cancer detection,” In: Proc. of Image Analysis.
14th Scandinavian Conference, 2005.
Autier, P., Boniol, M., LaVecchia, C., Vatten, L., Gavin, A., Héry, C., et al. (2010).
Disparities in breast cancer mortality trends between 30 European countries:
retrospective trend analysis of WHO mortality database. BMJ 2010, 341:c3630.
Artificial Neural Networks. Anil K. Jain (Michigan State University), Jianchang Mao.
Baert, A., Reiser, M., Hricak, H., & Kanuth, M. (2010). Digital Mammography. Springer.
Baker, J., Rosen, E., Lo, J., Gimenez, E., Walsh, R., & Soo, M. (2003). Computer-
Aided Detection (CAD) in Screening Mammography: Sensitivity of
Commercial CAD Systems for Detecting Architectural Distortion. AJR Am J
Roentgenol., 181, No 4.
Barrett, & A. Gmitro, Information Processing in Medical Imaging (Vol. 687, pp. 472 -
486).
Brandt, Sami S., Gopal Karemore, Nico Karssemeijer, and Mads Nielsen. “An
anatomically oriented breast coordinate system for mammogram analysis.” IEEE
transactions on medical imaging 30, no. 10 (2011): 1841-1851.
Calas, Maria Julia Gregorio, Bianca Gutfilen, and Wagner Coelho de Albuquerque Pereira.
“CAD and mammography: why use this tool?.” Radiologia Brasileira 45, no. 1
(2012): 46-52.
Cao, Ying, Xin Hao, Xiaoen Zhu, and Shunren Xia. “An adaptive region growing
algorithm for breast masses in mammograms.” Frontiers of Electrical and Electronic
Engineering in China 5, no. 2 (2010): 128-136.
Chandrika Saxena, Prof. Deepak Kourav, “Noises and Image Denoising Techniques: A
Brief Survey.” Versha Rani et al, Journal of Global Research in Computer Science, 4
(4), April 2013, 166-171.
Cheng, H. D., X. J. Shi, Rui Min, L. M. Hu, X. P. Cai, and H. N. Du. “Approaches for
automated detection and classification of masses in mammograms.” Pattern
recognition 39, no. 4 (2006): 646-668.
Cheng, H., Cai, X., Chen, X., Hu, L., & Lou, X. (2003). Computer-aided detection
and classification of microcalcifications in mammograms: A survey. Pattern
Recognition, 36, pp. 2967 – 2991.
Cireşan, Dan C., Alessandro Giusti, Luca M. Gambardella, and Jürgen Schmidhuber.
“Mitosis detection in breast cancer histology images with deep neural networks.” In
International Conference on Medical Image Computing and Computer-Assisted
Intervention, pp. 411-418.
Cristobal G., Navarro R., Space and frequency variant image enhancement based in Gabor
D. Donoho, I. Johnstone, G. Kerkyacharian, D. Picard, “Wavelet shrinkage:
asymptopia?”, Journal of the Royal Statistical Society B, vol.57, pp. 301-369,
1995.
Dinsha, D., and N. Manikandaprabu. “Breast tumor segmentation and classification using
SVM and Bayesian from thermogram images.” Unique Journal of Engineering and
Advanced Sciences 2, no. 2 (2014): 147-151.
Etehad Tavakol, Mahnaz, Vinod Chandran, E. Y. K. Ng, and Raheleh Kafieh. “Breast
cancer detection from thermal images using bispectral invariant features.”
International Journal of Thermal Sciences 69 (2013): 21-36.
F. Laine, S. Schuler, J. Fan, and W. Huda, “Mammographic feature enhancement by
multiscale analysis.,” IEEE transactions on medical imaging, vol. 13, pp. 725–40,
Jan. 1994.
George, Yasmeen M., Bassant M. Bagoury, Hala H. Zayed, and Mohamed I. Roushdy.
“Automated cell nuclei segmentation for breast fine needle aspiration cytology.”
Signal Processing 93, no. 10 (2013): 2804-2816.
Gonzalez R.C., Woods R.E., Digital Image Processing, Upper Saddle River, NJ: Prentice
Hall, 2008.
Gubern-Mérida, Albert, Michiel Kallenberg, Ritse M. Mann, Robert Marti, and Nico
Karssemeijer. “Breast segmentation and density estimation in breast MRI: a fully
automatic framework.” IEEE journal of biomedical and health informatics 19, no. 1
(2015): 349-357.
Gubern-Mérida, Albert, Robert Martí, Jaime Melendez, Jakob L. Hauth, Ritse M. Mann,
Nico Karssemeijer, and Bram Platel. “Automated localization of breast cancer in
DCE-MRI.” Medical image analysis 20, no. 1 (2015): 265-274.
Haus AG, Yaffe MJ., “Screen-film and Digital mammography: Image Quality and
Radiation Dose Considerations,” Radiol Clin North America, 2000,38:871– 898.
Heywang-Köbrunner, Sylvia H., Astrid Hacker, and Stefan Sedlacek. “Advantages and
disadvantages of mammography screening.” Breast care 6, no. 3 (2011): 199-207.
http://www.breastcancer.org/symptoms/understand_bc/statistic
http://www.mathworks.com/matlabcentral/fileexchange/19084
Huang Q., Gao W., Cai W., Thresholding technique with adaptive window selection for
uneven lighting image, Pattern Recognition Letters, Elsevier, 2004, 26, p. 801-808.
Irshad, Humayun, Antoine Veillard, Ludovic Roux, and Daniel Racoceanu. “Methods for
nuclei detection, segmentation, and classification in digital histopathology: a
review—current status and future potential.” IEEE reviews in biomedical
engineering 7 (2014): 97-114.
J. K. Kim and H. W. Park, “Statistical textural features for detection of
microcalcifications in digitized mammograms,” IEEE transactions on medical imaging,
vol. 18, pp. 231–8, Mar. 1999.
K. Bowyer, D. Kopans, W. Kegelmeyer, R. Moore, M. Sallam, K. Chang, and K. Woods,
“The digital database for screening mammography,” in Third International
Workshop on Digital Mammography, vol. 58, 1996.
Kiyan, T., Yildirim, T. (2004). “Breast cancer diagnosis using statistical neural
networks.” Journal of Electrical & Electronics Engineering 2(4).
Kim, J., Park, J., Song, K., & Park, H. (1997, Oct). Adaptive mammographic image
enhancement using first derivative and local statistics. IEEE Trans on Medical
Imaging, 16, No 5.
Mahesh Mahadevappa, “Digital Mammography: An Overview,” RadioGraphics, RSNA
2004, Vol 24, No. 6, 1747-1760.
Pradeep, N., Girisha, H., Sreepathi, B. and Karibasappa, K. (2012). Feature extraction
of mammograms. International Journal of Bioinformatics
Ying-Hui Xia, Hao-Tian Wu. Reversible data hiding in medical image for contrast
enhancement of ROI.
Rohit Verma and Jahid Ali, “A comparative study of various types of image noise and
efficient noise removal techniques,” International journal of advanced research
in computer science and software engineering, volume 3, issue 10, October 2013.
USF digital mammography home page,
http://marathon.csee.usf.ecu/Mammography/Database.html
www.Worldwidebreastcancer.com/wpcontent/uploads/2011/08/breastcancerstatsworld
wide.jp
JuCheng Yang, DongSun Park 2004 IEEE International Conference on Multimedia and
Expo (ICME)
Emma Regentova, Lei Zhang, Jun Zheng, and Gopalkrishna Veni. Proceedings of the
28th IEEE EMBS Annual International Conference New York City, USA, Aug
30-Sept 3, 2006
https://medium.com/analytics-vidhya/cnns-architectures-lenet-alexnet-vgg-googlenet-resnet-and-more-666091488df5