
ABSTRACT

Mammography is an examination available for the early detection of signs of breast
cancer, such as masses, calcifications, bilateral asymmetry and architectural
distortion. Because the number of human screeners is limited, computers play a key
role in detecting the first signs of cancer. The wide range of characteristics that
define abnormalities, and the fact that they often cannot be distinguished from the
surrounding tissue, complicate computer-aided detection and diagnosis of breast
abnormalities.

This thesis explores ways to use known image processing and machine learning
techniques for computer-aided breast cancer detection in mammography images, with
the aim of finding a potentially good method that assists the pathologist in making
decisions.
A concrete application is designed and implemented, including both preliminary image
processing and subsequent cancer detection using neural-network-based machine
learning algorithms. The application is evaluated on a set of mammograms, and the
results are presented in detail and discussed. This thesis is characterized by the
use of a combination of neural networks for the detection of breast cancer, as well
as the use of real breast cancer images obtained from the Digital Database for
Screening Mammography (DDSM).

Key words: Breast, Cancer, Computer-aided detection, Image analysis, Image
processing, Mammography, Medical imaging, Microcalcification, Neural networks.
CHAPTER 1

INTRODUCTION

Breast cancer is the most common form of cancer among women worldwide, affecting on
average 1.4 million people annually (Autier et al., 2010). It accounts for about one in
five cancers among women (Figure 1.1). According to the International Agency for
Research on Cancer (Ferlay, Shin, Bray, Forman, Mathers, & Parkin, 2010), it is also the
most common form of cancer death, accounting for one in eight to ten deaths. Each year,
more than 150,000 women worldwide die from breast cancer (Ferlay, Shin, Bray, Forman,
Mathers, & Parkin, 2010). Only 1% of breast cancers occur in men (Gunderman, 2006).

Figure 1.1 – The rate of cancer among women in the total population

(Ferlay, Shin, Bray, Forman, Mathers, & Parkin, 2010).

Survival rates and prognosis vary significantly with the stage and type of the cancer.
Treatment is more efficient and effective when the cancer is detected at an early stage,
before it advances to a more severe stage.
Breast cancer mortality is very high compared with other types of cancer. Breast cancer
can be detected and diagnosed using imaging procedures such as diagnostic mammograms
(X-rays), ultrasound (sonography), magnetic resonance imaging and thermography. Imaging
for cancer screening has been investigated for more than four decades. Nevertheless,
biopsy is the only way to determine with certainty whether cancer is really present.
Among the biopsy techniques, fine needle aspiration, vacuum-assisted biopsy, core needle
biopsy and surgical (open) biopsy (SOB) are the most common. All of these techniques
involve collecting samples of cells or tissue, which are fixed on a glass microscope
slide for later staining and microscopic examination.
Histopathological analysis is a highly time-consuming specialist task that depends on the
experience of the pathologist and is affected by factors such as fatigue and loss of
attention. There is a pressing need for computer-assisted diagnosis (CAD) to relieve the
workload on pathologists by filtering out clearly benign regions, so that the specialists
can focus on the cases that are more difficult to diagnose. A lot of effort has therefore
been dedicated to the field of breast cancer (BC) histopathology image analysis, and
specifically to the automated classification of images as benign or malignant, for
computer-aided diagnosis.
Pathology labs have begun to move towards a completely digital workflow, with the use of
digital slides being the main component of this process. This was made possible by the
introduction of scanners for whole slide imaging (WSI) that enable cost-effective
production of digital representations of glass slides. In addition to many benefits in
terms of storage and browsing of the image data, one of the advantages of digital slides
is that they enable the use of image analysis techniques that aim to produce quantitative
features to help pathologists in their work. An automatic mitosis detection method with
good performance could reduce both the subjectivity and the tediousness of manual mitosis
counting, for instance by independently producing a mitotic activity score or by guiding
the pathologist to the region of the tissue with the highest mitotic activity. Automatic
feature selection is achieved using a novel feature weighting scheme. Feature weights
depend on the importance of a feature, and features with low weights are discarded. A new
generation of the forest (a new population of trees) is then created, which operates on
the reduced feature set. During the test phase, each tree of the trained forest votes
with its corresponding weight to perform the classification.
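The weighting and voting idea described above can be illustrated with a minimal Python sketch. It is not the implementation of the cited scheme: the function names, the threshold value and the toy weights are assumptions chosen for illustration, and each tree is represented only by the label it predicts.

```python
from collections import defaultdict


def select_features(feature_weights, threshold):
    """Return the indices of features whose weight is at least `threshold`.

    Features with lower weights are discarded, mirroring the reduced
    feature set used by the next generation of trees.
    """
    return [i for i, w in enumerate(feature_weights) if w >= threshold]


def weighted_vote(tree_predictions, tree_weights):
    """Combine per-tree class predictions using per-tree weights.

    Each tree adds its weight to the class it predicts; the class with
    the largest accumulated weight wins.
    """
    totals = defaultdict(float)
    for label, weight in zip(tree_predictions, tree_weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical feature weights: features 1 and 3 fall below the threshold.
kept = select_features([0.40, 0.05, 0.30, 0.01], threshold=0.10)

# Hypothetical forest of three trees with unequal weights.
decision = weighted_vote(["benign", "malignant", "malignant"], [0.9, 0.3, 0.4])
```

In this toy case the single heavily weighted tree outvotes the two lighter ones, which is the behaviour that distinguishes weighted voting from a simple majority vote.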

1.1 SCOPE OF STUDY


This thesis proposes a technique for detecting breast cancer in mammography images. The
technique consists of two main parts. In the first part, image processing techniques are
used to prepare the images for feature and pattern extraction. In the second part, the
extracted features are used as input to neural network and logistic regression machine
learning algorithms. These are supervised machine learning algorithms that are trained
with input images.
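As a rough illustration of the second part, the sketch below trains a logistic regression classifier on already extracted feature vectors using plain gradient descent. It is a minimal sketch under stated assumptions, not the system built in this thesis: the two features, their values, the learning rate and the number of epochs are all hypothetical, and the image processing stage is assumed to have produced the feature vectors beforehand.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(features, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = p - y  # derivative of the log-loss w.r.t. the score
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (suspicious) if the predicted probability is at least 0.5, else 0."""
    score = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if sigmoid(score) >= 0.5 else 0


# Hypothetical, normalized feature vectors (e.g. region brightness, margin
# irregularity) with labels 0 = normal and 1 = abnormal.
toy_features = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
                [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
toy_labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic_regression(toy_features, toy_labels)
predictions = [predict(weights, bias, x) for x in toy_features]
```

A neural network classifier would take the same feature vectors as input; logistic regression is shown here only because it fits in a few lines.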
The main objectives of this work can be summarized as follows:
1- Implementation of a new CAD system for breast cancer diagnosis.
2- Utilizing image processing techniques and supervised machine learning in the new
proposed model.
3- Increasing the accuracy of breast cancer detection.
4- Reducing the false positive probability in the breast cancer diagnosis process.
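Objectives 3 and 4 refer to accuracy and to the false positive rate. The following sketch shows how these two quantities are conventionally computed from true and predicted labels; the function names and the example labels are assumptions made for illustration and are not results from this thesis.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, true negatives and false negatives.

    Labels are 1 for cancer and 0 for no cancer.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Fraction of all cases that were classified correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def false_positive_rate(y_true, y_pred):
    """Fraction of truly negative cases that were flagged as positive."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return fp / (fp + tn)


# Hypothetical ground truth and predictions for eight mammograms.
example_true = [1, 1, 0, 0, 0, 1, 0, 0]
example_pred = [1, 0, 0, 1, 0, 1, 0, 0]
```

With these example labels, six of the eight cases are classified correctly and one of the five negative cases is flagged, so the accuracy is 0.75 and the false positive rate is 0.2.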

1.2 PROJECT ORGANIZATION


This thesis comprises five chapters, organized as follows:
Chapter 1 provides an introduction to breast cancer, breast cancer diagnosis, and the
techniques used for breast cancer assessment. It also introduces the problem statement
and the objectives of the research.
Chapter 2 presents a literature review and summarizes the most recent and related work.
Background information about breast cancer is given. The abnormalities that can be
visible on mammograms, and the advantages and limitations of mammography for detecting
breast cancer, are discussed. Computer-aided detection and diagnosis (CAD) is defined,
and the benefits of using it for breast cancer analysis are discussed.
Chapter 3 covers current image processing and machine learning techniques, including
enhancement and segmentation methods. It gives quantitative performance measures for
image segmentation and enhancement.
Chapter 4 presents the experimental results and discussion, including the experiments
performed to evaluate the neural network and logistic regression algorithms for the
detection of breast cancer over a number of images.
Chapter 5 concludes the thesis and gives recommendations for future work.
CHAPTER 2

LITERATURE REVIEW

BREAST ANATOMY AND PATHOLOGIES


This chapter aims to demonstrate the importance of the breast cancer study and to provide
some fundamental knowledge on the breast structure and diseases. Thus, the anatomic
structure of the breast is introduced, along with a description of the different types of breast
cancer and some other diseases that affect the breast.

2.1. BREAST ANATOMY


In humans, the breasts are situated on the left and right sides of the upper ventral
region of the trunk, and each extends from the second rib above to the sixth rib below.
The female breasts correspond to two large hemispherical eminences, which contain the
mammary gland (Figure 2.1). This gland secretes milk when stimulated, which usually
corresponds to the period following childbirth. The mammary glands are modified sweat
glands. They occur in both males and females, but in the former they are only
rudimentary, apart from some peculiar circumstances (Gray, 2000), (Seeley, Stephens,
& Tate, 2004).

Figure 2.1 – Anatomy of the breast

(Seeley, Stephens, & Tate, 2004).


The surface of the breast is convex and has, just below the center, a small conical
prominence called the papilla or nipple. It is situated at about the level of the fourth
intercostal space. The base of the papilla is surrounded by an areola (Gray, 2000), which
has a slightly irregular surface due to the presence of rudimentary mammary glands, the
areolar glands, just under the surface (Seeley, Stephens, & Tate, 2004).
The mature female breast consists of glandular tissue, fibrous tissue, fatty tissue,
blood vessels, nerves and ducts. The breast has several lobes, usually 15 to 20 (Seeley,
Stephens, & Tate, 2004), which are composed of lobules. These consist of alveoli and
lactiferous ducts. The lactiferous ducts enlarge to form a small lactiferous sinus, which
accumulates milk during lactation. The milk leaves the breast through a few openings in
the nipple. The fibrous tissue lies over the entire surface of the breast and connects
the lobes together. The fatty tissue covers the surface of the gland, except for the
areola, and is located between the lobes. Usually this tissue is abundant and determines
the form and size of the gland (Gray, 2000), (Seeley, Stephens, & Tate, 2004).
The breast is held in place by the support of Cooper’s ligaments, which extend from the
fascia over the pectoralis major muscles to the skin over the mammary glands (Seeley,
Stephens, & Tate, 2004).
The weight and dimensions of the breast differ among individuals and at different periods
of life (Gray, 2000), (Seeley, Stephens, & Tate, 2004). The female breasts begin to
develop at puberty, stimulated by the hormones estrogen and progesterone of the female
monthly sexual cycle. Greater gland development occurs during pregnancy, when estrogen
levels rise as they are secreted by the placenta, and it increases even more after
delivery, when the glands are secreting milk to nourish the infant. The breasts become
atrophied in old age (Gray, 2000), (Guyton & Hall, 2000), (Seeley, Stephens, & Tate,
2004).
A child's breast consists primarily of ducts with dispersed alveoli and is similar in
females and males. A teenage breast mostly consists of fibrous and glandular tissue. In
adulthood, fat replaces some of the fibrous and glandular tissue. During menopause, the
breast is primarily adipose tissue.
The breast is strongly influenced by hormones. Estrogens stimulate adipose deposition in
the breast and the growth of the mammary glands, as well as the initial development of
the lobules and alveoli of the breast. Progesterone and prolactin cause the final growth,
are responsible for the function of these structures, and affect the external appearance
of the mature female breast (Guyton & Hall, 2000).
During pregnancy, the concentrations of estrogens and progesterone rise. This causes
expansion and branching of the breast gland ducts and deposition of additional adipose
tissue. Prolactin is responsible for milk production (Gunderman, 2006), (Seeley,
Stephens, & Tate, 2004).

2.2. BREAST CANCER


The breast can be affected by many pathologies. Nevertheless, breast imaging is almost
entirely directed at breast cancer (Gunderman, 2006).
Like other cancers, breast cancer corresponds to a malignant growth which, in this case,
begins in the cells of the breast tissues. Under normal conditions, the cell division
cycle is controlled and ordered, allowing tissue formation, growth and regeneration.
When this control fails and the resulting mutations are not repaired, a tumour forms.
After its formation, the progression depends on the patient. However, early detection and
treatment are necessary to stop the cancer from advancing and to minimize the damage.
Breast cancer, like the majority of other cancers, can spread to other tissues
(metastasize), allowing the cancer to disseminate. When breast cancer is detected early,
this is avoided, which gives the patient a better prognosis.
The risk of breast cancer increases with age, and the majority of patients are over 50
years old (Gunderman, 2006). Other risk factors include a family history of breast
cancer, previous breast cancer, early menarche, late menopause, obesity, nulliparity,
chest radiation exposure, abnormal cells in fibrocystic disease, and hormone replacement
therapy (Gunderman, 2006), (Seeley, Stephens, & Tate, 2004).
Because of these risks, some countries have developed screening programs in which women
over 40, or with a higher risk of developing breast cancer, undergo mammographic exams at
regular intervals.

2.2.1. Breast cancer lesions


Breast cancer has some characteristic lesions, such as microcalcifications, masses and
architectural distortions. Asymmetry between the breasts can also be an indicator of
breast cancer.
Microcalcifications are small lesions, typically in the range 0.05 to 1 mm. At these
sizes, they are relatively difficult to detect. They are bright and have various sizes,
shapes and distributions, and in some cases low contrast, due to a small intensity
difference between the suspicious areas and their surroundings. Another reason they are
difficult to detect is their proximity to the surrounding tissues. In dense tissue,
suspicious regions are almost invisible as a result of tissue superimposition. Some
anatomical structures, such as fibrous strands, breast borders or hypertrophied lobules,
resemble microcalcifications in the mammographic image (Sankar & Thomas, 2010).
There is a high correlation between the presence of microcalcifications and breast
cancer, especially when the microcalcifications appear in clusters. Accurate detection
of microcalcifications is therefore essential to the early detection of the majority of
breast cancers (Li, Liu, & Lo, 1997). Generally, larger, round or oval calcifications of
uniform size have a higher probability of being benign, while smaller, irregular,
polymorphic and branching calcifications of heterogeneous size and morphology have a
higher probability of being malignant (Arnau, 2007).

Figure 2.2 – Commonly seen microcalcification types on mammographic images

(Gunderman, 2006).
Masses appear as dense regions of different sizes and properties. They can be circular,
oval, lobular or irregular/spiculated, and their margins can be (Arnau, 2007), Figure 2.3
and Figure 2.4:

• circumscribed, with well-defined and distinctly demarcated borders;
• obscured, hidden by superimposed or adjacent tissue;
• micro-lobulated, with undulating circular borders;
• ill-defined, with poorly defined, scattered borders;
• spiculated, with radiating thin lines.

Figure 2.3 – Morphologic spectrum of mammographic masses

(from (Bruce & Adhami, 1999)).

Figure 2.4 – Mass examples with diverse shapes and borders (from (Arnau, 2007)).

Depending on their morphology, masses have different probabilities of malignancy.
Ill-defined and spiculated margins have a higher probability of malignancy (Arnau, 2007).
A benign process is usually associated with circular or oval masses. However, the great
variability in the appearance of masses is an obstacle to correct mammographic analysis
(Mini & Thomas, 2003). Some masses can contain microcalcifications, as in Figure 2.5. A
craniocaudal view of the right breast shows benign vascular calcifications as well as two
well-circumscribed masses containing “popcorn” calcifications, typical of involuting
fibroadenomas (Gunderman, 2006).

Figure 2.5 – A craniocaudal view of the right breast (from (Gunderman, 2006)).

2.2.2. Types of Breast Cancer


Breast cancer can be categorized according to the breast tissue in which the cancer
originated (glands, ducts, fat tissue or connective tissue) and according to the extent
of its spread (non-invasive/in situ or invasive/infiltrating) (Gunderman, 2006).
Carcinoma in situ is an early form of carcinoma (an invasive malignant tumor arising from
mutated epithelial cells), detected at an early stage and without invasion of the
surrounding tissues. A cancer is called infiltrating when cells that originated in the
glands or ducts spread to healthy surrounding tissue. This type of cancer can have a
variety of appearances (Eastman, Wald, & Crossin, 2006).
Both in situ and infiltrating cancers can be ductal or lobular, depending on the location
of the cancer. Ductal carcinoma arises from the epithelial cells that line the breast
milk ducts. In ductal carcinoma in situ, the cancer cells have not penetrated the
basement membrane of the ducts. In mammographic images it is characterized by fine
microcalcifications; however, the degree of cancer infiltration is generally not visible
(Gunderman, 2006). Infiltrating ductal carcinoma is the most frequent type of breast
cancer, accounting for about 80% of cases. An irregular tumor mass is characteristic of
this type of cancer in mammography.
Lobular carcinoma originates in the milk glands, in the terminal lobules. Approximately
10% of breast cancers are lobular carcinomas (Gunderman, 2006). Lobular carcinoma in situ
is difficult to identify in mammography.

Figure 2.6 – Invasive Ductal Carcinoma showing microlobulated borders
and microcalcifications (from (Kaushak, 2007)).

When cancer spreads to new parts of the body through the blood and lymph circulation,
this is known as metastasis. When ductal carcinoma invades the skin of the nipple, it is
called Paget's disease. Inflammatory breast cancer is an aggressive tumor that has
invaded the dermal lymphatics (Gunderman, 2006), and it represents about 1 to 4% of
breast cancers. This cancer usually presents with breast inflammation.
Medullary breast carcinoma arises from the stromal cells of the breast (Gunderman, 2006).
Mucinous carcinoma is associated with large amounts of cytoplasmic mucin (Gunderman,
2006). These last two types of cancer generally have a lower tendency to metastasize
than ductal and lobular carcinomas.

2.2.3 Other breast pathologies


Some alterations in the breast are not malignant. To analyze breast cancer lesions, it is
necessary to consider other similar lesions caused by different pathologies and benign
processes, in order to distinguish them.
Fibroadenoma is a benign breast tumor that usually develops in young women, below 30
years of age. This tumor remains in place for some time but never progresses to a
malignant cancer. It can grow rapidly due to the proliferation of stromal and epithelial
cells. In mammography, it appears as an oval mass with smooth borders, which may contain
some calcifications (Eastman, Wald, & Crossin, 2006).
A cyst is a closed structure that has a distinct membrane and may contain air, fluid or
semi-solid material. It generally arises from dilated glandular ducts or lobules. In rare
cases cancer may occur inside the cyst, usually when the fluid inside contains some
blood. Some cysts may contain calcium and develop calcification within their walls.
Mammographically, a cyst is a rounded mass with a well-defined contour (Eastman, Wald, &
Crossin, 2006). After a breast injury with hematoma and fat tissue necrosis, an oil cyst
may occur. It is physically similar to a simple cyst but has a density equivalent to that
of fat tissue (Eastman, Wald, & Crossin, 2006).
Mastitis is inflammation of the breast tissue due to an infection. In plasma cell
mastitis, there are solid, dense, regular rod-shaped calcifications in the glandular
ducts of the breast (Eastman, Wald, & Crossin, 2006).
Mammary dysplasia, also called fibrocystic disease or mastopathy, is a common condition
caused by an excess of estrogen or a heightened tissue response to estrogens. It is
characterized by three major conditions: formation of fluid-filled cysts, hyperplasia of
the breast duct system, and deposition of fibrous connective tissue (Eastman, Wald, &
Crossin, 2006).

2.3 DIFFERENT IMAGING TECHNIQUES FOR BREAST CANCER DETECTION


Breast cancer may be detected by means of a careful study of the clinical history,
physical examination, and imaging with either mammography or ultrasound. Nevertheless, a
definitive diagnosis of a breast mass can only be established through fine-needle
aspiration (FNA) biopsy, core needle biopsy, or excisional biopsy. In breast cancer
detection, mammography, MRI and PET scans can provide valuable information for diagnosis
(Qi, Hairong, and Nicholas A. Diakides, 2009). MRI and PET are not widely adopted for
various reasons, including high cost, complexity and accessibility issues. MRI cannot
detect all cancers (e.g. breast cancers indicated by microcalcifications) (“MRI Scan”
http://www.cancerquest.org/mri-advantages-and-disadvantages.html). MRI cannot always
distinguish malignant tumours from benign disease (for example, breast fibroadenomas),
which can lead to false-positive results. The test is painless, but the patient needs to
lie still inside the narrow tube. The patient may be asked to hold her breath or keep
still during certain parts of the test. In PET scanning and ultrasound, results may flag
a potential area of concern that is not malignant. These false-positive results can lead
to further procedures, including biopsies, that are unnecessary. Although ultrasound is
often used in an attempt to avoid an invasive diagnostic procedure, it sometimes cannot
determine whether or not a mass is malignant, and a biopsy will be recommended.
Calcifications that are visible on mammograms are not visible on ultrasound scans, which
prevents early diagnosis of the portion of breast cancers that begin with
calcifications.
Like other medical diagnosis systems, X-rays are used as a diagnostic tool in mammography
for the examination of the human breast. These examinations are recorded as images, which
are then viewed by radiologists for any possible abnormality (Yasmin et al, 2013). The
abnormalities in mammograms include microcalcifications (MCs), masses, architectural
distortion, and asymmetry (Li et al, 2016). Mortality reduction is the major objective of
mammography screening. Chemotherapy may be more effective in the early stages, so both
are likely to contribute jointly to the reduction of breast cancer mortality (Sylvia et
al, 2011).
Mammography aids early detection, plays a significant part in tumour treatment, and
allows a faster recovery for most patients. Mammography is a specific type of imaging
that uses a low-dose X-ray system and high-contrast, high-resolution film to examine the
breasts. There are two types of mammography: digital and film. Digital mammography is
better than film mammography because the radiation dose can be reduced by up to half
while still detecting breast cancer, whereas in film mammography the standard radiation
dose cannot be reduced (Al-Shamlan et al, 2010). In mammography, double reading has
proved to be very effective, reducing the number of false-negative results by 4% to 14%
and improving breast cancer detection rates (Calas et al, 2012). Mammograms can depict
most of the significant changes of breast disease. The primary radiographic signs of
cancer are masses (their density, site, shape and borders), spicular lesions and
calcification content. These features may be extracted using various detection techniques
(Tomar et al, 2009).

2.3.1 Ultrasonography
Ultrasound imaging, or 'sonography', tends to be used in breast cancer screening as a
'second look' or follow-up application. The typical indications for breast ultrasound
are a suspicious finding on mammography or the need for further diagnostic evaluation of
a palpable lesion felt on a clinical breast exam.
However, the fact that a woman is sent for a follow-up sonogram is no reason for
heightened anxiety about breast cancer. Ultrasound is particularly helpful in
distinguishing between a solid mass and a fluid-filled cyst, which is what the majority
of breast lesions turn out to be.
Ultrasound is also useful in finding small lesions that are too small to be felt at a
clinical exam.

Fig: 2. The ultrasound image shows a dark area just beneath the skin. It is not clear
what it is. In this case, inflammation is more likely than cancer.

Ultrasound imaging uses high-frequency sound waves to form an image, called a
'sonogram'. The sound waves it uses are harmless; they pass through the breast and
bounce back, or 'echo', from the various tissues to form a picture of the internal
structures. An unexpected 'echo' means that there is a solid nodule of some kind within
the tissue. No radiation is involved in ultrasound imaging, which makes it a preferred
method of diagnostic imaging for pregnant women.

2.3.1.1 Ultrasound and breast density


Women with high breast density are often screened with ultrasound, since mammograms of
women with dense breast tissue tend to be harder to interpret. For this reason,
ultrasound is often the first diagnostic imaging method for women under 35.
Whether ultrasound can stand alone as a screening method, rather than being combined with
mammography or MRI, is still a subject of debate. At present, there is no study that
conclusively shows that ultrasound screening alone lowers mortality rates for breast
cancer, unlike mammography, which does.
It used to be believed that a screening mammogram with no abnormalities seen, together
with dense breast tissue, was not reason enough for a breast ultrasound. Increasingly,
however, when mammograms show extremely dense tissue, ultrasound is being added.

2.3.1.2 Combining ultrasound for Breast Cancer with MRI or biopsy


The combination of ultrasound with magnetic resonance imaging has been found to be
particularly effective in the follow-up evaluation of lesions found on mammography. The
detail of MRI greatly aids diagnostic and treatment decisions. Ultrasound is also very
useful in guiding the needle during a follow-up biopsy.
Fig: 3. Ultrasound image of a breast mass.

2.3.1.3. Screening and Detection Rates Using Ultrasound


The rate at which lesions found by screening are shown to be malignant breast cancer is
actually low. The rate of detecting malignancies using mammography (X-ray) is around 5
cancers per 1000 women screened. When ultrasound is used alone in breast cancer screening
to determine malignancy, the rate is slightly lower. This tends to suggest that
mammography is slightly more reliable.
However, it must be emphasized that even the combination of ultrasound, and even MRI,
with mammography cannot completely exclude the possibility of breast cancer. Up to 3% of
women with negative mammograms and sonograms of suspicious lesions may still have breast
cancer.

2.3.1.4 Cost and Practical Considerations of Ultrasound for Breast Cancer Screening
Ultrasound imaging is not really any more expensive than mammography, and in many ways it
is more convenient. The problem is that all suspicious ultrasound findings are
inconclusive and end up being referred for biopsy anyway. This must be weighed against
the cost and effectiveness of mammographic screening as a whole, which tends to give
better evidence of the nature of a lesion with respect to the need for a biopsy.

2.3.1.5 Common classifications of breast ultrasound results


Abnormal results of an ultrasound tend to fall into four categories. A radiologist can
usually tell if the echoes are caused by benign fibrous nodules (breast fibrocystic
disease, papillomas, fibroadenomas). Of somewhat greater concern are sonographic signs of
a complex cyst. The third and fourth informal categories, with an increased likelihood of
malignancy, are suspicious lesion and lesion highly suggestive of cancer.

2.3.1.6 What can an ultrasound reveal about a potential breast cancer lesion?
A sonogram gives a good indication of the liquid or solid nature of a lesion, or perhaps
a combination of the two. Liquid masses (cysts) tend to be darker in color and
homogeneous. An experienced radiologist develops a feel for what the different textures
of a sonogram tend to represent. The shape of a lesion and its margin (the
characteristics of its edges) are also quite evident on sonograms. This helps determine
whether a lesion is malignant or benign (malignant lesions tend to have spiked edges).
Breast cancer lesions also tend to be somewhat irregular in shape, but not always. Benign
fibroadenomas are usually round or oval. However, ultrasound is not a definitive test,
and tissue analysis through biopsy is usually required. Even when ultrasound suggests the
presence of a fibrous nodule or complex cyst, a biopsy is still justified. Up to 15% of
these kinds of growths turn out to be malignant.
Any echoes on the sonogram (a change in the sound on its way back compared with on its
way in) indicate that a solid nodule of some kind has interrupted the path of the sound
wave. Interpreting the solid nodules on a breast sonogram requires considerable
expertise, and can give further clarity as to the benign or malignant nature of the
lesion (Steven Halls, June 22, 2018).

2.3.2 Magnetic resonance imaging of the Breast


Magnetic resonance imaging is the most attractive alternative to mammography for
detecting some cancers that might otherwise be missed. It also helps specialists
determine how to treat breast cancer patients by identifying the stage of the disease
(Singh and Mohapatra, 2011).

Figure 2.8: Breast cancer, indicated by the arrow, shown in an MRI scan

2.3.3 Image Mammography


X-ray mammography is commonly used in clinical practice for diagnostic and screening
purposes (Singh and Mohapatra, 2011).

Figure 2.9: Film-screen mammography


2.4 BREAST CANCER ABNORMALITIES
The fact that mammography is a highly sensitive but often nonspecific examination must be
appreciated by both the clinician and the radiologist to avoid false expectations.
True-positive rates are cited as 10-30% of mammographic abnormalities, depending on how
aggressive and experienced the radiologist is. Mammography gives a false-negative rate of
10-15% when a breast cancer is palpable (Vikhe and Thool, 2016). Most (80-85%) breast
cancers can be seen on a mammogram as a mass, a cluster of calcifications, or a
combination of both. Detecting a mass smaller than 2 mm would be ideal, but
realistically it is difficult to see most tumors smaller than 5 mm. Large, non-calcified
masses may be hard to perceive in the dense glandular breast, which is common in women of
childbearing age. The threshold for detection of a cancer is variable and depends on the
radiographic abnormality, the fat-to-glandular tissue ratio of the breast, the technical
quality of the examination, and the diligence of the radiologist (Tomar et al, 2009).

2.4.1 Masses
In mass detection, every mass ROI contains a single mass. Mass detection can be assessed by
a free-response ROC (FROC) analysis: for several thresholds on the base nesting depth, the
percentage of mass ROIs intersecting a detection (i.e. the sensitivity) and the number of
detections intersecting no mass ROI (i.e. the number of false positives) per pathological
mammogram are reported (Quellec et al., 2016). In general, malignant lesions have a greater
radiographic density than an equal volume of fibroglandular breast tissue. Lucent lesions
are usually benign. Isolated solitary non-cystic densities should be measured carefully,
and when larger than 8 mm in diameter they should be considered for biopsy. The more
irregular the shape of the lesion, the more likely the mass is to be cancerous. An
irregular or spiculated margin is the single most important feature indicating that the
mass is malignant. Less invasive cancers may have only slightly irregular or even
well-circumscribed margins; papillary, medullary and colloid carcinomas tend to remain well
circumscribed. The margin of the mass is sharply defined in a benign lesion such as a
fibroadenoma or a cyst, yet a mass with this appearance may be malignant in about 7% of
cases. An intramammary lymph node is usually well circumscribed, small, and usually found
in the upper outer quadrant of the breast. Asymmetric breast tissue can nearly always be
distinguished from a true mass by mammographic evaluation. Distinguishing a small stellate
mass from an early invasive breast cancer is often extremely subtle, so optimal technique
and careful interpretation are vital. Benign stellate masses, for instance post-biopsy
scarring and fat necrosis, often have a characteristic appearance. One should follow up
rather than perform a biopsy on most nonspecific circumscribed masses, because they are
commonly seen on mammograms and have less than a 5% chance of being malignant. A typical
example of a probably benign lesion is a non-calcified mass or nodule with well-defined
margins. Markowitz reported a study of 593 non-calcified masses larger than 1.0 cm seen by
mammography and found that about 2% of them proved to be malignant. More than half of these
masses were shown to be simple cysts on aspiration or on ultrasound evaluation. The
etiology of most non-calcified solid masses can be determined by palpation or by
ultrasound-guided fine-needle biopsy.
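
The FROC operating point described at the start of this subsection can be sketched in a few
lines. This is only an illustration under stated assumptions: ROIs and detections are taken
to be axis-aligned boxes (x0, y0, x1, y1), and the names boxes_intersect and froc_point are
hypothetical helpers, not part of the cited method. A full FROC curve would repeat the
computation for each detection threshold.

```python
def boxes_intersect(a, b):
    """True if two boxes (x0, y0, x1, y1) overlap (assumed box format)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def froc_point(cases):
    """Compute one FROC operating point.

    cases: list of (mass_rois, detections) pairs, one pair per mammogram.
    Returns (sensitivity, false positives per mammogram).
    """
    total_rois = hits = false_positives = 0
    for mass_rois, detections in cases:
        total_rois += len(mass_rois)
        # A mass ROI counts as found if at least one detection intersects it.
        hits += sum(
            1 for roi in mass_rois
            if any(boxes_intersect(roi, d) for d in detections)
        )
        # A detection that intersects no mass ROI is a false positive.
        false_positives += sum(
            1 for d in detections
            if not any(boxes_intersect(d, roi) for roi in mass_rois)
        )
    return hits / total_rois, false_positives / len(cases)


# Two illustrative mammograms (made-up coordinates).
cases = [
    ([(10, 10, 50, 50)], [(40, 40, 60, 60), (200, 200, 220, 220)]),
    ([(100, 100, 150, 150)], [(0, 0, 20, 20)]),
]
sensitivity, fp_per_image = froc_point(cases)
```

In this toy example one of the two masses is hit and two detections touch no mass, so the
point lies at a sensitivity of 0.5 with 1.0 false positive per mammogram.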

2.4.2 Micro-Calcifications
Large, dense calcifications of a benign, involuting fibroadenoma, when associated with a
lobulated mass, are diagnostic of a benign process; when an involutional process is still
developing, it may be indistinguishable from a malignant tumor, and a biopsy needs to be
performed. Calcification may occur in fat necrosis or in the walls of a cyst. Punctate,
pointed, irregular calcifications that are heterogeneous in size within a mass, or fine,
branching calcific deposits filling ducts, are strong indicators of cancer. Half of
malignant masses have calcifications that can be seen by mammography. About 20-35% of
radiographically identified clustered calcifications without a mass will be malignant, and
the vast majority of these will represent non-invasive cancer. Mammography is highly
sensitive in detecting breast calcifications; however, its specificity in distinguishing
benign from malignant calcifications is only 50-60% (McKenna, 1994). A cluster of a few
calcifications should not be considered clearly benign; rather, it should be considered at
low risk of being malignant and followed at 4 to 6 month intervals. When more than a few
calcifications are present without an associated mass, the decision to perform a biopsy may
be more difficult. The smallest calcifications, as small as 0.2 mm, may be more suspicious
than a 2-mm calcification. Benign and malignant calcifications may coexist in the same
breast. Calcifications associated with fibrocystic changes in the breast often mimic those
found in malignancy, leading to an unavoidable false positive report. Skin and vascular
calcifications must be distinguished from mammary lesions, and biopsies need not be
performed on them. Experience in interpreting mammograms is necessary to minimize the
number of biopsies that have benign results.

2.4.3 Architectural Distortion of breast cancer


Architectural distortion is a common mammographic presentation of non-palpable breast
cancer, representing nearly 6% of abnormalities detected on screening mammography.
Although its prevalence on mammography is small compared with calcification or an obvious
mass, architectural distortion is also harder to diagnose because it can be subtle and
variable in presentation. In fact, architectural distortion is a common finding in
retrospective assessments of false-negative mammography and may represent the earliest
manifestation of breast cancer. Moreover, some studies suggest that early detection of
architectural distortion may be associated with a more significant improvement in prognosis
than earlier detection of calcifications. Various computerized methods have been developed
to raise the detection rate of architectural distortion; however, they remain imperfect (a
detection rate of less than one half with one method). Gaur et al. (2013) review the benign
and malignant causes of architectural distortion and illustrate its various manifestations
in an effort to reduce undiagnosed architectural distortion on screening mammography.

2.4.4 Breast Density


A mammogram report also includes an assessment of breast density. Breast density is based
on how fibrous and glandular tissues are distributed in the breast, versus the amount of
the breast made up of fatty tissue. Dense breasts are not abnormal, but they are linked to
a higher risk of breast cancer. Dense breast tissue can also make it difficult to find
cancers on a mammogram. Still, experts do not agree on which other tests, if any, should be
done in addition to mammograms in women with dense breasts who are not in a high breast
cancer risk group (based on gene mutations, breast cancer in the family, or other factors).

2.4.5 Objectives
This thesis proposes a technique for detecting breast cancer in mammography images. The
technique consists of two main parts. In the first part, image processing techniques are
used to prepare images for feature and pattern extraction. In the second part, the
extracted features are used as input to a neural network and a logistic regression machine
learning algorithm. This is a supervised machine learning algorithm that is trained with
input images.
The core objectives and aims of this work can be summarized as follows:
1- Application of a new CAD system for breast cancer diagnosis.
2- Exploiting image processing techniques and supervised machine learning in the new
proposed model.
3- Increasing the accuracy of breast cancer detection.
4- Decreasing the false positive probability in the breast cancer diagnosis process.

2.5 RELATED/PREVIOUS WORKS


The human body can fail in many ways. Breast cancer is among the most dangerous and harmful
of them, particularly for women, and it is one of the most common cancers among women
around the globe. Rejani and Selvi (2009) developed an algorithm for detecting tumors in
mammograms. The study addressed two problems: enhancing tumors that have weak contrast, and
segmenting the features that characterize the tumor. The detection procedure involves
mammogram enhancement, segmentation of the tumor area, extraction of features from the
segmented tumor area, and use of a support vector classifier to classify the extracted
features. In this methodology, the signal-to-noise ratio was increased to make the features
more visible by altering the colors. 75 mammographic images were tested and the result
showed 88.75% sensitivity.

Acha et al. (2006) proposed a method for identifying clusters of micro-calcifications in
mammography images. The authors utilized Daubechies wavelets (db2, db4, db8 and db16). They
claimed an accuracy of 80%; however, they did not justify either the selection of features
or how neural networks were used in the decision making.

Kiyan et al. (2004) applied statistical neural networks, together with signal processing,
to a breast cancer database in order to raise the objectivity and precision of breast
cancer diagnosis.

Nabil et al. (2008) implemented a genetic algorithm, an artificial immune system and a
hybrid of the two, and tested them on the Wisconsin breast cancer diagnosis (WBCD) problem
in order to create a fuzzy rule system for breast cancer diagnosis. The hybrid algorithm
generated a fuzzy system that reached the maximum classification ratio earlier than the
other two.

Sheppard, Sok and Averdunk (2004) discussed a method for segmenting images of porous and
composite materials obtained from X-ray tomography. The technique involves a three-stage
approach: anisotropic diffusion, application of an unsharp mask sharpening filter, and
application of watershed and active contour methods. In the first stage, the structures in
the image are preserved.

Rejani and Selvi (2009) present a tumor detection algorithm for mammograms. The proposed
system focuses on the solution of two problems. One is how to detect tumors as suspicious
regions with a very weak contrast to their background, and the other is how to extract
features which categorize tumors. The tumor detection technique follows the scheme of
mammogram enhancement, segmentation of the tumor area, extraction of features from the
segmented tumor area, and use of an SVM (Support Vector Machine) classifier.

Regentova et al. (2006) explored the performance of statistical modeling of digital
mammograms by means of the wavelet-domain Hidden Markov Tree model (WHMT) for its inclusion
in a computer-aided diagnostic prompting system for detecting microcalcification (MC)
clusters. In their study, the system combines: gross segmentation of mammograms to obtain
the breast region, elimination of pepper-type noise, block-wise wavelet transform of the
breast signal and likelihood calculation, image segmentation, and post-processing for
retaining MC clusters. FROC curves are obtained for all mammograms of the mini-MIAS
database that contain MC clusters. 100% of true positive cases are detected by the system
at 2.9 false positives per case. In summary, the authors designed a method for MC detection
based on a segmentation algorithm that uses WHMT modeling. By using the db3 filter, a
particular tree structure and weighting of likelihood values in the MLE method, the ability
of the HMT model is greatly enhanced for this purpose. The method performs well within the
framework of the system they developed. A realistic evaluation of the whole system has been
performed with the mini-MIAS database. The targeted accuracy for true positive cases is
100%, because the system is intended for diagnostic prompting. False positive cases, which
are kept at as low a rate as possible, can be further discriminated by radiologists. Thus,
the objective is to provide radiologists with all true MC clusters.

Microcalcifications are an early sign of breast cancer and appear as isolated bright spots
in mammogram images, which are hard to detect because of their tiny size. Yang and Park
(2004) presented a morphological bandpass filter (MBF) to detect microcalcifications. It is
implemented by opening the original image twice, with two different structuring elements,
and subtracting one opened image from the other, which decomposes the image information
into a sub-band image of interest in which microcalcifications tend to show up. A series of
MBFs is tuned for the detection task, and a binary image containing the microcalcification
regions of interest (ROI) is obtained. Test results show that the proposed technique with
these bandpass filters can detect ROIs with a true positive rate of 93.07% and a false
positive rate of 4.34%. Compared with the well-known discrete wavelet transform (DWT)
method, this technique is more accurate in the positions and sizes of microcalcifications.
Narang et al. (2012) present an overview of breast cancer classification by means of an
adaptive resonance neural network (ARNN) and a feed-forward artificial neural network. The
performance of the networks is assessed on the Wisconsin breast cancer data set with
numerous training algorithms.

Gayathri et al. (2013) used numerous machine learning algorithms (supervised learning,
unsupervised learning, semi-supervised learning, transduction, and learning to learn) and
techniques to improve the accuracy of breast cancer prediction.

It is evident from previous studies that more research is needed to improve the accuracy of
early detection of breast cancer using a realistic mammogram dataset.
CHAPTER 3

IMAGE PROCESSING METHODS AND MACHINE LEARNING

3.1 BACKGROUND
This chapter discusses and explains the image processing techniques and the machine
learning methods (artificial neural network and logistic regression) that were used in
carrying out this work.

3.2. IMAGE PROCESSING


An image is an array of square pixels arranged in rows and columns. More formally, it is a
function of two spatial variables whose amplitude gives the brightness of the image at each
point (Russ et al., 1994). An image can also be said to contain sub-images, which are
referred to as regions (Jain, 1989).
Thanks to advances in technology, it has become possible to manipulate multi-dimensional
signals with systems ranging from simple digital circuits to advanced parallel computers
(Weeks, 1996). Such manipulation of images is done for three purposes: image processing,
image analysis and image understanding. Image processing involves the study of algorithms
that take an image as input and return an image as output. The elements of image processing
are image display and printing, image editing, image manipulation, image enhancement,
feature detection and image compression. Image processing is used in many applications.
Examples are remote sensing, medical imaging, non-destructive evaluation, forensic studies,
textiles, material science and the military.
There are two types of image processing: analogue image processing and digital image
processing.
3.2.1 Analogue Image Processing
Analogue image processing is the alteration of an image by electrical means. Examples of
analogue image processing applications are printouts, photographs, and television images.
The brightness of a television image is represented by the voltage level of the signal,
which varies with the signal's amplitude. The signal can be changed electrically, and hence
the image displayed on the television is altered. The brightness and contrast controls of
the television set adjust the amplitude and reference of the video signal.

3.2.2 Digital Image Processing


In digital image processing, computers are used to manipulate digital images (Adams, 2007).
The image goes through three stages: pre-processing, enhancement and information
extraction. Digital image processing is the processing of two-dimensional images by a
digital computer. It is carried out in order to improve the pictorial information for
better output quality, which entails the enhancement and restoration of degraded images.
Digital image processing is also done for automatic machine interpretation, which involves
segmentation and description (Jähne, 2002).
The advantages of digital image processing over analogue image processing are its
versatility, repeatability, and the preservation of original data precision.

3.3 ROLE OF IMAGE PROCESSING IN CAD SYSTEMS.


The use of a computer system to support medical diagnosis is known as computer-aided
diagnosis (CAD). This process of diagnosis uses different methods and techniques, which
include image processing, big data analysis, databases, and machine learning.
Most CAD systems that aim to help with the detection of breast cancer use mammogram images
or other images as input. However, images have to be in the appropriate digital format
first, in order to be useful as input to a CAD system. Consequently, the first role of
image processing is often simply to digitize an existing mammogram or MRI that is stored in
analogue format. This is often only the first step, as subsequent image processing is
performed to first enhance the quality of the image and then to identify, separate or
otherwise mark elements or features of interest on the image (Bankman, 2009).

Figure 4.1: Usual stages of CAD system operation, showing the significance of image
processing in the course of the first stages (Saad, 2012).

3.4. IMAGE PROCESSING TECHNIQUES


The different image processing techniques are: image representation, image preprocessing,
image enhancement, image restoration, image analysis, image reconstruction, and image data
compression (Schneider, Rasband and Eliceiri, 2012).

3.4.1 Image Analysis


Image analysis involves making measurements on an image for the purpose of describing it.
Simple examples are sorting different parts on an assembly line or reading a label on a
grocery item. A more sophisticated image analysis system uses the measurement results to
make more specialized decisions, such as controlling an aircraft (Schneider, Rasband and
Eliceiri, 2012).

3.4.2 Image Segmentation


In image segmentation, the image is divided into smaller parts. How far the image is
subdivided depends on the problem to be solved; in other words, the part of the image that
is of interest is isolated (K. M. M. Rao, Medical Image Processing, Proc. of Workshop on
Medical Image Processing and Applications, 8th October 1995, NRSA, Hyderabad-37).
Likewise, image segmentation is the division of an image into regions or categories, which
correspond to different objects or parts of objects. Each pixel in an image is allocated to
one or more of these categories. Image segmentation is often the second step after image
enhancement. Unlike enhancement, the main purpose of image segmentation is not to improve
the overall quality of the image, but rather to identify and delineate structures and
regions of interest in the image (Bankman, 2009). Quite often, the analysis of an image for
breast cancer detection focuses on one or several areas that are in some way anomalous and
thus a sign of a possible benign or malignant tumor. Such areas are commonly called regions
of interest (ROI) (Bankman, 2009).

3.4.3 Restoration of Image


Image restoration aims to remove or reduce degradation in an image. This involves noise
filtering, correction of geometric distortion, and deblurring of the image (Jhon R. Jenson,
2003).

3.4.4 Enhancement of Image


Image enhancement is usually the first step of any CAD system algorithm. Its main purpose
is to make the subsequent analysis of the image easier or more accurate. The concrete
technique or techniques chosen for image enhancement typically depend on the quality and
type of image used as input, as well as on the concrete CAD algorithm that is used and its
requirements. One of the most popular techniques used for image enhancement is a form of
noise reduction or contrast enhancement that helps bring out features of interest in the
image, both for a human observer and for subsequent automated analysis by CAD systems. The
following sections discuss ways of enhancing images and removing noise from them using
filtering methods.
3.4.5 Quantification of image
In some cases, image quantification is applied to the ROI obtained through segmentation.
The purpose of image quantification is to further characterize and potentially classify the
elements of interest in the ROI (Bankman, 2009).
For instance, many CAD systems that investigate potential cases of breast cancer will
attempt to classify observed masses and calcifications based on features such as shape, size
and type of tissue as reflected in the colors obtained in the MRI or mammogram. One
important aspect of image quantification is that its results depend on the quality of the
image processing performed during the previous steps, but also on the fit between the
choice of features, quantification method and final aim (Bankman, 2009).

3.4.6 Image registration


Image registration is a step that occurs only occasionally in the analysis of mammograms.
Essentially, the aim of image registration is to align two distinct images as well as
possible in order to allow easy comparison of similar features between them (Bankman,
2009).
This is most useful for breast cancer detection techniques that rely on the natural
similarity between the two breasts of the same woman and thus attempt to discover potential
anomalies by looking for suspicious differences between the two images (Bankman, 2009).

3.4.7 Image visualization


Image visualization is a relatively new addition to the set of image processing techniques
frequently used by CAD systems for breast cancer detection (Bankman, 2009). Essentially,
image visualization aims to provide clear visual representations of the results of an
automated investigation, so that a human practitioner attempting to make a diagnosis can
take advantage of both his or her experience and the computational power of the machine.
Broadly speaking, image visualization thus supports human examination of mammograms or
other images in order to detect potential signs of breast cancer (Bankman, 2009).
3.4.8 Image compression
Image compression is the reduction of the size of image data for archiving or for
transmission over a network. One example of a compression method is that of the Joint
Photographic Experts Group (JPEG), which uses a compression technique based on the Discrete
Cosine Transform (DCT).
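
The idea behind DCT-based compression can be sketched on a single row of pixels. The code
below is a simplified illustration and not the JPEG algorithm itself: it assumes a 1-D,
unnormalised DCT-II, and it simply discards high-frequency coefficients instead of
quantising 8x8 blocks. The function names dct2, idct2 and compress are hypothetical.

```python
import math


def dct2(x):
    """Unnormalised DCT-II of a 1-D sequence."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi / n * (t + 0.5) * k) for t in range(n))
        for k in range(n)
    ]


def idct2(coeffs):
    """Inverse of dct2 (a scaled DCT-III)."""
    n = len(coeffs)
    return [
        coeffs[0] / n
        + 2.0 / n * sum(
            coeffs[k] * math.cos(math.pi / n * (t + 0.5) * k)
            for k in range(1, n)
        )
        for t in range(n)
    ]


def compress(x, keep):
    """Reconstruct x from only its `keep` lowest-frequency DCT coefficients."""
    coeffs = dct2(x)
    truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    return idct2(truncated)


# A smooth row of grey levels (made-up values) is approximated from 3 of 8 coefficients.
row = [100, 102, 104, 107, 110, 112, 113, 114]
approx = compress(row, 3)
max_error = max(abs(a - b) for a, b in zip(row, approx))
```

Smooth image regions concentrate their energy in the first few coefficients, which is why
dropping the rest changes the reconstructed row only slightly.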

3.4.9 Compression of image, communication and storage


The compression, storage and communication of digital images are increasingly important
given the vast amounts of medical data that are currently stored and further acquired
through such images. In addition, efficient and reliable compression and communication are often
crucial for the functioning of complex, distributed systems that involve several computers
in different locations and access data from databases that are effectively stored in various
locations across a network or even several networks.
Particular challenges include the need for efficient compression that maintains the
important information contained in medical images, and efficient storage solutions that
make it easy for users to find, share and retrieve subsequently the images they need
(Bankman, 2009).

3.5 DIFFERENT IMAGE-PROCESSING METHODS USED IN THIS THESIS


We used several image processing techniques, starting with filtering to reduce noise and
wavelet transforms to extract features from images. This section discusses the techniques
used in our work.

3.5.1 Wiener-Filter techniques


The process of enhancing or transforming images is known as image filtering. Image
filtering lets users apply various effects to images; these effects may suppress some
features of an image and emphasize others.
The Wiener filter is a filtering technique used to produce an estimate of a desired or
target random process by linear time-invariant filtering of an observed noisy process,
assuming known stationary signal and noise spectra, and additive noise. The Wiener filter
minimizes the mean square error between the estimated random process and the desired
process.
It is one of the most important techniques for the removal of blur in images due to linear
motion or unfocused optics. From a signal processing standpoint, blurring due to linear
motion in a photograph is the result of poor sampling. Each pixel in a digital
representation of the photograph should represent the intensity of a single stationary
point in front of the camera. Unfortunately, if the shutter speed is too slow and the
camera is in motion, a given pixel will be an amalgam of intensities from points along the
line of the camera's motion. This is a two-dimensional analogy to

G(u,v) = F(u,v) H(u,v)

where G is the Fourier transform of the observed (blurred) image, F is the Fourier
transform of an "ideal" version of the image, and H is the blurring function. In this case
H is a sinc function: if three pixels in a line contain information from the same point on
an image, the digital image will seem to have been convolved with a three-point boxcar in
the time domain. Ideally one could reverse-engineer an estimate of F, written Fest, if G
and H are known. This technique is known as inverse filtering.
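
As a hedged illustration of the relation G = F H, the sketch below blurs a 1-D signal with
a three-point boxcar and then recovers it in the frequency domain. It uses the common
textbook form of the Wiener estimate, Fest = conj(H) G / (|H|^2 + K), where the constant K
stands in for the noise-to-signal power ratio; this constant-K form and all function names
are assumptions made for the example. With K = 0 the expression reduces to plain inverse
filtering. A naive DFT keeps the example dependency-free; real images would use a 2-D FFT.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def circular_blur(signal, kernel):
    """Circular convolution of signal with kernel (models linear motion blur)."""
    n = len(signal)
    return [
        sum(kernel[m] * signal[(t - m) % n] for m in range(len(kernel)))
        for t in range(n)
    ]


def wiener_deconvolve(blurred, kernel, k_ratio):
    """Estimate the sharp signal as conj(H) * G / (|H|^2 + K)."""
    n = len(blurred)
    padded = list(kernel) + [0.0] * (n - len(kernel))
    g_spec = dft(blurred)
    h_spec = dft(padded)
    f_est = [
        h.conjugate() * g / (abs(h) ** 2 + k_ratio)
        for g, h in zip(g_spec, h_spec)
    ]
    return [value.real for value in idft(f_est)]


sharp = [0.0, 0.0, 9.0, 0.0, 0.0, 3.0, 0.0, 0.0]
boxcar = [1.0 / 3.0] * 3                     # three-point boxcar blur
blurred = circular_blur(sharp, boxcar)
restored = wiener_deconvolve(blurred, boxcar, 1e-9)
```

In this noise-free toy case a tiny K recovers the signal almost exactly. With real noise, a
larger K trades sharpness for stability at frequencies where |H| is small.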

3.5.2 The wavelet transform techniques


The wavelet transform is one of several mathematical transforms that are applied to signals
to obtain information that is not readily available in the raw signal (Erickson, 2005). One
of the most common transforms is the Fourier transform.
In medical imaging, wavelets have been used for many applications, including feature
extraction. An example of this is the extraction of micro calcifications from mammograms.
By using wavelets, a mammogram can be decomposed into high and low frequency
components. According to Erickson (2005), micro-calcifications appear as small bright spots
on a mammogram, and are represented by the high-frequency components of the decomposition.
Suppression of the low-frequency components when the image is reconstructed leads to the
enhancement of micro-calcifications, allowing them to be segmented from the mammograms.
This was the technique used by Wang and to enhance micro-calcifications
(Erickson, 2005).
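
The enhancement idea above can be sketched on one image row. The example assumes a
single-level Haar wavelet, chosen here only because it is the simplest wavelet; the cited
work may have used a different wavelet family and more decomposition levels. All function
names are hypothetical.

```python
import math


def haar_dwt(signal):
    """One-level Haar DWT of an even-length sequence: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[k] + signal[k + 1]) / s for k in range(0, len(signal), 2)]
    detail = [(signal[k] - signal[k + 1]) / s for k in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def suppress_low_frequencies(signal):
    """Reconstruct the signal from its detail (high-frequency) coefficients only."""
    approx, detail = haar_dwt(signal)
    return haar_idwt([0.0] * len(approx), detail)


# A row of grey levels (made-up values) with one small bright spot at index 5.
row = [50, 50, 50, 50, 52, 90, 52, 52]
enhanced = suppress_low_frequencies(row)
```

The smooth background is removed entirely, while the isolated bright pixel survives as the
largest value in the reconstructed row, which makes it easier to segment.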
For this research, the goal of using wavelets is to uncover features in the image data that
could be used to distinguish normal samples from tumour samples. Discrete Wavelet
Transform (DWT) was applied to the breast image patterns in order to extract features that
would be useful in classifying the pattern.

3.5.2.1 Discrete Wavelet Transform (DWT)


The DWT is a type of wavelet transform that captures location (time) information in
addition to frequency information. In the Fourier transform, by contrast, the transformed
signal carries only frequency information in the frequency domain. This has made the DWT an
important tool for extracting information from signals (Erickson, 2005).

The discrete wavelet transform allows a signal to be sampled at discrete points, which
leads to efficient computation. Discrete wavelets are scaled and translated in discrete
steps, which is achieved by using integer rather than real-valued scale and translation
parameters (Erickson, 2005).
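
As a worked illustration, assume the common dyadic convention (an assumption; the text
above only states that scale and translation take integer steps). Setting the scale to
s = 2^j and the translation to tau = k 2^j, with integers j and k, turns the scaled and
translated wavelet into its discrete form:

$$\psi_{s,\tau}(t) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{t-\tau}{s}\right) \;\longrightarrow\; \psi_{j,k}(t) = 2^{-j/2}\,\psi\!\left(2^{-j}t - k\right), \qquad j,k \in \mathbb{Z}$$

Each increase of j by one doubles the width of the wavelet, and k shifts it by whole
multiples of that width.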

3.5.2.2 The wavelet Transforms vs. Fourier Transforms


The Fourier transform is one of the best known and understood mathematical transforms.
Therefore, it makes sense to discuss the similarities and differences between the Fourier
transform and the wavelet transform. The 1-D continuous Fourier transform can be written as
follows:

$$F(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t)\, e^{-i\omega t}\, dt$$

Here f represents the signal in the time domain, t is time, F represents the signal in the
frequency domain, and ω is frequency. The formula for the continuous wavelet transform is
written as follows:

$$C(s,\tau) = \int_{-\infty}^{\infty} f(t)\, \psi^{*}_{s,\tau}(t)\, dt$$

Again, f is the signal in the time domain, t is time, C is the wavelet coefficient, s is
scale, and τ is translation. ψ_{s,τ} is the scaled and translated version of the mother
wavelet ψ, and * denotes complex conjugation (Erickson, 2005).
Both the Fourier and wavelet transforms allow a temporal signal to be analysed for its
frequency content. The Fourier transform is a linear transform that represents a function
in a basis of sine and cosine functions. Similarly, the wavelet transform is a linear
transform that represents a function in a basis of wavelet functions. Finally, for both the
Fourier transform and the wavelet transform, an inverse transform returns the original
signal (Erickson, 2005).

3.5.3 Edge Detection


Edge detection refers to the process of identifying and locating sharp discontinuities in
an image. These discontinuities are abrupt changes in pixel intensity which characterize
boundaries of objects in a scene. According to Raman Maini and Himanshu Aggarwal, classical
methods of edge detection involve convolving the image with an operator (a 2-D filter),
which is constructed to be sensitive to large gradients in the image while returning values
of zero in uniform regions.
Although many ways of performing edge detection exist, they can be grouped into two
categories:

3.5.3.1 First Order Derivative based Edge Detection (Gradient method):


This method is gradient based and uses the first-order derivative. The magnitude of the
computed gradient gives the edge strength, and the gradient direction is always
perpendicular to the direction of the image edge. If I(i, j) is the input image, then the
image gradient is given by the following formula:

$$\nabla I(i,j) = \hat{i}\,\frac{\partial I(i,j)}{\partial i} + \hat{j}\,\frac{\partial I(i,j)}{\partial j}$$

where $\partial I(i,j)/\partial i$ is the gradient in the i direction and
$\partial I(i,j)/\partial j$ is the gradient in the j direction.

Writing $G_i$ and $G_j$ for these two components, the gradient magnitude can be calculated
by the formula:

$$|G| = \sqrt{G_i^{2} + G_j^{2}}$$
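
The gradient magnitude and direction can be sketched as follows. The example assumes that
the partial derivatives are approximated by central differences on interior pixels, which
is one of several possible discretisations, and gradient_magnitude_direction is a
hypothetical helper name.

```python
import math


def gradient_magnitude_direction(image):
    """Return (magnitude, direction) maps using central differences.

    direction is the angle of the gradient vector in radians, measured from
    the i (row) axis towards the j (column) axis. Border pixels are left at 0.
    """
    rows, cols = len(image), len(image[0])
    magnitude = [[0.0] * cols for _ in range(rows)]
    direction = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            g_i = (image[i + 1][j] - image[i - 1][j]) / 2.0
            g_j = (image[i][j + 1] - image[i][j - 1]) / 2.0
            magnitude[i][j] = math.hypot(g_i, g_j)   # sqrt(g_i**2 + g_j**2)
            direction[i][j] = math.atan2(g_j, g_i)
    return magnitude, direction


# A vertical step edge: dark columns on the left, bright columns on the right.
image = [[0, 0, 0, 10, 10] for _ in range(4)]
magnitude, direction = gradient_magnitude_direction(image)
```

For this image the magnitude is zero in the uniform region and 5.0 next to the step, and
the gradient points along j, perpendicular to the edge, which runs along i.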

3.5.3.2 Second Order Derivative Based Edge Detection (Laplacian based Edge Detection):
This method searches for zero crossings in the second derivative of the image to find
edges. An edge has the one-dimensional shape of a ramp, and calculating the derivative of
the image can highlight its location. In the gradient filter family of edge detection
filters, a pixel location is declared an edge location if the value of its gradient exceeds
a certain threshold. As mentioned earlier, edges have higher pixel intensity values than
their surroundings. Therefore, once a threshold is set, the gradient value can be compared
with the threshold value, and an edge is detected whenever the threshold is exceeded. In
addition, when the first derivative is at a maximum, the second derivative is zero. As a
result, an alternative way of finding the location of an image edge is to locate the zeros
in the second derivative of the image.

(a) ideal step edge, (b) first-order derivative and (c) second-order derivative.
This approach uses the zero-crossing operator, which acts by locating zeros of the second
derivatives of the image I(i, j). The differential operator used in zero-crossing edge
detectors is the Laplacian:

$$\nabla^{2} I = \frac{\partial^{2} I}{\partial i^{2}} + \frac{\partial^{2} I}{\partial j^{2}}$$
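
A minimal sketch of zero-crossing detection is given below. It assumes the standard
4-neighbour discrete approximation of the Laplacian and looks for sign changes along one
row only. Both function names are hypothetical, and practical detectors normally smooth the
image first because the second derivative amplifies noise.

```python
def laplacian(image):
    """4-neighbour discrete Laplacian; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = (
                image[i - 1][j] + image[i + 1][j]
                + image[i][j - 1] + image[i][j + 1]
                - 4 * image[i][j]
            )
    return out


def zero_crossings_in_row(row):
    """Indices j where row[j] and row[j + 1] have opposite signs."""
    return [j for j in range(len(row) - 1) if row[j] * row[j + 1] < 0]


# A smooth ramp edge repeated on three rows (made-up grey levels).
profile = [0, 0, 1, 4, 8, 11, 12, 12]
image = [list(profile) for _ in range(3)]
lap = laplacian(image)
edges = zero_crossings_in_row(lap[1])
```

The Laplacian changes sign between columns 3 and 4, exactly where the ramp is steepest,
that is, where the first difference (8 - 4 = 4) is largest.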
Thresholding allots a range of pixel values to any object of interest. It works best with
greyscale images that utilize the whole range of greyscale. For the image I (i,j), the
threshold image g (i, j) is defined as,
1 if I (i, j )T
g (i, j )  Where T is the threshold value.
0 if I (i, j )  T
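The thresholding rule can be illustrated with a short Python sketch. It assumes a strict comparison (a pixel is set to 1 only when its value exceeds T); the function name threshold and the sample values are hypothetical.

```python
def threshold(image, t):
    """Binary image g: 1 where the pixel value exceeds t, otherwise 0."""
    return [[1 if value > t else 0 for value in row] for row in image]

# Hypothetical gradient magnitudes around a vertical edge.
gradient = [[0.0, 5.0, 5.0, 0.0],
            [0.0, 5.0, 5.0, 0.0]]
binary = threshold(gradient, 2.5)
```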

3.5.3.3 Edge Detection Techniques


The Roberts, Sobel and Prewitt operators are classified as classical operators. They are simple and easy to use, but extremely sensitive to noise. The classical operators and the Canny operator belong to the category of edge detection based on a first-order derivative (gradient method). The Marr-Hildreth edge detector, in contrast, is a second-order operator that uses the Laplacian to take the second derivative of an image.

3.5.3.4 Advantages and Disadvantages of Edge Detector

As edge detection is a fundamental step in computer vision, it is necessary to identify the true edges to get the best results from the matching process. That is why it is important to choose the edge detector that best suits the application. In this regard, we first present some advantages and disadvantages of the edge detection techniques, grouped by category, in the table below.
Operator | Advantages | Disadvantages
Classical (Sobel, Prewitt, Kirsch, …) | Simplicity; detection of edges and their orientations | Sensitivity to noise; inaccurate
Zero crossing (Laplacian, second directional derivative) | Detection of edges and their orientations; having fixed characteristics in all directions | Responding to some of the existing edges; sensitivity to noise
Laplacian of Gaussian (LoG) (Marr-Hildreth) | Finding the correct places of edges; testing a wider area around the pixel | Malfunctioning at the corners, curves and where the gray level intensity function varies; not finding the orientation of the edge because of using the Laplacian filter
Gaussian (Canny, Shen-Castan) | Using probability for finding the error rate; localization and response; improving signal to noise ratio; better detection, especially in noisy conditions | Complex computations; false zero crossing; time consuming
3.5.3.4 Laplacian of Gaussian

The Laplacian of an image highlights regions of rapid intensity change and is therefore often used for edge detection. The Laplacian is often applied to an image that has first been smoothed with a Gaussian filter in order to reduce its sensitivity to noise. The operator normally takes a single gray-level image as input and produces another gray-level image as output.

The Laplacian L(x, y) of an image with pixel intensity values I(x, y) is given by:

$$L(x,y) = \frac{\partial^2 I}{\partial x^2} + \frac{\partial^2 I}{\partial y^2}$$

3.5.3.5 Zero Crossing


The zero crossing detector looks for places in the Laplacian of an image where the value of the Laplacian passes through zero, i.e. points where the Laplacian changes sign. Such points often occur at edges in images (i.e. points where the intensity of the image changes rapidly), but they also occur at places that are not so easy to associate with edges. The zero crossing detector is therefore best seen as a kind of feature detector rather than as a specific edge detector. Zero crossings always lie on closed contours, so the output of the zero crossing detector is usually a binary image with single-pixel-thick lines showing the positions of the zero crossing points.
The starting point for the zero crossing detector is an image that has been filtered using the Laplacian of Gaussian filter. The zero crossings that result are strongly influenced by the size of the Gaussian used for the smoothing stage of this operator. As the smoothing increases, fewer and fewer zero crossing contours are found, and those that remain correspond to features of larger and larger scale in the image.
In this research, the zero crossing LoG method is used because it combines both operators: the Gaussian operator reduces the noise and the Laplacian operator detects the sharp edges in the image.
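The zero crossing idea can be sketched in a few lines of Python. The sketch assumes that the input has already been smoothed (the Gaussian stage is omitted), uses the common 4-neighbour discrete Laplacian, and marks a pixel when its right or lower neighbour has the opposite sign. The function names and the test ramp are hypothetical and are not taken from the Matlab implementation used in this thesis.

```python
def laplacian(image):
    """4-neighbour discrete Laplacian; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = (image[i - 1][j] + image[i + 1][j] +
                         image[i][j - 1] + image[i][j + 1] - 4 * image[i][j])
    return out

def zero_crossings(lap):
    """Mark pixels whose right or lower neighbour has the opposite sign."""
    rows, cols = len(lap), len(lap[0])
    marks = [[0] * cols for _ in range(rows)]
    for i in range(rows - 1):
        for j in range(cols - 1):
            if lap[i][j] * lap[i][j + 1] < 0 or lap[i][j] * lap[i + 1][j] < 0:
                marks[i][j] = 1
    return marks

# Hypothetical smoothed vertical edge (a ramp from 0 to 10).
ramp = [[0, 0, 2, 8, 10, 10] for _ in range(5)]
marks = zero_crossings(laplacian(ramp))
```

The Laplacian is positive on the dark side of the ramp and negative on the bright side, so the sign change marks a one-pixel-wide line at the centre of the edge.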
3.6 Convolutional Neural Network (CNN) Architectures

A Convolutional Neural Network (CNN, or ConvNet) is a special kind of multi-layer neural network, designed to recognize visual patterns directly from pixel images with minimal preprocessing. The following image is a visual representation of the model. The ImageNet project runs an annual software competition, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), in which software programs compete to correctly classify and detect objects and scenes. This section reviews the CNN architectures of the top ILSVRC competitors.

3.6.1 LeNet-5 (1998)

LeNet-5 is a pioneering 7-level convolutional network developed by LeCun et al. in 1998. It classifies digits and has been used by several banks to recognize handwritten numbers on checks scanned into 32x32 pixel greyscale input images. Processing higher resolution images requires larger and more numerous convolutional layers, so this technique is limited by the availability of computing resources.

3.6.2 AlexNet (2012)


In 2012, AlexNet significantly outperformed all the prior competitors and won the challenge by reducing the top-5 error from 26% to 15.3%. The second-place entry, which was not a CNN variation, had a top-5 error rate of around 26.2%.
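The top-5 error quoted for the ILSVRC entries counts a prediction as wrong only when the true label is not among the five classes with the highest scores. A minimal Python sketch of this metric follows; the function name top_k_error and the score values are hypothetical.

```python
def top_k_error(score_rows, true_labels, k=5):
    """Fraction of samples whose true label is not among the k highest scores."""
    misses = 0
    for scores, label in zip(score_rows, true_labels):
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(true_labels)

# Hypothetical class scores for 2 samples over 6 classes.
scores = [[0.10, 0.30, 0.05, 0.20, 0.15, 0.20],
          [0.50, 0.10, 0.10, 0.10, 0.10, 0.10]]
labels = [2, 0]
error = top_k_error(scores, labels, k=5)
```

In this example the first sample is a miss, because its true class has the lowest of the six scores, and the second is a hit.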
The network had an architecture very similar to LeNet by Yann LeCun et al. but was deeper, with more filters per layer and with stacked convolutional layers. It consisted of 11x11, 5x5 and 3x3 convolutions, max pooling, dropout, data augmentation, ReLU activations and SGD with momentum. ReLU activations were attached after every convolutional and fully connected layer. AlexNet was trained for 6 days simultaneously on two Nvidia GeForce GTX 580 GPUs, which is why the network is split into two pipelines. AlexNet was designed by the SuperVision group, consisting of Alex Krizhevsky, Geoffrey Hinton, and Ilya Sutskever.
3.6.3 ZFNet (2013)
Not surprisingly, the ILSVRC 2013 winner was also a CNN, which became known as ZFNet. It achieved a top-5 error rate of 14.8%, which is already about half of the previously mentioned non-neural error rate. This was mostly achieved by tweaking the hyper-parameters of AlexNet while maintaining the same structure, with additional deep learning elements as discussed earlier in this thesis.
GoogLeNet/Inception (2014)
The winner of the ILSVRC 2014 competition was GoogLeNet (a.k.a. Inception V1) from Google. It achieved a top-5 error rate of 6.67%. This was very close to human-level performance, which the organizers of the challenge were now forced to evaluate. As it turns out, this was rather hard to do and required some human training in order to beat GoogLeNet's accuracy. After a few days of training, the human expert (Andrej Karpathy) was able to achieve a top-5 error rate of 5.1% (single model) and 3.6% (ensemble). The network used a CNN inspired by LeNet but implemented a novel element dubbed the inception module. This module is based on several very small convolutions in order to drastically reduce the number of parameters. The network also used batch normalization, image distortions and RMSprop. The architecture is a 22-layer deep CNN, yet it reduced the number of parameters from 60 million (AlexNet) to 4 million.

3.6.4 VGGNet (2014)


The runner-up of the ILSVRC 2014 contest is named VGGNet by the community and was developed by Simonyan and Zisserman. VGGNet consists of 16 convolutional layers and is very appealing because of its very uniform architecture. It is similar to AlexNet but uses only 3x3 convolutions, with a large number of filters. It was trained on 4 GPUs for 2 to 3 weeks. It is currently the most popular choice in the community for extracting features from images. The VGGNet weight configuration is publicly available and has been used in many other applications and challenges as a baseline feature extractor. However, VGGNet consists of 138 million parameters, which can be a bit challenging to handle.
3.6.5 ResNet (2015)
Finally, at the ILSVRC 2015, the so-called Residual Neural Network (ResNet) by Kaiming He et al. introduced a novel architecture with "skip connections" and heavy use of batch normalization. Such skip connections are also known as gated units or gated recurrent units and have a strong similarity to recent successful elements applied in RNNs. Thanks to this technique they were able to train a neural network with 152 layers while still having lower complexity than VGGNet. It achieves a top-5 error rate of 3.57%, which beats human-level performance on this dataset.
In summary, AlexNet has two parallel CNN pipelines trained on two GPUs with cross-connections, GoogLeNet has inception modules, and ResNet has residual connections.

Summary Table
3.7 Machine Learning and Algorithm
In this work, supervised learning was used. Two different kinds of supervised machine learning algorithms were employed: logistic regression and neural networks. We compared the results of these two algorithms in our work. In the following sections we will introduce these types. Logistic regression and neural networks are also the standard methods employed for clinical classification problems.

3.7.1 Artificial Neural Networks (ANN)


ANN is the most well-known supervised machine learning algorithm. It has many types and families (Rahman et al., 2013).
Artificial neural networks are also considered a field of artificial intelligence. The development of the model was inspired by the neural architecture of the human brain, and ANNs have been applied in many disciplines including biology, statistics and mathematics, medicine and computer science. Recently, artificial neural networks have become a very popular model and have been applied to diagnose disease and predict the survival rate of patients (Raghavendra et al., 2011).
ANN modelling is a paradigm for computation and knowledge representation (Rahman et al., 2013). The most important advantage of ANN is the detection of complex and non-linear relationships between independent and dependent variables. The performance of a neural network is determined by a number of factors: the network weights, the choice of a correct training algorithm, the type of transfer function used, and the determination of the network size (Raghavendra, 2011).

3.7.2 Biological Neural Networks


A neuron (or nerve cell) is a special biological cell that processes information, as shown in Figure 3.2. It is composed of a soma, or cell body, and two kinds of outreaching tree-like branches: the axon and the dendrites. The cell body has a nucleus that contains information about hereditary traits and a plasma that holds the molecular equipment for producing the material required by the neuron (Jain, 1996).
In other words, a biological neuron is composed of a soma (the cell body) and dendrites, and cell bodies are connected through axons (Jain et al., 1996).
Figure 3.2b The human neuron (Jain, 1996)

Artificial Neural Networks (ANN) are used in three primary ways:

- As models of biological nervous systems and intelligence.
- As real-time adaptive signal processing controllers implemented in hardware for applications such as robots.
- As data analytic methods.

The underlying principle of neural network computing is the decomposition of the input-output relationship into a series of linearly separable steps using hidden layers.
There are three distinct steps in developing an ANN-based solution:
- Data transformation or scaling.
- Network architecture definition, in which the number of hidden layers, the number of nodes in each layer and the connectivity between the nodes are set.
- Construction of a learning algorithm in order to train the network.
Figure 3.3 Simple structure of a typical neural network (Rahman, 2013)

Figure 3.4 Structure of a typical neural network (Rahman, 2013)


Figure 3.4 above shows the architecture of a typical network, which is made up of an input layer, a series of hidden layers, and an output layer, with links between them. Nodes in the input layer represent potentially influential features that affect the network output and perform no computation, while the output layer comprises one or more nodes that produce the network output.

A hidden layer may have a large number of hidden processing nodes. A feed-forward back-propagation network passes the information from the input layer on to the output layer, compares the network output with the known target, and propagates the error term from the output layer back to the input layer, using a learning mechanism to adjust the weights and biases (Rahman et al., 2013).
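The forward pass described above can be sketched in plain Python for a network with one hidden layer. The sigmoid hidden activation, the softmax output, the layer sizes and all weight values are assumptions chosen for illustration; they are not the trained parameters of the model in this thesis.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def layer(inputs, weights, biases):
    """Weighted sum plus bias for every node of one layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    hidden = [sigmoid(v) for v in layer(inputs, w_hidden, b_hidden)]
    return softmax(layer(hidden, w_out, b_out))

# Hypothetical weights: 2 inputs, 2 hidden nodes, 3 output classes.
w_hidden = [[0.5, -0.4], [0.3, 0.8]]
b_hidden = [0.0, 0.1]
w_out = [[1.0, -1.0], [0.2, 0.4], [-0.5, 0.9]]
b_out = [0.0, 0.0, 0.0]
outputs = forward([1.0, 2.0], w_hidden, b_hidden, w_out, b_out)

target = [0, 0, 1]
# Error term at the output layer (softmax with cross-entropy): output - target.
output_error = [o - t for o, t in zip(outputs, target)]
```

Back-propagation would pass output_error backwards through w_out and w_hidden to compute the weight and bias updates; that step is omitted here to keep the sketch short.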

3.7.3 Pattern Recognition Network (patternnet)


patternnet(hiddenSizes,trainFcn,performFcn) creates a pattern recognition network, a feedforward network that can be trained to classify inputs according to target classes. The target data for pattern recognition networks should consist of vectors of all zero values except for a 1 in element i, where i is the class they are to represent.
patternnet(hiddenSizes,trainFcn,performFcn) takes these arguments:

hiddenSizes   Row vector of one or more hidden layer sizes (default = 10)
trainFcn      Training function (default = 'trainscg')
performFcn    Performance function (default = 'crossentropy')
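The target format described above (all zeros except for a 1 at the position of the class) can be illustrated with a short Python sketch. The ordering of the three class names is an assumption made for the example.

```python
def one_hot(class_index, num_classes):
    """Vector of zeros with a single 1 at position class_index."""
    return [1 if k == class_index else 0 for k in range(num_classes)]

# Assumed class order; any fixed order works as long as it is used consistently.
class_names = ["benign", "malignant", "normal"]
labels = ["normal", "benign", "malignant"]
targets = [one_hot(class_names.index(name), len(class_names)) for name in labels]
```

Each inner list is the target vector of one image. In Matlab the same vectors would be stored as the columns of the target matrix.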

3.7.4 Feedforward Neural Network (feedforwardnet)


feedforwardnet(hiddenSizes,trainFcn) creates a feedforward network. Feedforward networks consist of a series of layers. The first layer has a connection from the network input. Each subsequent layer has a connection from the previous layer. The final layer produces the network's output.
In contrast to a recurrent neural network, a feedforward neural network is the simplest form of artificial neural network: information moves in only one direction, forward from the input layer. There may be several hidden layers, but there is only one output layer and there are no cycles or loops in the network. Although simple, it can be used for any kind of input to output mapping: with one hidden layer and enough neurons in it, a feedforward network can fit any finite input-output mapping problem.
Specialized versions of the feedforward network include fitting (fitnet) and pattern recognition (patternnet) networks. A variation on the feedforward network is the cascade forward network (cascadeforwardnet), which has additional connections from the input to every layer, and from each layer to all following layers.
feedforwardnet(hiddenSizes,trainFcn) takes these arguments:
hiddenSizes   Row vector of one or more hidden layer sizes (default = 10)
trainFcn      Training function (default = 'trainlm')

and returns a feedforward neural network.


3.7.5 Fitnet Neural Network (fitnet)
fitnet(hiddenSizes,trainFcn) creates a function fitting network. Fitting is the process of training a neural network on a set of inputs so that it produces an associated set of desired outputs.
It involves constructing a network with the preferred hidden layers and training algorithm, and then training it with a set of training data. Once the neural network has fit the data, it forms a generalization of the input-output relationship, which makes it possible to use the trained network to generate outputs for inputs it was not trained on.
fitnet takes the following arguments:

hiddenSizes   Row vector of one or more hidden layer sizes (default = 10)
trainFcn      Training function (default = 'trainlm'; other options include 'trainbr', 'trainbfg', 'trainrp' and 'trainscg')

3.8 K-Fold Cross-Validation


In k-fold cross-validation, the data is first partitioned into k equally (or nearly equally) sized folds or segments. Then k iterations of training and validation are performed such that within each iteration a different fold of the data is held out for validation while the remaining k – 1 folds are used for learning. Data is commonly stratified prior to being split into k folds. Stratification is the process of rearranging the data so as to ensure that each fold is a good representative of the whole.
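The partitioning described above can be sketched as follows. This is a minimal illustration that deals the samples of each class round-robin into the folds without shuffling; the function name stratified_k_fold and the toy label list are hypothetical.

```python
from collections import defaultdict

def stratified_k_fold(labels, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    # Deal each class round-robin so every fold gets a similar class mix.
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    for held_out in range(k):
        validation = sorted(folds[held_out])
        training = sorted(i for f, fold in enumerate(folds)
                          if f != held_out for i in fold)
        yield training, validation

# Toy labels: 4 benign, 4 malignant and 8 normal samples.
labels = ["benign"] * 4 + ["malignant"] * 4 + ["normal"] * 8
splits = list(stratified_k_fold(labels, k=4))
```

With k = 4 every validation fold here holds one benign, one malignant and two normal samples, and each sample is used for validation exactly once.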

3.9 USE OF MATLAB FOR IMPLEMENTATION


Matlab is one of the most commonly used software packages in digital image processing. It has powerful and easy-to-use features for dealing with complex structures, arrays and images; for example, reading an image takes a single command, "imread". These functionalities are readily available in Matlab. The important steps in our image processing are filtering and wavelet transformation, followed by artificial neural network applications and logistic regression for prediction. The availability of all this functionality is the main reason for using a Matlab implementation.
We use Matlab to implement the algorithms because Matlab is a high-performance language for learning and research. It integrates computation, visualization and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation, and it has toolboxes for neural networks, signal processing, image processing and databases (Beucher, 1990).
The Matlab Image Processing Toolbox is a collection of functions that extend the capability of the Matlab numeric computing environment. The toolbox supports a wide range of image processing operations such as image analysis and enhancement, region of interest operations, linear filtering and filter design (Beucher, 1990).
CHAPTER 4

EXPERIMENTAL RESULTS AND DISCUSSIONS

This chapter discusses the implemented model. It starts by explaining the two main layers of the model: image pre-processing and machine learning. Subsequently, this chapter presents the experiments and the analytical results that were obtained. Detecting breast cancer from mammography images is a two-step procedure. In the first step, images are filtered, cropped and mapped into values that can be used as input to the second step.

In the second step, the input data is used to train the system to predict cancer in new images. Our model consists of these two steps or layers. In the following sections, the processing of the images and the results will be discussed.
Figure 4.1 Steps of our model
4.1 METHODOLOGY

Image preprocessing and features extraction

Dataset Overview

The dataset used in this research consists of three classes of breast mammograms: benign (376 images), malignant (453 images) and normal (632 images).

Figure 4.2 (a) Mammogram image of benign breast

Figure 4.2(a) is a mammogram image of a benign breast. Benign (non-cancerous) breast conditions are unusual growths or other changes in the breast tissue that are not cancer.
Figure 4.2 (b) Mammogram image of malignant breast

A malignant tumor, as seen in Figure 4.2(b) above, indicates breast cancer or another form of cancer. Malignant tumors are aggressive and spread to surrounding tissues. When such a tumor is identified, a biopsy may be recommended to determine how advanced and how severe the cancer is.
Figure 4.2 (c) Mammogram image of normal breast without cancer

Figure 4.2(c) is a mammogram of a normal fatty breast that does not have a lot of dense tissue. A mammogram searching for abnormal lesions, benign lumps, or breast cancer is more accurate when performed on women with non-dense breasts such as these.

The gray areas correspond to normal fatty tissue, while the white areas are normal breast tissue with ducts and lobes. Breast masses also appear white on a mammogram, but their color is typically more concentrated because they are denser than the other features of a normal breast, such as those seen in Figure 4.2(c).

The method is applied to a real clinical database of 1461 mammogram images. These mammograms are divided so that 90% are used to train and 10% to test our model.
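The 90%/10% division can be sketched as a random split of image indices. The thesis does not state how images were assigned to the two sets, so the shuffling, the seed and the function name split_indices are assumptions. For 1461 images, 90% rounds to 1315 training images and leaves 146 for testing, which matches the counts reported in Table 4.5.

```python
import random

def split_indices(n_images, train_fraction, seed=0):
    """Shuffle the indices 0..n_images-1 and cut them into train and test parts."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    n_train = round(n_images * train_fraction)
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = split_indices(1461, 0.9)
```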

4.1.1 Image Resizing

The mammogram images obtained from the screening centers measured 2744 by 4728 pixels. This was too large and required too much storage space, so we resized each image to a square of 256 by 256 pixels.
Figure 4.3 Resized image

4.1.2 Wiener filter

Noise was removed from the cropped image with a Wiener filter.

Our dataset images show fuzzy or blurred effects, which we treat as noise in our data. The Wiener filter is used to remove them.

In signal processing, the Wiener filter is a technique that estimates the target signal by linear time-invariant (LTI) filtering of a noisy signal. In Matlab, the Wiener filter is categorized as a de-blurring filter. Figure 4.4 shows the image before and after applying the Wiener filter. We can observe that the white lines are less blurred in Figure 4.4(B) than in Figure 4.4(A).
Figure 4.4 (A) Image before applying Wiener filter (B) Image after Wiener filter

Table 4.1 NN result without image preprocessing compared with Wiener filter

Experiment | trainscore | trainerror | testscore | testerror
NN result without preprocessing | 82% | 14% | 78% | 17%
NN result on Wiener filter | 82% | 14% | 78% | 17%

Figure 4.4(A) is the noisy image and Figure 4.4(B) is the result after the Wiener filter has been applied to remove constant-power additive noise (Gaussian white noise). A neighborhood of size 5 by 5 was used to estimate the local image mean and standard deviation.
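The following Python sketch shows a pixel-wise adaptive Wiener filter that estimates the local mean and variance in a square neighborhood, in the spirit of Matlab's wiener2 function. It assumes zero padding at the image border and estimates the noise power as the average of all local variances; the function names and the flat test image are hypothetical.

```python
def local_stats(image, size):
    """Local mean and variance in a size x size window (zero padding at the border)."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    area = size * size
    mean = [[0.0] * cols for _ in range(rows)]
    var = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = total_sq = 0.0
            for di in range(-half, half + 1):
                for dj in range(-half, half + 1):
                    r, c = i + di, j + dj
                    if 0 <= r < rows and 0 <= c < cols:
                        total += image[r][c]
                        total_sq += image[r][c] ** 2
            mean[i][j] = total / area
            var[i][j] = total_sq / area - mean[i][j] ** 2
    return mean, var

def adaptive_wiener(image, size=5):
    mean, var = local_stats(image, size)
    rows, cols = len(image), len(image[0])
    # Noise power is estimated as the average of all local variances.
    noise = sum(sum(row) for row in var) / (rows * cols)
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Small constant guards against division by zero in flat images.
            gain = max(var[i][j] - noise, 0.0) / max(var[i][j], noise, 1e-12)
            out[i][j] = mean[i][j] + gain * (image[i][j] - mean[i][j])
    return out

# Hypothetical flat image: the filter should leave interior pixels unchanged.
flat = [[10.0] * 7 for _ in range(7)]
smoothed = adaptive_wiener(flat, size=3)
```

Where the local variance is small compared with the noise estimate, the output approaches the local mean (strong smoothing); where the variance is large, as at edges, the pixel is left almost unchanged.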

4.1.3 Discrete wavelet transform

Image transformation

When a single-level two-dimensional wavelet transform is used, the output, known as the wavelet decomposition, consists of four matrices. The first matrix is the approximation coefficients matrix. The other three are the detail coefficient matrices (horizontal, vertical and diagonal).

The coefficients of three of these matrices (the approximation, CV and CH) have not been used in our work. The fourth matrix, CD (the diagonal detail coefficients), has been used as input data to our learning algorithm, as will be seen in the following sections. Figure 4.5 shows the output of the DWT for one of the images in our dataset.

Figure 4.5 DWT image showing the approximation (LL), horizontal details (HL), vertical details (LH) and diagonal details (HH) at one level.


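Table 4.2 NN result on DWT decomposition levels

NN result on the various DWT decomposition sub-bands (trainscore and trainerror are averages over training, AVGtrain; testscore and testerror are averages over testing, AVGtest)

Sub-band | trainscore | trainerror | testscore | testerror
LL Approximation coefficient | 73% | 17% | 65% | 21%
HL Horizontal | 57% | 30% | 44% | 33%
LH Vertical | 55% | 32% | 42% | 35%
HH Diagonal | 43% | 36% | 43% | 36%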

Figure 4.5 is the discrete wavelet decomposition of the denoised image. Using the DWT, the image is broken down into four parts: approximation image, horizontal detail, vertical detail and diagonal detail. In the high-frequency sub-bands, the gray level varies greatly between adjacent pixels, so edges appear in the image. In the low-frequency sub-band, the variations between adjacent pixels are smooth, so no edges or very few edges appear, and most of the information of the original image is retained (it is displayed as the approximation image).
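A single-level two-dimensional decomposition can be sketched with the Haar wavelet, which reduces to sums and differences over each 2 by 2 block of pixels. The choice of the Haar wavelet is an assumption made for illustration, since the wavelet family is not stated here, and the sign conventions of the detail coefficients may differ from those of Matlab's dwt2.

```python
def haar_dwt2(image):
    """Single-level 2-D Haar transform; image height and width must be even."""
    c_a, c_h, c_v, c_d = [], [], [], []
    for i in range(0, len(image), 2):
        row_a, row_h, row_v, row_d = [], [], [], []
        for j in range(0, len(image[0]), 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            row_a.append((a + b + c + d) / 2.0)  # approximation (LL)
            row_h.append((a + b - c - d) / 2.0)  # horizontal detail
            row_v.append((a - b + c - d) / 2.0)  # vertical detail
            row_d.append((a - b - c + d) / 2.0)  # diagonal detail (HH)
        c_a.append(row_a)
        c_h.append(row_h)
        c_v.append(row_v)
        c_d.append(row_d)
    return c_a, c_h, c_v, c_d

# Hypothetical 4 x 4 image with a smooth intensity gradient.
image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
c_a, c_h, c_v, c_d = haar_dwt2(image)
```

Each output matrix has half the height and width of the input. For this smooth test image the diagonal detail is zero everywhere, which illustrates why the detail sub-bands respond mainly to edges and texture.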
4.1.4 Zero crossing

Figure 4.6 HH Diagonal image of DWT and Zero crossing

Table 4.3 NN result on HH Diagonal of DWT and Zero crossing

NN result without image preprocessing compared with ZC (Edge)

Experiment | trainscore | trainerror | testscore | testerror
NN result without preprocessing | 82% | 14% | 78% | 17%
NN result on ZC (Edge) | 75% | 20% | 59% | 27%

The zero crossing algorithm was applied to the output of the DWT, as seen in Figure 4.6. Table 4.3 above shows that applying zero crossing on its own gave a train score of 75%, compared with 82% without preprocessing. Combined with the HH diagonal of the DWT, however, zero crossing raised the train score to 89% (Table 4.4), and it was therefore used in our model.
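Table 4.4 NN result on HH Diagonal of DWT and Zero crossing

NN result on DWT and ZC (Edge) (trainscore and trainerror are averages over training, AVGtrain; testscore and testerror are averages over testing, AVGtest)

Sub-band | trainscore | trainerror | testscore | testerror
LL Approximation coefficient | 75% | 20% | 59% | 27%
HL Horizontal | 84% | 16% | 65% | 26%
LH Vertical | 66% | 27% | 64% | 30%
HH Diagonal | 89% | 11% | 65% | 26%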

For the purpose of this thesis, the HH diagonal of the DWT was used because it produced a better result when tested in our NN model with the zero crossing algorithm.

Table 4.4 shows the NN results of the various discrete wavelet decomposition sub-bands when combined with the zero crossing used in this thesis. As seen in the table above, the average train score and average test score of the HH diagonal are 89% and 65% respectively, whereas the LL approximation coefficients, LH vertical and HL horizontal sub-bands have lower train scores and test scores no higher than 65%. These results show that the HH diagonal, having the highest values, produced the better result.

4.2 Experiment

488 patient cases were collected from the Digital Database for Screening Mammography (DDSM) (Michael Heath, Kevin Bowyer et al., 2001) at the University of South Florida (K. Bowyer) and Sandia National Laboratories (P. Kegelmeyer), and 1461 images were extracted from these cases. These images are used to train and test our model, with 90% used for training and 10% for testing. Each of these images has a resolution of 4696x3024 pixels. Of these images, 632 are normal, 376 are benign and 453 are malignant. We arranged them and created our result vector for training.
This vector has been used in our neural network model. Each image was resized to 256 by 256 pixels. The output matrices were fed as input to the Wiener filter to de-blur them. Subsequently, the output was used as input to the wavelet transform. Finally, zero crossing and data normalization end the preparation process of our images.

The output matrices of the preprocessing step, together with the training vector that we prepared, are used to train our machine learning models. Figure 4.8 shows the average train score of the neural network model after image processing. We can observe that the average value is 89%. Moreover, Figure 4.9 shows the best validation value, which is 0.26%. Finally, Figure 4.12 shows the performance of the gradient of the neural network.
4.3 Result

Table 4.5 Results obtained after neural network training

Results obtained after the following processes were performed on the pgm format image dataset. The three classes of breast mammography are Benign = 376, Malignant = 453 and Normal = 632. In every column trainFcn is trainscg (scaled conjugate gradient backpropagation) and the network is created with net = patternnet(hiddenLayerSize,trainFcn).

Setting or measure | Result without image processing | Result after applying only Wiener filter | Result after applying only DWT (HH Diagonal) | Result after applying only zero crossing | Result after image processing
HiddenLayerSize | 10 | 10 | 10 | 10 | 10
Total number of images (dataset) | 1461 | 1461 | 1461 | 1461 | 1461
Total number of trained images (dataset) | 1315 | 1315 | 1315 | 1315 | 1315
Total number of test images (dataset) | 146 | 146 | 146 | 146 | 146
Numiter = number of iterations | 10 | 10 | 10 | 10 | 10
averageTrainError | 14% | 14% | 11% | 20% | 11%
AVGtrainscore | 82% | 82% | 89% | 75% | 89%
averageTestError | 17% | 17% | 26% | 27% | 26%
AVGtestscore | 78% | 78% | 65% | 59% | 65%

To obtain the average train error and average train score, we ran the Matlab program for each configuration: result without image processing, result after applying only the Wiener filter, result after applying only the DWT, result after applying only zero crossing, and result after image processing (DWT and zero crossing). In this process we also have to consider the following: the hidden layer size, the three classes of breast mammography, the total number of images (1461), the total number of training images (1315), the total number of test images (146) and Numiter, the number of iterations.

The average test error was obtained by adding the average test result and the performance test result from the neural network, which resulted in the percentages shown in Table 4.5. The average test score was obtained by dividing the accumulated test score by the number of iterations (Numiter = 10 in our case), which resulted in the percentages shown in the table above. The differences between the average train error, average train score, average test error and average test score across the columns of the table are a result of the different outputs obtained from the different preprocessing techniques, which were used as the input to the neural network when running the Matlab program.

Figure 4.7 NN Training confusion matrix.


In the confusion matrix plot shown in Figure 4.7, the rows correspond to the predicted class (output class) and the columns to the real class (target class). The diagonal cells indicate for how many (and what percentage) of the examples the trained network correctly estimates the classes of the observations. In other words, they indicate in what percentage of cases the true and predicted classes match. The off-diagonal cells indicate where the classifier made mistakes. The column on the far right of the plot shows the precision of each predicted class, while the row at the bottom of the plot shows the accuracy of each real class (its recall). The cell in the bottom right of the plot shows the overall accuracy (Mr. Madhan S, Priyadharshuini P, Brindha C, Bairavi B. 2019).

Figure 4.7 shows the confusion matrix obtained from our experiment. Over all confusion matrix partitions our model obtained 89.0% accuracy (11.0% error), compared with the best classifier accuracy of 99.5% (0.5% error) reported by A.M. Abdel-Zaher and A.M. Eldeib (2016) for a deep belief network (DBN-NN).
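How the cells of such a confusion matrix lead to the per-class precision, the per-class recall and the overall accuracy can be sketched in Python. The label encoding and the ten example predictions are hypothetical and are not results from this thesis.

```python
def confusion_matrix(predicted, actual, num_classes):
    """Rows index the predicted class, columns the real class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, a in zip(predicted, actual):
        matrix[p][a] += 1
    return matrix

def summarize(matrix):
    """Per-class precision and recall plus overall accuracy.

    Assumes every class is predicted and present at least once.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    precision = [matrix[k][k] / sum(matrix[k]) for k in range(n)]
    recall = [matrix[k][k] / sum(matrix[r][k] for r in range(n)) for k in range(n)]
    accuracy = sum(matrix[k][k] for k in range(n)) / total
    return precision, recall, accuracy

# Hypothetical labels: 0 = benign, 1 = malignant, 2 = normal.
actual = [0, 0, 1, 1, 1, 2, 2, 2, 2, 2]
predicted = [0, 1, 1, 1, 2, 2, 2, 2, 2, 0]
matrix = confusion_matrix(predicted, actual, 3)
precision, recall, accuracy = summarize(matrix)
```

The precision values correspond to the far-right column of the plot, the recall values to the bottom row, and the accuracy to the bottom-right cell.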

Figure 4.8 NN Training performance


Performance plot: plotperform(TR) plots error vs. epoch for the training, validation, and test performances of the training record TR returned by the function train (Figure 4.8). The outputs, errors and performance are computed as follows:
outputs = net(inputs);
errors = gsubtract(targets, outputs);
performance = perform(net, targets, outputs)

Figure 4.9 NN Receiver operating Characteristics

Figure 4.9 above is the receiver operating characteristic (ROC) plot of the NN. It shows the true positive rate against the false positive rate for the training set, the validation set, the test set and all data together.
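The points of an ROC curve can be computed by sweeping a decision threshold over the classifier scores and recording the false positive rate and true positive rate at each step. The sketch below treats one class against the rest; the function name roc_points, the scores and the labels are hypothetical.

```python
def roc_points(scores, labels):
    """(false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical scores for one class (1 = belongs to the class, 0 = does not).
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
curve = roc_points(scores, labels)
```

For a three-class problem such as this one, the same computation is repeated once per class, which gives one curve per class in each panel of the plot.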
CHAPTER 5

CONCLUSIONS AND FUTURE WORK

Conclusions
Breast cancer is the most commonly diagnosed type of cancer in women. Although the
death rate is the second highest among women with cancer, early detection of the disease
greatly improves the chance of survival. Therefore, it is important to develop new and
improved methods for breast cancer screening.

This dissertation explored the potential benefits of a new proposed method for automated
detection of breast cancer using mammogram images.

The main contributions to the existing of knowledge are two: first, an overview of existing
image processing techniques currently used for CAD systems that can help diagnose breast
cancer; second, a new method for automated detection of breast cancer using
mammography images, image processing techniques and the machine learning algorithms.
The dissertation described in detail the new method proposed, its implementation in
Matlab and its evaluation on a dataset of 1461 breast mammogram images. The focus was
on exploring how the method performs in various conditions and not on providing an
overall accuracy result for the method.

The overarching goal of this thesis was to improve breast cancer screening by using neural
network to assist radiologists in the classification of breast lesions.

Additional, another aim of this thesis was to use data that was acquired from the Digital
Database for Screening Mammography (DDSM) (Michael Heath, Kevin Bowyer et el
2001), at the University of South Florida (K. Bowyer), and Sandia National Laboratories
(P. Kegelmeyer), where 1461 images where extracted from these cases. We used real data
this helped us to evaluate algorithms that has been used.

The main task of the thesis was to find features in the data that would distinguish normal
samples from those containing tumours. Wavelet techniques were used to extract features,
and filters were used to reduce noise and fuzziness; both are well defined in Matlab.
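
As an illustration of wavelet-based feature extraction, the sketch below applies one level of the 2-D Haar transform to a small image patch and uses the energy of each sub-band as a feature. It is written in plain Python, not Matlab. The choice of the Haar wavelet, the sub-band energy features, and the function names are assumptions made for this example; this section does not state which wavelet family or features the thesis used.

```python
def haar_dwt2(image):
    """One level of the orthonormal 2-D Haar transform.

    image: list of rows (lists of numbers) with even height and width.
    Returns four sub-bands, each half the input size in both directions:
    approximation, row-difference detail, column-difference detail,
    and diagonal detail.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")

    approx, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block of pixels handled in this step.
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 2.0)  # scaled local sum (low-pass)
            r_row.append((p + q - s - t) / 2.0)  # top row minus bottom row
            c_row.append((p - q + s - t) / 2.0)  # left column minus right column
            d_row.append((p - q - s + t) / 2.0)  # diagonal difference
        approx.append(a_row)
        row_diff.append(r_row)
        col_diff.append(c_row)
        diag.append(d_row)
    return approx, row_diff, col_diff, diag


def subband_energies(image):
    """Feature vector: sum of squared coefficients in each Haar sub-band."""
    return [sum(v * v for row in band for v in row) for band in haar_dwt2(image)]


# Hypothetical 2x2 patch; the four energies add up to the patch's own energy.
example_features = subband_energies([[1, 2], [3, 4]])
```

A perfectly flat patch puts all of its energy in the approximation band, whereas edges and fine texture move energy into the detail bands, which is what makes such features usable as classifier inputs.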
Future Work

Considering the initial, exploratory nature of the work done for this dissertation, the results
are also informative with respect to potential directions for future work that are likely to
yield valuable results. For instance, a first future step would be to evaluate the method
more thoroughly in another programming environment, because Matlab did not allow us to
use a large number of features with the ANN.

The tests focused solely on the accuracy of tumour detection. However, additional tests on
other suitably annotated data sets could reveal the accuracy of the method in detecting each
type of tissue. In turn, this could be very helpful for practitioners and could even further
improve the diagnostic accuracy of the method, since it is known that two types of breast
tissue (the denser ones) can more easily hide signs of cancer, so that these signs are often
missed in scans and not visible until later. Thus, reliable information on the distribution of
such tissue, and perhaps even a technique to investigate such tissue more thoroughly,
could offer additional useful diagnostic help.

Another direction for future work concerns the type of cancer. Our focus was on classifying
each image as either cancer or normal; with samples in which the cancer is labelled as
malignant or benign, the method could be extended to distinguish between these two
types.
REFERENCES
“A Versatile Edge Preserving Image Enhancement Approach For Medical Images Using
Guided Filter.” 2016 IEEE International Conference on Systems, Man, and Cybernetics
(SMC 2016), October 9-12, 2016, Budapest, Hungary.

A Dualistic Sub-Image Histogram Equalization Based Enhancement and
Segmentation Techniques for Medical Images, K. RajMohan, Asst. Professor,
Dr. G. Thirugnanam, Asst. Professor

ACR, 2013. ACR BI-RADS Atlas: Breast Imaging Reporting and Data System. American
College of Radiology, 2013.

Acha, B., Rangayyan, R. M., Desautels, J. E. L. (2006). "Detection of microcalcifications in
mammograms". SPIE, Bellingham, Recent Advances in Breast Imaging,
Mammography, and Computer Aided Diagnosis of Breast Cancer.

Akay, M. (2006). Wiley Encyclopedia of Biomedical Engineering (1st ed.). Wiley
Interscience.

Al-Shamlan, Hala, and Ali El-Zaart. “Feature extraction values for breast cancer
mammography images.” In Bioinformatics and Biomedical Technology (ICBBT),
2010 International Conference on, pp. 335-340. IEEE, 2010.

Altrichter M., Ludanyi, Z., Horvath, G., “Joint analysis of multiple mammographic
views in CAD systems for breast cancer detection,” In: Proc. of Image Analysis.
14th Scandinavian Conference, 2005.

Andreea, Gheonea Ioana, Raluca Pegza, Luana Lascu, Simona Bondari, Zoia Stoica,
and A. Bondari. “The role of imaging techniques in
diagnosis of breast cancer.” J. Curr. Health Sci 37, no. 2 (2011): 241-248.

Arnau, O. (2007). Automatic mass segmentation in mammographic images.
Universitat de Girona, Department of Electronics, Computer Science and
Automatic Control, Girona.

Ashmithakhaleel, K. N. (2014). Wavelet based automatic lesion detection using
improved OTSU method. International Journal of Computer Science & Network
Solutions, 119-127.

Astley, S. (2003). Computer-aided detection for screening mammography. International
Congress Series, 1256, pp. 927-932.

Autier, P., Boniol, M., LaVecchia, C., Vatten, L., Gavin, A., Héry, C., et al. (2010).
Disparities in breast cancer mortality trends between 30 European countries:
retrospective trend analysis of WHO mortality database. BMJ 2010, 341:c3630.

Artificial Neural Networks. Anil K. Jain (Michigan State University), Jianchang Mao and
K. M. Mohiuddin (IBM Almaden Research Center). 1996.
https://csc.lsu.edu/~jianhua/nn.pdf

Baert, A., Reiser, M., Hricak, H., & Kanuth, M. (2010). Digital Mammography. Springer.

Baker, J., Rosen, E., Lo, J., Gimenez, E., Walsh, R., & Soo, M. (2003). Computer-
Aided Detection (CAD) in Screening Mammography: Sensitivity of
Commercial CAD Systems for Detecting Architectural Distortion. AJR Am J
Roentgenol., 181, No 4.

Barrett, & A. Gmitro, Information Processing in Medical Imaging (Vol. 687, pp. 472 -
486).

Bick U., F. Diekmann, "Digital Mammography Book," Springer-Verlag Berlin
Heidelberg, 2010

Brandt, Sami S., Gopal Karemore, Nico Karssemeijer, and Mads Nielsen. “An
anatomically oriented breast coordinate system for mammogram analysis.” IEEE
transactions on medical imaging 30, no. 10 (2011): 1841-1851.

Bruce, L. M., & Adhami, R. R. (1999, Dec.). Classifying Mammographic Mass
Shapes Using the Wavelet Transform Modulus-Maxima Method. IEEE Trans.
Medical Imaging, 18, No.12, pp. 1170-1177.

C. Manoharan and N. S. R. Lakshmi, “Classification of micro calcifications in
mammogram using combined feature set with svm,” International Journal of
Computer Applications, vol. 11, no. 10, pp. 30–34, 2010.

Calas, Maria Julia Gregorio, Bianca Gutfilen, and Wagner Coelho de Albuquerque Pereira.
“CAD and mammography: why use this tool?.” Radiologia Brasileira 45, no. 1
(2012): 46-52.

Cao, Ying, Xin Hao, Xiaoen Zhu, and Shunren Xia. “An adaptive region growing
algorithm for breast masses in mammograms.” Frontiers of Electrical and Electronic
Engineering in China 5, no. 2 (2010): 128-136.

Chandrika Saxena, Prof. Deepak Kourav, “Noises and Image Denoising Techniques: A
Brief Survey,” Versha Rani et al, Journal of Global Research in Computer Science, 4
(4), April 2013, 166-171

Cheng, H. D., X. J. Shi, Rui Min, L. M. Hu, X. P. Cai, and H. N. Du. “Approaches for
automated detection and classification of masses in mammograms.” Pattern
recognition 39, no. 4 (2006): 646-668.

Cheng, H., Cai, X., Chen, X., Hu, L., & Lou, X. (2003). Computer-aided detection
and classification of microcalcifications in mammograms: A survey. Pattern
Recognition, 36, pp. 2967 – 2991.

Chun-Ming Tsai, “Adaptive Local Power Law Transformation for Color Image
Enhancement”, International Journal on Applied Mathematics and Information
Science, ISSN: 2019-2026

Cireşan, Dan C., Alessandro Giusti, Luca M. Gambardella, and Jürgen Schmidhuber.
“Mitosis detection in breast cancer histology images with deep neural networks.” In
International Conference on Medical Image Computing and Computer-Assisted
Intervention, pp. 411-418.

Cristobal G., Navarro R., Space and frequency variant image enhancement based on Gabor

D. Donoho, I. Johnstone, G. Kerkyacharian, D. Picard, “Wavelet shrinkage:
asymptopia?”, Journal of the Royal Statistical Society B, vol. 57, pp. 301-369,
1995.

D. L. Donoho, “De-noising by soft-thresholding,” IEEE Transactions on Information Theory,
vol. 41, no. 3, pp. 613-627, 1995

Darshana Mistry, Asim Banerjee, “Discrete Wavelet Transform using Matlab,”
International Journal of Computer Engineering and Technology (IJCET)
Volume 4, Issue 2, March – April (2013), pp. 252-259

Dengler, J., Behrens, S., & Desaga, J. (1993, Dec). Segmentation of
Microcalcifications in Mammograms. IEEE Trans on Medical
Imaging, 12, No 4.

Dinsha, D., and N. Manikandaprabu. “Breast tumor segmentation and classification using
SVM and Bayesian from thermogram images.” Unique Journal of Engineering and
Advanced Sciences 2, no. 2 (2014): 147-151.

Doi, K. (2007). Computer-aided diagnosis in medical imaging: Historical review,
current status and future potential. Comput Med Imaging Graph., 31, No. 4-5, pp.
198-211.

E. D. Pisano, S. Zong, B. M. Hemminger, M. DeLuca, R.E. Johnston, K. Muller, M. P.
Braeuning and S. M. Pizer, “Contrast Limited Adaptive Histogram Equalization
Image Processing to Improve the Detection of Simulated Spiculations in Dense
Mammograms,” Journal of Digital Imaging, Vol. 11, No. 4, 1998, pp. 193-200.

E. Exhibit, V. M. Campos, J. M. S. Martinez, C. B. Carron, J. A. Guirola, and J. A. F.
Gomez, “Tracks to face a breast imaging and succeed,” pp. 1–44, 2013.

Etehad Tavakol, Mahnaz, Vinod Chandran, E. Y. K. Ng, and Raheleh Kafieh. “Breast
cancer detection from thermal images using bispectral invariant features.”
International Journal of Thermal Sciences 69 (2013): 21-36.

F. Laine, S. Schuler, J. Fan, and W. Huda, “Mammographic feature enhancement by
multiscale analysis,” IEEE transactions on medical imaging, vol. 13, pp. 725–40,
Jan. 1994.

Freer, T., & Ulissey, M. (2001). Screening mammography with computer-aided
detection: Prospective study of 12860 patients in a Community Breast Center.
Radiology, 220, pp. 781-786.

Gaur, Shantanu, Vandana Dialani, Priscilla J. Slanetz, and Ronald L. Eisenberg.
“Architectural distortion of the breast.” American Journal of Roentgenology 201, no.
5 (2013): W662-W670.

George, Yasmeen M., Bassant M. Bagoury, Hala H. Zayed, and Mohamed I. Roushdy.
“Automated cell nuclei segmentation for breast fine needle aspiration cytology.”
Signal Processing 93, no. 10 (2013): 2804-2816.

Giger, M. L. (2004, Oct). Computerized Analysis of Images in the Detection and
Diagnosis of Breast Cancer. Semin Ultrasound CT MR, 25, No 5.

Gonzalez R.C., Woods R.E., Digital Image Processing, Upper Saddle River, NJ: Prentice
Hall, 2008.

Gray, H. (2000). Anatomy of the Human Body. New York: Bartleby.

Gubern-Mérida, Albert, Michiel Kallenberg, Ritse M. Mann, Robert Marti, and Nico
Karssemeijer. “Breast segmentation and density estimation in breast MRI: a fully
automatic framework.” IEEE journal of biomedical and health informatics 19, no. 1
(2015): 349-357.

Gubern-Mérida, Albert, Robert Martí, Jaime Melendez, Jakob L. Hauth, Ritse M. Mann,
Nico Karssemeijer, and Bram Platel. “Automated localization of breast cancer in
DCE-MRI.” Medical image analysis 20, no. 1 (2015): 265-274.

Gunderman, R. B. (2006). Essential radiology: clinical presentation, pathophysiology,
imaging. Thieme.

Guyton, A. C., & Hall, J. E. (2000). Textbook of Medical Physiology (10th ed.). (W. S.
Company, Ed.)

Haus AG, Yaffe MJ., “Screen-film and Digital mammography: Image Quality and
Radiation Dose Considerations,” Radiol Clin North America, 2000,38:871– 898.

Heywang-Köbrunner, Sylvia H., Astrid Hacker, and Stefan Sedlacek. “Advantages and
disadvantages of mammography screening.” Breast care 6, no. 3 (2011): 199-207.

http://www.breastcancer.org/symptoms/understand_bc/statistic

http://www.mathworks.com/matlabcentral/fileexchange/19084

Huang Q., Gao W., Cai W., Thresholding technique with adaptive window selection for
uneven lighting image, Pattern Recognition Letters, Elsevier, 2004, 26, p. 801-808.

I. I. Andreadis, G. M. Spyrou, and K. S. Nikita, “A comparative study of image features
for classification of breast micro-calcifications,” Measurement Science and
Technology, vol. 22, p. 114005, Nov. 2011.

iCAD. (2009). iCad. Retrieved Jan. 16, 2011, from http://www.icadmed.com/

Irshad, Humayun, Antoine Veillard, Ludovic Roux, and Daniel Racoceanu. “Methods for
nuclei detection, segmentation, and classification in digital histopathology: a
review—current status and future potential.” IEEE reviews in biomedical
engineering 7 (2014): 97-114.

J. Dheeba and S. Selvi, “Classification of malignant and benign micro-calcification using
svm classifier,” in Emerging Trends in Electrical and Computer
Technology (ICETECT), 2011 International Conference on, pp. 686–690, IEEE, 2011.

J. K. Kim and H. W. Park, “Statistical textural features for detection of microcalcifications in
digitized mammograms,” IEEE transactions on medical imaging, vol. 18, pp. 231–8,
Mar. 1999.

K. Bowyer, D. Kopans, W. Kegelmeyer, R. Moore, M. Sallam, K. Chang, and K. Woods,
“The digital database for screening mammography,” in Third International
Workshop on Digital Mammography, vol. 58, 1996.

Karssemeijer, N., & Brake, G. (1996, Oct). Detection of Stellate Distortions in
Mammograms. IEEE Trans on Medical Imaging, 15, No 5.

Kim, Hyoung-Joon, et al. "Contrast enhancement using adaptively modified histogram
equalization." Advances in Image and Video Technology. Springer Berlin
Heidelberg, 2006. 1150-1158.

Kiyan, T., Yildirim, T. (2004). "Breast cancer diagnosis using statistical neural networks".
Journal of Electrical & Electronics Engineering 2(4).

Kleinbaum, G., & Klein (2010). "Logistic Regression". USA.

Kim, J., Park, J., Song, K., & Park, H. (1997, Oct). Adaptive mammographic image
enhancement using first derivative and local statistics. IEEE Trans on Medical
Imaging, 16, No 5.

Lim, J. S. and A. V. Oppenheim, “Enhancement and bandwidth compression of noisy
speech,” Proc. of the IEEE, Vol. 67, No. 12, 1586–1604, Dec. 1979.

M. Abdullah-Al-Wadud, M. Kabir, M. Dewan and O. Chae, “A dynamic histogram
equalization for image contrast enhancement,” IEEE Trans. Consumer
Electronics, vol. 53, no. 2, pp. 593-600, May 2007.

Mahesh Mahadevappa, "Digital Mammography: An Overview," RadioGraphics, RSNA
2004, Vol 24, No. 6, 1747-1760

Michael Heath, Kevin Bowyer et al., 2001.
http://www.eng.usf.edu/cvprg/Mammography/Database.html

Otsu N. A threshold selection method from gray-level histograms[J]. IEEE
Transactions on Systems, Man, and Cybernetics, 1979, 9(1): 62-66.

P. P. S. J, P. K. Rajeswari, and I. M. April, “Membership function modification for
image enhancement using fuzzy logic,” vol. 2, no. 2, pp. 114–118, 2013

Ponraj, D. N., Jenifer, M. E., Poongodi, P. and Manoharan, J. S. (2011). A Survey on
the Preprocessing Techniques of Mammogram for the Detection of Breast
Cancer. Journal of Emerging Trends in Computing and Information Sciences
2(12),

Pradeep, N., Girisha, H., Sreepathi, B. and Karibasappa, K. (2012). Feature extraction
of mammograms. International Journal of Bioinformatics

R. C. Gonzalez and R. E. Woods, “Digital Image Processing,” 2nd edition, Pearson
Education, 2002.

Rafael C. Gonzalez, Richard E. Woods, “Digital Image Processing,” third edition,
Pearson Publication. pp. 466-474, 2007

Reversible data hiding in medical image for contrast enhancement of ROI, Ying-Hui
XIA, Hao-Tian WU

Rohit Verma and Jahid Ali, “A comparative study of various types of image noise and
efficient noise removal techniques”, International journal of advanced research
in computer science and software engineering, volume 3, issue 10, October 2013

Saruchi, Madan Lal, “Comparative Study Different Image Enhancement Techniques”,
International Journal of Computers & Technology ISSN 2277-3061

Sivaramakrishna, R., Obuchowski, N., Chilcote, W., Cardenosa, G. and Powell, K.
(2000), Comparing the performance of mammographic enhancement
algorithms, American Journal of Roentgenology. 175:45-51

Thamizharasi, A. M. E., 2010, Performance Analysis of Face Recognition by
Combining Multiscale Techniques and Homomorphic Filter using Fuzzy K
Nearest Neighbour Classifier. In: IEEE International Conference on
Communication Control and Computing Technologies (ICCCCT), pp. 643-650.

Toran Lal Sahu, Mrs. Deepty Dubey, “A Survey on Image Noises and Denoise
Techniques,” International Journal of Advanced Research in Computer
Engineering & Technology (IJARCET) Volume 1, Issue 9, November 2012

USF digital mammography home page, http://marathon.csee.usf.edu/Mammography/Database.html

Wu H T, Huang J, Shi Y Q. A reversible data hiding method with contrast
enhancement for medical images[J]. Journal of Visual Communication and
Image Representation, 2015, 31: 146-153.

www.Worldwidebreastcancer.com/wpcontent/uploads/2011/08/breastcancerstatsworldwide.jp

JuCheng Yang, DongSun Park. 2004 IEEE International Conference on Multimedia and
Expo (ICME)

Emma Regentova, Lei Zhang, Jun Zheng, and Gopalkrishna Veni. Proceedings of the
28th IEEE EMBS Annual International Conference, New York City, USA, Aug
30-Sept 3, 2006

https://medium.com/analytics-vidhya/cnns-architectures-lenet-alexnet-vgg-googlenet-resnet-and-more-666091488df5
