GRD Journals- Global Research and Development Journal for Engineering | Volume 6 | Issue 1 | December 2020

ISSN- 2455-5703

Machine Learning Algorithms to Improve the Performance Metrics of Breast Cancer Diagnosis

Dr. V. S. R. Kumari
Principal (Professor)
Department of Electronics and Communication Engineering
Sri Mittapalli Institute of Technology for Women /JNTU Kakinada

Suresh Veesa
Associate Professor
Department of Electronics and Communication Engineering
Sri Mittapalli Institute of Technology for Women /JNTU Kakinada

Srinivasa Rao Chevala
Assistant Professor
Department of Electronics and Communication Engineering
Sri Mittapalli Institute of Technology for Women /JNTU Kakinada

Abstract
Cancer, in all its types, is a common problem for people throughout the world. In particular, breast cancer is the most frequent
type of cancer among women. Therefore, any development in the diagnosis and prediction of cancer is of capital importance for a
healthy life. Cancer is a term for diseases in which abnormal cells divide without control and can invade nearby tissues. Cancer
cells can also spread to other parts of the body through the blood and lymph systems, so detecting cancer in its early stages is
important for diagnosis. There are several main types of cancer. Carcinoma is a cancer that begins in the skin or in tissues that
line or cover internal organs. Breast cancer starts when cells in the breast begin to grow out of control. These cells usually form a
tumor that can often be seen on an x-ray or felt as a lump. Machine learning techniques can make a huge contribution to the
early diagnosis and prediction of cancer. This project focuses mainly on breast cancer. Features are computed
from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in
the image. The classification performance of the machine learning techniques has been compared using the values of accuracy,
precision, recall and ROC area. The best performance, with the highest accuracy, was obtained by the Support Vector Machine
technique.
Keywords- Machine Learning, Breast Cancer, Classification, Early Diagnosis Necessary

I. INTRODUCTION
Cancer is the second leading cause of human death worldwide and accounted for roughly 9.6 million deaths in 2018. Globally,
about 1 in 6 deaths is caused by cancer. Almost 70 percent of deaths from cancer occur in low- and middle-income countries.
The most common cancer types among women are breast, lung and colorectal cancer, which together represent half of all
cancer cases. As Debbie Wasserman Schultz, US House of Representatives, breast cancer survivor, has put it, people say that
everyone knows someone who has breast cancer, but what she had seen is that everyone has someone close who has breast
cancer. About 1.7 million breast cancer cases were diagnosed in 2012. In 2019, an estimated 271,270 new
cases of invasive breast cancer were expected to be diagnosed in women and 2,670 cases in men. As can be seen, about fifty
percent of all new cases are prone to result in death. By early detection we can reduce this percentage of deaths. Fig. 1 shows
that, among all cancers, breast cancer has the most cases. To discourage the growth of breast cancer, it is important to focus on
early detection. Early diagnosis and screening are the two main methods of early detection of breast cancer. Over the last few
decades, ML techniques have been applied in healthcare systems, especially for breast cancer (BC) diagnosis and prognosis.
Traditionally, the diagnostic accuracy for a patient depends on a physician’s experience; however, this expertise is built up over
many years of observations of different patients’ symptoms and confirmed diagnoses. Even then the accuracy cannot be
guaranteed. With the advent of computing technologies, it is now relatively easy to acquire and store a lot of data.

All rights reserved by www.grdjournals.com 8


Machine Learning Algorithms to Improve the Performance Metrics of Breast Cancer Diagnosis
(GRDJE/ Volume 6 / Issue 1 / 003)

Fig. 1: Statistics of cancer

Without the help of computers, it is impossible for health specialists to analyze these complex datasets, particularly when
undertaking complex searches of the data. The intelligent healthcare system is therefore a valuable and important domain. It
can assist physicians to diagnose patients with greater accuracy or provide more meaningful benchmarks, and it can further
help people plan for their future physical condition. In this context, ML techniques can take over some complex manual work
from physicians, for instance text and voice analysis, which has been applied to identify and code patient emotions in response
to healthcare professionals. Recently, ML techniques have been playing an important role in the diagnosis and forecasting of
breast cancer by applying classification techniques to identify people with breast cancer, differentiate benign from malignant
tumours and predict prognosis. Bektas and Babur studied the diagnosis of
breast cancer using machine learning techniques. Two Kent Ridge Microarray datasets were used, and the support vector
machine, k-star, random forest and voted perceptron algorithms were applied. The random forest algorithm showed higher
performance with the applied feature selection method [10]. Chen et al. applied the Support Vector Machine classification
algorithm to the Wisconsin Diagnostic Breast Cancer dataset. In that study, the data were split into training and testing sets at
50-50%, 70-30% and 80-20%, and accuracy values were calculated for each training/testing proportion [8]. In this paper, SVM
and ANN, two of the most popular machine learning techniques, are applied to the Wisconsin Breast Cancer (Original) dataset,
and the results of the applied machine learning (ML) techniques are compared according to performance metrics. Accurate classification can
further assist clinicians to prescribe the most appropriate treatment regime. Classification is a kind of complex optimization
problem, and researchers have applied many ML techniques to solve it. In the following sections,
a comprehensive explanation of different classification methods applied to breast cancer will be given. We focus on artificial
neural networks (ANNs), support vector machines (SVMs), decision trees (DTs) and k-nearest neighbors (k-NNs), as they
are the main methods used in breast cancer diagnosis and prognosis. Scientists strive to find the best algorithm to achieve the
most accurate classification result; however, data of variable quality will also influence the classification result.
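
The training/testing proportions mentioned above (50-50%, 70-30% and 80-20%) can be illustrated with a minimal sketch. The helper name train_test_split, the fixed seed and the list of 100 sample indices are assumptions made only for illustration; they are not taken from the cited studies or from this paper's experiments.

```python
import random


def train_test_split(samples, train_fraction, seed=0):
    """Shuffle a copy of `samples` and split it into (train, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical stand-in for a dataset: 100 sample indices, not real patient data.
samples = list(range(100))

for fraction in (0.5, 0.7, 0.8):
    train, test = train_test_split(samples, fraction)
    print(f"train fraction {fraction}: {len(train)} training / {len(test)} testing samples")
```

Shuffling before the cut avoids a split that simply follows the order in which the records were stored, which could otherwise bias the accuracy measured on the test part.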

II. METHODOLOGY

A. Support Vector Machines (SVM)


Support Vector Machines (SVMs) were first described by Vladimir Vapnik, and their good performance has been
noted in many pattern recognition problems. SVMs can show better classification performance than
many other classification techniques. SVM is one of the most popular machine learning classification techniques used for
the prognosis and diagnosis of cancer. In an SVM, the classes are separated by a hyperplane determined by support
vectors, which are the critical samples from each class. The hyperplane is a separator that serves as the decision boundary
between the two sample clusters. An SVM can be used, for example, to classify tumors as benign or malignant based on a
patient’s age and tumor size.
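
The role of the hyperplane as a decision boundary can be sketched as follows. This is only the prediction step of a linear classifier: a real SVM learns the weights and bias from training data by maximizing the margin around the support vectors, which is not shown here. The weights, the bias and the two feature vectors (age in years, tumor size in mm) are hand-picked, hypothetical values with no clinical meaning.

```python
def decision_value(weights, bias, features):
    """Signed score w.x + b; its sign tells on which side of the hyperplane x lies."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def classify(weights, bias, features):
    """Label a sample by the side of the hyperplane it falls on."""
    return "malignant" if decision_value(weights, bias, features) >= 0 else "benign"


# Hypothetical parameters for two features: (age in years, tumor size in mm).
# In a trained SVM these would be learned from data, not chosen by hand.
weights = [0.02, 0.10]
bias = -3.0

print(classify(weights, bias, [45, 10]))  # score is about -1.1, negative side
print(classify(weights, bias, [60, 30]))  # score is about +1.2, positive side
```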

B. Artificial Neural Networks


An artificial neural network (ANN) is a computational model based on the structure and functions of biological neural networks.
Information that flows through the network affects the structure of the ANN because a neural network changes, or learns in a
sense, based on that input and output. ANNs are considered nonlinear statistical data modeling tools in which complex
relationships between inputs and outputs are modeled or patterns are found. An ANN is also known simply as a neural network.
Activation functions are essential for an artificial neural network to learn complicated, non-linear functional mappings between
the inputs and the response variable. They introduce non-linear properties to the network. Their main purpose is to convert the
input signal of a node in an ANN to an output signal, which is then used as an input to the next layer in the stack. Specifically,
in an ANN we take the sum of products of the inputs (X) and their corresponding weights (W) and apply an activation function
f(x) to it to get the output of that layer, which is fed as an input to the next layer.
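
The layer computation described above can be written as a minimal sketch. The sigmoid activation, the bias terms and all numeric values are assumptions chosen for illustration; the text does not state which activation function or network size was used in the experiments.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases, activation=sigmoid):
    """One fully connected layer: each neuron outputs f(sum_i w_i * x_i + b)."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


# Hypothetical input vector and parameters (not trained, not from the paper).
x = [0.5, -1.0, 2.0]
hidden_w = [[0.1, 0.2, 0.3], [-0.4, 0.5, 0.6]]  # 2 hidden neurons, 3 inputs each
hidden_b = [0.0, 0.1]
output_w = [[1.0, -1.0]]                        # 1 output neuron, 2 inputs
output_b = [0.0]

hidden = layer_forward(x, hidden_w, hidden_b)        # output of the hidden layer
output = layer_forward(hidden, output_w, output_b)   # hidden output feeds the next layer
print(hidden, output)
```

Without the activation function, stacking layers would collapse into a single linear mapping, which is why the non-linearity matters for learning complex relationships.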

Fig. 2: Artificial neural network data flow

III. RESULTS AND DISCUSSION


In this paper, we have applied SVM and ANN techniques to the classification of breast cancer to find which
machine learning method performs better.

Fig. 3: Confusion Matrix for SVM model

Fig. 4: ANN confusion matrix


The dataset is divided into training and test data. Evaluating on the test data with a confusion matrix, I obtained an accuracy of
98%. Using ANN and SVM, the number of false negatives has been reduced.
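
How accuracy, precision, recall and the number of false negatives follow from a confusion matrix can be sketched as below. The label lists are hypothetical (1 = malignant, 0 = benign) and do not reproduce the confusion matrices in Fig. 3 and Fig. 4; the function names are likewise illustrative.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1  # a malignant case predicted as benign
            else:
                tn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Accuracy, precision and recall computed from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical labels, not the paper's data: 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("false negatives:", fn)
print(metrics(tp, fp, tn, fn))
```

Recall is the metric most directly tied to false negatives: every malignant case that the model misses lowers it, even when the overall accuracy stays high.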

IV. CONCLUSION
Breast cancer is the most frequent type of cancer among women. Therefore, any development in the diagnosis and prediction
of cancer is of capital importance for a healthy life. In this paper, the cancer dataset, which was taken from the UCI website,
does not contain any missing values. This type of prediction is a classification problem, but classification algorithms such as
Naïve Bayes and logistic regression produce many false negatives, so I used a support vector machine and an artificial neural
network to reduce the false negatives. The accuracy I obtained is 98%.

REFERENCES
[1] E. A. Bayrak, P. Kırcı and T. Ensari, "Comparison of Machine Learning Methods for Breast Cancer Diagnosis," 2019 Scientific Meeting on Electrical-
Electronics & Biomedical Engineering and Computer Science (EBBT), Istanbul, Turkey, 2019, pp. 1-3, doi: 10.1109/EBBT.2019.8741990.
[2] Yue, Wenbin & Wang, Zidong & Chen, Hongwei & Payne, Annette & Liu, Xiaohui. (2018). Machine Learning with Applications in Breast Cancer
Diagnosis and Prognosis. Designs. 2. 13. 10.3390/designs2020013
[3] Cancer, https://www.who.int/en/news-room/fact-sheets/detail/cancer. Last Access: 25.01.2019.
[4] Kumar V., Mishra B.K., Mazzara M., Thanh D.N.H., Verma A. (2020) Prediction of Malignant and Benign Breast Cancer: A Data Mining Approach in
Healthcare Applications. In: Borah S., Emilia Balas V., Polkowski Z.(eds) Advances in Data Science and Management. Lecture Notes on Data Engineering
and Communications Technologies, vol 37. Springer, Singapore. https://doi.org/10.1007/978-981-15-0978-0_43
[5] Siegel, R. L., Miller, K. D., & Jemal, A. (2018). Cancer statistics, CA: A Cancer Journal for Clinicians, 68 (1), pp. 7-30.
[6] Maity, N. G., & Das, S. (2017). Machine learning for improved diagnosis and prognosis in healthcare. In 2017 IEEE Aerospace Conference, pp. 1-9.
[7] Huang, M. W., Chen, C. W., Lin, W. C., Ke, S. W., & Tsai, C. F. (2017). SVM and SVM ensembles in breast cancer prediction. PLoS ONE, 12 (1).
[8] Bazazeh, D., & Shubair, R. (2016). Comparative study of machine learning algorithms for breast cancer detection and diagnosis. In 2016 5th International
Conference on Electronic Devices, Systems and Applications, pp. 1-4.
[9] Ahmad, L. G., Eshlaghy, A. T., Poorebrahimi, A., Ebrahimi, M., & Razavi, A. R. (2013). Using three machine learning techniques for predicting breast
cancer recurrence. J Health Med Inform, 4 (124).
[10] Bektas, B., & Babur, S. (2016). Machine learning based performance development for diagnosis of breast cancer, Medical Technologies National Congress,
pp. 1-4.
[11] Umadevi, S., & Marseline, K. J. (2017). A survey on data mining classification algorithms. In 2017 International Conference on Signal Processing and
Communication, pp. 264-268.
[12] https://www.cancer.gov/publications/dictionaries/cancer-terms/def/cancer
[13] https://siteman.wustl.edu/glossary/cdr0000045333/
[14] http://www.omegahospitals.com/Breast-Onocology-Omega-Cancer-Hospital.pdf
[15] https://archive.ics.uci.edu/ml/datasets/Breast%2BCancer%2BWisconsin%2B(Diagnostic)
