[Equations (11) and (12), expressed as ratios of cardinalities (card) of positive regions, were garbled in extraction and are omitted here.]
4.1.3. Rough Set Feature Selection
Rough set feature selection [20] is valuable because the selected feature subset can generate more general decision rules and better classification quality for new samples. Since finding a minimal reduct is NP-hard, heuristic or approximation algorithms have to be considered. K.Y. Hu [21] computes the significance of an attribute using heuristic ideas from discernibility matrices and proposes a heuristic reduction algorithm (DISMAR). X. Hu [22] gives a rough set reduction algorithm using a positive region-based attribute significance measure as a heuristic (POSAR). G.Y. Wang [23] develops a conditional information entropy-based reduction algorithm (CEAR).
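To make the positive-region heuristic concrete, here is a minimal sketch in Python (our own toy code with invented names, not an implementation of the cited algorithms) of the dependency degree and a greedy QuickReduct-style reduction that repeatedly adds the attribute giving the largest increase in dependency:

```python
def positive_region(rows, attrs, decision):
    """Objects whose equivalence class (w.r.t. attrs) has a single decision value."""
    groups = {}
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in attrs)
        groups.setdefault(key, []).append(i)
    pos = set()
    for members in groups.values():
        if len({rows[i][decision] for i in members}) == 1:  # class fully determined
            pos.update(members)
    return pos

def gamma(rows, attrs, decision):
    """Degree of dependency: |POS_attrs(decision)| / |U|."""
    return len(positive_region(rows, attrs, decision)) / len(rows)

def quick_reduct(rows, cond_attrs, decision):
    """Greedy reduction: grow the reduct by the attribute that most raises gamma."""
    reduct, best = [], 0.0
    full = gamma(rows, cond_attrs, decision)   # dependency of the full attribute set
    while best < full:
        gains = {a: gamma(rows, reduct + [a], decision)
                 for a in cond_attrs if a not in reduct}
        a = max(gains, key=gains.get)
        reduct.append(a)
        best = gains[a]
    return reduct

# Toy decision table: attribute "a" alone determines the decision "d".
rows = [
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 1, "d": "yes"},
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "no"},
]
reduct = quick_reduct(rows, ["a", "b"], "d")   # -> ["a"]
```

On this toy table, attribute "a" alone yields full dependency, so the greedy search stops after one step and returns it as the reduct.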
4.2. Particle Swarm Optimization (PSO)
Particle swarm optimization (PSO) is an evolutionary computation technique developed by Kennedy and Eberhart [24], [25]. The original idea was to graphically simulate the choreography of a bird flock. Y. Shi introduced the concept of inertia weight into the particle swarm optimizer to produce the standard PSO algorithm [26]. The concept of particle swarms has become very popular as an efficient search and optimization technique. Particle Swarm Optimization (PSO) [27], [30] does not require any gradient information about the function to be optimized, uses only primitive mathematical operators, and is conceptually very simple. Since its advent in 1995, PSO has attracted the attention of many researchers all over the world, resulting in a huge number of variants of the basic algorithm and many parameter automation strategies.
The advantages of the basic particle swarm optimization algorithm are analyzed in [28]:
- PSO is based on swarm intelligence. It can be applied to both scientific research and engineering use.
- PSO has no overlapping or mutation calculations. The search is carried out by the speed of the particles. Over the development of several generations, only the most optimistic particle transmits information to the other particles, and the speed of the search is very fast.
- The calculation in PSO is very simple. Compared with other evolutionary computations, it has the greatest optimization ability and can be completed easily.
Red(C) = { R ⊆ C | γ_R(D) = γ_C(D) and ∀B ⊂ R, γ_B(D) ≠ γ_C(D) }    (8)
International Journal of Emerging Trends & Technology in Computer Science (IJETTCS)
Web Site: www.ijettcs.org Email: editor@ijettcs.org, editorijettcs@gmail.com
Volume 1, Issue 3, September-October 2012, ISSN 2278-6856
- PSO adopts real-number coding, which is decided directly by the solution. The number of dimensions is equal to that of the solution.
PSO is initialized with a population of particles. Each particle is treated as a point in an S-dimensional space. The ith particle is represented as X_i = (x_i1, x_i2, ..., x_iS). The best previous position (pbest, the position giving the best fitness value) of any particle is P_i = (p_i1, p_i2, ..., p_iS). The index of the global best particle is represented by gbest. The velocity for particle i is V_i = (v_i1, v_i2, ..., v_iS). The particles are manipulated according to the following equations:

v_id = w * v_id + c1 * rand() * (p_id - x_id) + c2 * Rand() * (p_gd - x_id)    (13)

x_id = x_id + v_id    (14)
where w is the inertia weight; suitable selection of the inertia weight provides a balance between global and local exploration and thus requires fewer iterations on average to find the optimum. If a time-varying inertia weight is employed, better performance can be expected [29]. The acceleration constants c1 and c2 in equation (13) represent the weighting of the stochastic acceleration terms that pull each particle toward the pbest and gbest positions. Low values allow particles to roam far from target regions before being tugged back, while high values result in abrupt movement toward, or past, target regions. rand() and Rand() are two random functions in the range [0, 1]. Particles' velocities on each dimension are limited to a maximum velocity Vmax. If Vmax is too small, particles may not explore sufficiently beyond locally good regions; if Vmax is too high, particles might fly past good solutions.
The first part of equation (13) provides the flying particles with memory capability and the ability to explore new search-space areas. The second part is the cognition part, which represents the private thinking of the particle itself. The third part is the social part, which represents the collaboration among the particles. Equation (13) is used to update the particle's velocity; the particle then flies toward a new position according to equation (14). The performance of each particle is measured according to a pre-defined fitness function.
The process for implementing the PSO algorithm is as
follows [7]:
procedure PSO
  repeat
    for i = 1 to number of individuals do
      if G(x_i) > G(p_i) then          // G() evaluates goodness
        for d = 1 to dimensions do
          p_id = x_id                  // p_id is the best state found so far
        end for
      end if
      g = i                            // arbitrary
      for j = indexes of neighbors do
        if G(p_j) > G(p_g) then
          g = j                        // g is the index of the best performer in the neighborhood
        end if
      end for
      for d = 1 to number of dimensions do
        v_id(t) = f(x_id(t-1), v_id(t-1), p_id, p_gd)   // update velocity
        v_id ∈ (-Vmax, +Vmax)                           // clamp velocity
        x_id(t) = f(v_id(t), x_id(t-1))                 // update position
      end for
    end for
  until stopping criterion is met
end procedure
Figure 3: Standard Particle Swarm Optimization
(PSO)
Definitions and variables used in Figure 3:
- t is the current time step; t-1 is the previous time step.
- x_id(t) is the current state (position) at site d of individual i.
- v_id(t) is the current velocity at site d of individual i.
- Vmax is the upper/lower bound placed on v_id.
- p_id is individual i's best state (position) found so far at site d.
- p_gd is the neighborhood's best state found so far at site d.
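As a concrete illustration of equations (13) and (14), the following is a minimal PSO sketch in Python (our own toy code with generic parameter values, not the implementation used in the experiments), minimising a simple sphere function:

```python
import random

def pso_minimize(fitness, dim, n_particles=20, n_iters=20,
                 w=0.7, c1=1.5, c2=1.5, v_max=1.0, lo=-5.0, hi=5.0):
    # Initialise particle positions randomly in [lo, hi] and velocities to zero.
    x = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [xi[:] for xi in x]               # personal best positions (pbest)
    pfit = [fitness(xi) for xi in x]          # personal best fitness values
    g = min(range(n_particles), key=pfit.__getitem__)  # global best index (gbest)

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                # Equation (13): inertia + cognitive + social components.
                v[i][d] = (w * v[i][d]
                           + c1 * random.random() * (pbest[i][d] - x[i][d])
                           + c2 * random.random() * (pbest[g][d] - x[i][d]))
                # Clamp the velocity to [-v_max, +v_max].
                v[i][d] = max(-v_max, min(v_max, v[i][d]))
                # Equation (14): move the particle.
                x[i][d] += v[i][d]
            f = fitness(x[i])
            if f < pfit[i]:                   # new personal best
                pfit[i], pbest[i] = f, x[i][:]
                if f < pfit[g]:               # new global best
                    g = i
    return pbest[g], pfit[g]

# Example: minimise the sphere function f(x) = sum(x_d^2); the optimum is the origin.
random.seed(0)
best, best_fit = pso_minimize(lambda p: sum(xd * xd for xd in p), dim=3)
```

Each particle keeps its own best position while being attracted toward the swarm's best, exactly the cognition and social parts described above.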
4.3. Problem Description and Basic Experimentation Setup
The breast cancer dataset [31] from the UCI repository was obtained from the University of Wisconsin Hospitals, Madison, from Dr. William H. Wolberg. We perform our experimentation on the dataset summarized in Table 2.
Table 2: Data used in the experiments
  Name: Wisconsin Breast Cancer Diagnostic
  Instances: 699
  Attributes: 11
  Class distribution: Benign cases: 458 (65.5%); Malignant cases: 241 (34.5%)
  Validation: Training 80%; Testing: 140 cases
The data consists of nine conditional features and one decision feature; the first attribute, which holds the sample code number, is removed as shown later. The data was processed in the WEKA software; more information about it can be found in [32].
Steps to be implemented:
Step 1: Remove the sample code number from the data (no effect on the data) with the Remove filter.
Step 2: The dataset was discretized from numeric to nominal data using the NumericToNominal filter, an instance filter that converts a range of numeric attributes in the dataset into nominal attributes.
Step 3: Replace missing values for nominal and numeric attributes with the modes and means from the training data, using the ReplaceMissingValues filter.
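Steps 1-3 can be sketched in plain Python on a toy stand-in for the UCI file (the record values and column choices here are ours, for illustration only; the real pipeline uses WEKA's filters):

```python
from collections import Counter

# Toy stand-in for the Wisconsin file: each record is
# [sample_code_number, clump_thickness, bare_nuclei, class],
# and '?' marks a missing value, as in the UCI file.
records = [
    [1000025, 5, "1", 2],
    [1002945, 5, "?", 2],
    [1015425, 3, "1", 2],
    [1016277, 6, "8", 4],
]

# Step 1: remove the sample code number (column 0) -- no effect on the data.
data = [row[1:] for row in records]

# Step 2: treat every attribute as nominal by mapping values to string labels
# (a stand-in for WEKA's NumericToNominal filter).
data = [[str(v) for v in row] for row in data]

# Step 3: replace each '?' with the column mode, mirroring WEKA's
# ReplaceMissingValues filter for nominal attributes.
for c in range(len(data[0])):
    observed = [row[c] for row in data if row[c] != "?"]
    mode = Counter(observed).most_common(1)[0][0]
    for row in data:
        if row[c] == "?":
            row[c] = mode
```

After these three steps the ID column is gone, every attribute is nominal, and the missing bare_nuclei value has been replaced by the column mode.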
Step 4: To find the reducts, we applied the supervised attribute selection filter RSARSubsetEval (Rough Set Attribute Reduction), an implementation of the QuickReduct algorithm of rough set attribute reduction, with PSOSearch as the search method; it explores the attribute space using the Particle Swarm Optimization (PSO) algorithm described in [33], with the parameters shown in Figure 4 and Table 3.
Table 3: PSO Parameters
Figure 4 Preprocess Implementation
This stage was important because rough set filtering was used to eliminate the unimportant and redundant features (the first phase in Figure 1) and to reduce the number of iterations that PSO has to perform in finding an optimum feature subset.
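The search over feature subsets can be pictured as a binary PSO in which each particle is a 0/1 mask over the attributes. The sketch below is our own simplification (the sigmoid-threshold update and the toy fitness, which stands in for the rough-set dependency evaluator minus a subset-size penalty, are invented for illustration):

```python
import math
import random

def bpso_feature_select(fitness, n_feats, n_particles=5, n_iters=20,
                        w=0.34, c1=0.33, c2=0.33):
    """Binary PSO sketch: each particle is a 0/1 mask over the features."""
    random.seed(1)
    x = [[random.randint(0, 1) for _ in range(n_feats)] for _ in range(n_particles)]
    v = [[0.0] * n_feats for _ in range(n_particles)]
    pbest = [xi[:] for xi in x]               # personal best masks
    pfit = [fitness(xi) for xi in x]
    g = max(range(n_particles), key=pfit.__getitem__)   # global best index
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_feats):
                v[i][d] = (w * v[i][d]
                           + c1 * random.random() * (pbest[i][d] - x[i][d])
                           + c2 * random.random() * (pbest[g][d] - x[i][d]))
                # Sigmoid of the velocity gives the probability of selecting bit d.
                x[i][d] = 1 if random.random() < 1 / (1 + math.exp(-v[i][d])) else 0
            f = fitness(x[i])
            if f > pfit[i]:                   # maximising the subset quality
                pfit[i], pbest[i] = f, x[i][:]
                if f > pfit[g]:
                    g = i
    return pbest[g]

# Invented toy fitness: pretend features 0 and 2 carry all the dependency,
# and penalise larger subsets (a stand-in for the rough-set evaluator).
def toy_fitness(mask):
    dep = 1.0 if mask[0] and mask[2] else 0.4 * (mask[0] + mask[2])
    return dep - 0.05 * sum(mask)

best_mask = bpso_feature_select(toy_fitness, n_feats=5)
```

The fitness rewards dependency and penalises subset size, which is why the search favours small reducts, matching the "best results with minimum feature subset" observation below.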
Step 5: We used several classification techniques, as shown in Table 4, to classify the data. The number of decision rules and the classification accuracy are also shown.
From the results, we conclude that increasing the number of particles/individuals above 20 does not bring any relevant improvement in the algorithm's performance. Increasing or decreasing the number of iterations also has no influence on the algorithm's performance; its best experimental result is obtained at 20 iterations. The best result for the classification algorithms is obtained with the minimum feature subset, which supports our view that the best results are obtained with the minimum feature subset. Finally, the evaluation results show that Naive Bayes with a population size of 5 and 20 iterations obtains the best result among the methods.
Table 4: Comparison of classification results by using various classification techniques

Naive Bayes
- Correct: 135 (96.4286%); incorrect: 5 (3.5714%); TP rate (avg) 0.964; FP rate (avg) 0.038; precision (avg) 0.965; recall 0.964; population size 20; 10 attributes.
  Confusion matrix:
    a   b   <-- classified as
   87   3 | a = 2
    2  48 | b = 4
- Correct: 136 (97.1429%); incorrect: 4 (2.8571%); TP rate (avg) 0.971; FP rate (avg) 0.025; precision (avg) 0.972; recall 0.971; population size 10; 9 attributes.
  Confusion matrix:
    a   b   <-- classified as
   87   3 | a = 2
    1  49 | b = 4
- Correct: 136 (97.1429%); incorrect: 4 (2.8571%); TP rate (avg) 0.971; FP rate (avg) 0.025; precision (avg) 0.972; recall 0.971; population size 5; 8 attributes.
  Confusion matrix:
    a   b   <-- classified as
   87   3 | a = 2
    1  49 | b = 4

Decision Table
- Correct: 129 (92.1429%); incorrect: 11 (7.8571%); TP rate (avg) 0.921; FP rate (avg) 0.106; precision (avg) 0.921; recall 0.921; population size 20; 10 attributes.
- Correct: 129 (92.1429%); incorrect: 11 (7.8571%); TP rate (avg) 0.921; FP rate (avg) 0.106; precision (avg) 0.921; recall 0.921; population size 10; 9 attributes; 10 rules; 0.08 seconds.
- Correct: 129 (92.1429%); incorrect: 11 (7.8571%); TP rate (avg) 0.921; FP rate (avg) 0.106; precision (avg) 0.921; recall 0.921; population size 5; 8 attributes; 10 rules; 0.07 seconds.
  Confusion matrix for the above three cases:
    a   b   <-- classified as
   86   4 | a = 2
    7  43 | b = 4
Table 3: PSO Parameters
  Individual Weight: 0.34
  Inertia Weight: 0.33
  Social Weight: 0.33
  Iterations: 20
Prism
- Correct: 122 (87.1429%); incorrect: 9 (6.4286%); unclassified: 9 (6.4286%); TP rate (avg) 0.931; FP rate (avg) 0.104; precision (avg) 0.932; recall 0.931; population size 20; 10 attributes.
  Confusion matrix:
    a   b   <-- classified as
   82   2 | a = 2
    7  40 | b = 4
- Correct: 119 (85%); incorrect: 10 (7.1429%); unclassified: 11 (7.8571%); TP rate (avg) 0.922; FP rate (avg) 0.126; precision (avg) 0.927; recall 0.922; population size 10; 9 attributes.
  Confusion matrix:
    a   b   <-- classified as
   81   1 | a = 2
    9  38 | b = 4
- Correct: 120 (85.7143%); incorrect: 9 (6.4286%); unclassified: 11 (7.8571%); TP rate (avg) 0.93; FP rate (avg) 0.126; precision (avg) 0.937; recall 0.93; population size 5; 8 attributes.
  Confusion matrix:
    a   b   <-- classified as
   83   0 | a = 2
    9  37 | b = 4

J48
- Correct: 133 (95%); incorrect: 7 (5%); TP rate (avg) 0.95; FP rate (avg) 0.054; precision (avg) 0.95; recall 0.95; population size 20; 10 attributes; number of leaves: 29; tree size: 32.
- Correct: 133 (95%); incorrect: 7 (5%); TP rate (avg) 0.95; FP rate (avg) 0.054; precision (avg) 0.95; recall 0.95; population size 10; 9 attributes; number of leaves: 29; tree size: 32.
  Confusion matrix:
    a   b   <-- classified as
   86   4 | a = 2
    3  47 | b = 4
- Correct: 130 (92.8571%); incorrect: 10 (7.1429%); TP rate (avg) 0.929; FP rate (avg) 0.093; precision (avg) 0.928; recall 0.929; population size 5; 8 attributes; leaves: 37; tree size: 41.
  Confusion matrix:
    a   b   <-- classified as
   86   4 | a = 2
    6  44 | b = 4
JRip (with k = 2 optimizations, 3 folds)
- Correct: 130 (92.8571%); incorrect: 10 (7.1429%); TP rate (avg) 0.929; FP rate (avg) 0.084; precision (avg) 0.929; recall 0.929; population size 20; 10 attributes; 14 rules; 0.11 seconds.
  Confusion matrix:
    a   b   <-- classified as
   85   5 | a = 2
    5  45 | b = 4
- Correct: 135 (96.4286%); incorrect: 5 (3.5714%); TP rate (avg) 0.964; FP rate (avg) 0.055; precision (avg) 0.965; recall 0.964; population size 10; 9 attributes; 13 rules; 0.08 seconds.
  Confusion matrix:
    a   b   <-- classified as
   89   1 | a = 2
    4  46 | b = 4
- Correct: 133 (95%); incorrect: 7 (5%); TP rate (avg) 0.95; FP rate (avg) 0.072; precision (avg) 0.95; recall 0.95; population size 5; 8 attributes; 15 rules.
  Confusion matrix:
    a   b   <-- classified as
   88   2 | a = 2
    5  45 | b = 4
The Naive Bayes classifier shows that classification with minimum features (only 8 attributes) achieves the highest result, with 136 correctly classified instances. The Naive Bayes classifier achieved the same result when it used 9 attributes and a population size of 10. This shows that the best result is obtained with minimum feature selection. The Decision Table classifier gives 129 correctly classified instances with the minimum feature subset; only 8 attributes give the same result as 10 or 9 attributes.
The Prism classifier gives the worst results. It achieves 122 correctly classified instances with ten attributes, and the number of correctly classified instances decreases to 120 when 8 attributes and a population size of 5 are used. So we can say that decreasing the number of attributes is counterproductive for the Prism classifier and reduces the classification accuracy. The J48 classifier achieves its best result of 133 correctly classified instances with 9 attributes and a population size of 10. JRip achieves its best result of 135 correctly classified instances with 9 attributes and a population size of 10. The best results, in terms of classification accuracy and feature reduction, were obtained by Naive Bayes and then by the JRip classifier.
5. Future Plans
Blending with other intelligent optimization algorithms [28]:
The blending process combines the advantages of PSO with the advantages of other intelligent optimization algorithms to create a compound algorithm of practical value. For example, the particle swarm optimization algorithm can be improved by the simulated annealing (SA) approach, and it can be combined with genetic algorithms, ant colony optimization, fuzzy methods, etc.
The application area of the algorithm:
At present, most research applies PSO in coordinate systems; there is less research on applying the PSO algorithm in non-coordinate systems, scattered systems, and compound optimization systems.
6. CONCLUSION
Medical diagnosis is considered as an intricate task that
needs to be carried out precisely and efficiently. The
automation of the same would be highly beneficial.
Clinical decisions are often made based on doctor's
intuition and experience. Data mining techniques have
the potential to generate a knowledge-rich environment
which can help to significantly improve the quality of
clinical decisions. Rough set theory supplies essential
tools for knowledge analysis. It provides algorithms for
knowledge reduction, concept approximation, decision
rule induction and object classification. The methods of
rough set theory rest on indiscernibility and related
notions, in particular on notions related to rough
inclusions. All constructs needed in implementing rough set based algorithms can be derived from data tables, with no need for a priori estimates or preliminary assumptions. The combination of rough sets with other intelligence techniques can provide a more effective approach. We have illustrated that rough sets can be successfully combined with the particle swarm optimization algorithm, a recent heuristic optimization method based on swarm intelligence. It is very simple, easily implemented, and needs few parameters, which has made it widely developed and applied to the feature extraction task.
References
[1] R. Roselin, K. Thangavel, and C.
Velayutham,Fuzzy-Rough Feature Selection for
Mammogram Classification, Journal of Electronic
Science and Technology, Vol. 9, No. 2, JUNE 2011.
[2] J.R. Quinlan, Induction of Decision Trees,
Machine Learning, pp.81-106, Vol.1, 1986.
[3] Sarvestan Soltani A., Safavi A. A., Parandeh M.
N. and Salehi M., Predicting Breast Cancer
Survivability using data mining techniques,
Software Technology and Engineering (ICSTE), 2nd
International Conference ,pp.227-231, Vol.2, 2010.
[4] Chang Pin Wei and Liou Ming Der, Comparison
of three Data Mining techniques with Genetic
Algorithm in analysis of Breast Cancer data,
Available:http://www.ym.edu.tw/~dmliou/Paper/com
par_threedata.pdf.
[5] Gandhi Rajiv K., Karnan Marcus and Kannan S.,
Classification Rule Construction Using Particle
Swarm Optimization Algorithm for Breast Cancer
Datasets, Signal Acquisition and Processing,
ICSAP, International Conference, pp. 233 237,
2010.
[6] S. Das, A. Abraham, S.K. Sarkar, A Hybrid Rough Set-Particle Swarm Algorithm for Image Pixel Classification, Proc. of the Sixth Int. Conf. on Hybrid Intelligent Systems, pp. 26-32, 2006.
[7] Matthew Settles, An Introduction to Particle
Swarm Optimization, November 2007.
[8] S. B. Kotsiantis, Supervised Machine Learning:
A Review of Classification Techniques,
Informatica, Vol.31, 249-268, 2007.
[9] Jensen, R. and Shen, Q., Fuzzy-rough Data
Reduction with Ant Colony Optimization, Journal
of Fuzzy Sets and Systems, pp. 5-20, Vol. 149, 2005.
[10] Monteiro, S., Uto, T.K., Kosugi, Y., Kobayashi, N., Watanabe, E. and Kameyama, K., Feature Extraction of Hyperspectral Data for Under Spilled Blood Visualization Using Particle Swarm Optimization, International Journal of Bioelectromagnetism, pp. 232-235, Vol. 7, No. 1, 2005.
[11] Yan WANG, Lizhuang MA, Feature Selection
for Medical Dataset Using Rough Set Theory,
Proceedings of the 3rd WSEAS International
Conference on COMPUTER ENGINEERING and
APPLICATIONS.
[12] Jensen, R., Shen, Q., & Tuson, A.,Finding
Rough Set Reducts with SAT, In Proceedings of the
10th International conference on Rough Sets, Fuzzy
Sets, Data Mining and Granular Computing, LNAI
3641, pp. 194-203, 2005.
[13] Z. Pawlak, Rough Sets: Theoretical aspects of
reasoning about data, Kluwer Academic Publishers,
Dordrecht, 1991.
[14] A.E. Hassanien,Rough Set Approach for
Attribute Reduction and Rule Generation: A Case
of Patients with Suspected Breast Cancer, Journal
of the American society for Information science and
Technology ,pp. 954-962, Vol.55,No.11, 2004.
[15] S. Tsumoto,Mining Diagnostic Rules from
Clinical Databases Using Rough Sets and Medical
Diagnostic Model, Information Sciences, pp.65-80,
Vol.162, 2004.
[16] J. Komorowski, A. Ohrn, Modeling Prognostic
Power of Cardiac Tests Using Rough Sets, Artificial
Intelligence in Medicine , pp. 167-191, Vol.15,
1999.
[17] Z. Pawlak, Rough Set Approach to Knowledge-Based Decision Support, European Journal of Operational Research, pp. 48-57, Vol. 99, 1997.
[18] Xiangyang Wang , Jie Yang , Xiaolong Teng ,
Weijun Xia , Richard Jensen , Feature Selection
based on Rough Sets and Particle Swarm
Optimization .
[19] A.Skowron, C.Rauszer, The Discernibility
Matrices and Functions in Information Systems, In:
R.W. Swiniarski (Eds.): Intelligent Decision
SupportHandbook of Applications and Advances
of the Rough Sets Theory, Kluwer Academic
Publishers, Dordrecht, pp. 311-362, 1992.
[20] R.W. Swiniarski, A. Skowron, Rough set
methods in feature selection and recognition,
Pattern Recognition Letters, pp. 833-849, Vol. 24, 2003.
[21] K.Y. Hu, Y.C. Lu, C.Y. Shi, Feature ranking in rough sets, AI Communications, pp. 41-50, Vol. 16, No. 1, 2003.
[22] X. Hu, Knowledge Discovery in Databases: An
Attribute-Oriented Rough Set Approach,Ph.D
thesis, Regina University,1995.
[23] G.Y. Wang, J. Zhao, J.J. An, Y. Wu, Theoretical
Study on Attribute Reduction of Rough Set Theory:
Comparison of Algebra and Information Views", In:
Proceedings of the Third IEEE International
Conference on Cognitive Informatics, (ICCI04),
2004.
[24] J. Kennedy, R. Eberhart, Particle Swarm Optimization, In: Proc. IEEE Int. Conf. on Neural Networks, Perth, pp. 1942-1948, 1995.
[25] J.Kennedy, R.C.Eberhart,A new optimizer
using particle swarm theory, In: Sixth International
Symposium on Micro Machine and Human Science,
Nagoya, pp. 39-43, 1995.
[26] Y. Shi, R. Eberhart,A Modified Particle Swarm
Optimizer", In: Proc. IEEE Int. Conf. On
Evolutionary Computation, Anchorage, AK, USA,
pp. 69-73, 1998.
[27] Kennedy.J,Small Worlds and Mega-Minds:
Effects of Neighborhood Topology on Particle
Swarm Performance", Proceedings of the 1999
Congress of Evolutionary Computation, IEEE Press,
Vol. 3, pp. 1931-1938, 1999.
[28] Qinghai Bai, Analysis of Particle Swarm
Optimization Algorithm",Computer and Information
Science,Vol.3,No.1, 2010.
[29] Y. Shi, R. C. Eberhart,Parameter Selection in
Particle Swarm Optimization in Evolutionary
Programming, VII: Proc. EP98, New York:
Springer-Verlag, pp. 591-600, 1998.
[30] R.C. Eberhart, Y. Shi,Particle Swarm
Optimization: Developments, Applications and
Resources, In: Proc. IEEE Int. Conf. On
Evolutionary Computation, Seoul, pp. 81-86, 2001.
[31] http://archive.ics.uci.edu/ml/machine-learning-
databases/breast-cancer-wisconsin/breast-cancer-
wisconsin.data
[32] http://www.cs.waikato.ac.nz/ml/weka/
[33] Moraglio, A., Di Chio, C., and Poli, R.,
Geometric Particle Swarm Optimization,EuroGP,
LNCS 4445, pp. 125-135, 2007.