
Presentation on Problem Statement
Name: Kassahun Azezew
PRN. 031

Advisor: Dr. Preeti Mulay


18/08/2017 1
1. Introduction
Data sets are growing in both the number of records and
the number of dimensions. This presents challenges to the
machine learning community, which is working on
new methods and techniques to speed up data
exploration, analysis, and validation. The difficulties
that arise as the number of dimensions grows are known
as the curse of dimensionality. Such problems can be
addressed by applying appropriate feature selection
techniques.

Cont...
Therefore, researchers are paying more
attention to feature selection as a way to
enhance the performance of machine
learning algorithms. Identifying a
suitable feature selection method is
essential for a given machine learning task
on high-dimensional data sets such as
microarray data sets, geographical data sets,
etc.

Cont...
A study of the various feature selection
methods is therefore needed, especially for
researchers dedicated to developing
suitable feature selection methods that
enhance the performance of learning
algorithms on high-dimensional data.
Hence, the proposed approach focuses on
cancer diagnosis microarray data sets for
the experimental analysis.

2. Problems in the existing approach
Robust rank aggregation has a
redundancy problem which is not solved by the
author.
The author does not clearly state the techniques used to
handle missing values in the data set. This may
degrade the accuracy of a classifier on some
data sets.
The accuracy of the learning algorithms does not reach
the expected level.

3. Problem statement
Correlation-based ensemble feature
selection in high-dimensional data via
ensemble aggregation techniques.
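
The slides do not define how correlation enters the aggregation, so the following is only a minimal Python sketch of one plausible reading: walk an already aggregated ranking from best to worst and skip any feature whose absolute Pearson correlation with an already selected feature exceeds a threshold. The function names, the 0.9 threshold, and the toy expression values are assumptions made for illustration, not the proposed method.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treat as uncorrelated (assumption)
    return cov / (sx * sy)

def drop_redundant(ranked, columns, threshold=0.9):
    """Keep features in rank order, skipping any that correlates
    strongly (|r| > threshold) with a feature already kept."""
    kept = []
    for f in ranked:
        if all(abs(pearson(columns[f], columns[g])) <= threshold for g in kept):
            kept.append(f)
    return kept

# Hypothetical toy data: one list of values per feature.
columns = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.1, 4.0, 6.2, 8.1, 9.9],  # nearly a multiple of g1
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_redundant(["g1", "g2", "g3"], columns)
print(selected)
```

Here g2 is dropped because it is almost perfectly correlated with the higher-ranked g1, while g3 is kept.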

4. General Objectives
Increase the performance of learning
algorithms and the stability of ensemble feature
selection on high-dimensional data via
robust feature selection approaches.

4.1. Specific objectives


Improve the accuracy of a classifier.
Reduce overfitting.
Enable the machine learning algorithm to train
faster.
Reduce the complexity of the model and make it
easier to interpret.

5. Related work
The concept of ensemble feature selection based on
aggregating feature selectors was introduced by
Saeys et al. (2008).
The ensemble concept for feature selection can also take
the form of parallel application of multiple feature
selection algorithms (Mitchell et al., 2014).
Robust Rank Aggregation (RRA): this method, proposed
by Kolde et al. (2012), detects features that are ranked
consistently better than expected under the null
hypothesis of uncorrelated inputs and assigns a
significance score to each feature.
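
The RRA idea can be illustrated with a simplified Python sketch. It assumes every list ranks the same features with no ties or missing entries, converts each rank to a normalized rank in (0, 1], and compares the sorted normalized ranks of a feature with what uniform random ranks would give. The function names and toy rankings are hypothetical, and this is not the reference implementation by Kolde et al.

```python
from math import comb

def rho_score(norm_ranks):
    """Simplified RRA score for one feature, given its normalized
    ranks in (0, 1] across all lists. Smaller means more significant."""
    r = sorted(norm_ranks)
    n = len(r)
    best = 1.0
    for k in range(1, n + 1):
        x = r[k - 1]
        # P(k-th smallest of n uniform ranks <= x): a binomial tail,
        # equal to the Beta(k, n - k + 1) CDF evaluated at x.
        p = sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))
        best = min(best, p)
    # Bonferroni-style correction over the n order statistics.
    return min(1.0, best * n)

def rra(rankings):
    """rankings: lists of the same features, best first.
    Returns (feature, score) pairs sorted by ascending score."""
    m = len(rankings[0])
    scores = {}
    for f in rankings[0]:
        scores[f] = rho_score([(lst.index(f) + 1) / m for lst in rankings])
    return sorted(scores.items(), key=lambda kv: kv[1])

# Hypothetical rankings produced by three feature selectors.
rankings = [
    ["a", "b", "c", "d", "e"],
    ["a", "c", "b", "e", "d"],
    ["b", "a", "c", "d", "e"],
]
result = rra(rankings)
for feature, score in result:
    print(feature, round(score, 3))
```

Feature "a" is ranked first or second in every list, so it receives the smallest score; features that sit near the bottom of every list end up with a score of 1.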

Cont...
Classification accuracy based aggregation
(CAA): Chan et al. (2008) proposed this method,
which assigns a score to each feature in the
different lists as the sum of the accuracies of all
classifiers that include that feature. Such a
scoring scheme favours the features that lead to
more accurate classification, but it is considered
simple.
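
The CAA scoring rule described above can be sketched in a few lines of Python. The feature lists and accuracies below are made-up values for illustration, and `caa_scores` is a hypothetical helper name.

```python
from collections import defaultdict

def caa_scores(feature_lists, accuracies):
    """Score each feature as the sum of the accuracies of all
    classifiers whose feature list includes that feature."""
    scores = defaultdict(float)
    for feats, acc in zip(feature_lists, accuracies):
        for f in set(feats):  # count each feature once per list
            scores[f] += acc
    return dict(scores)

# Hypothetical output of three selectors and the accuracy of the
# classifier trained on each selected subset.
feature_lists = [
    ["g1", "g2", "g3"],
    ["g2", "g3", "g4"],
    ["g1", "g3", "g5"],
]
accuracies = [0.90, 0.80, 0.70]

scores = caa_scores(feature_lists, accuracies)
# Highest score first; ties broken by feature name.
ranking = sorted(scores, key=lambda f: (-scores[f], f))
print(ranking)
```

A feature that appears in every list (g3 here) accumulates all three accuracies, which shows why the scheme favours features shared by accurate classifiers.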

6. Methodologies
Correlation-based aggregation
Hybrid ensemble construction techniques
6.1. Methods
k-NN
Ten-fold cross-validation
F-measure
MATLAB
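
The evaluation setup listed above (k-NN, ten-fold cross-validation, F-measure) can be sketched as follows. The slides name MATLAB as the tool; Python is used here only for illustration, and the choice of k = 3, the fold assignment, the pooling of out-of-fold predictions, and the toy data are all assumptions.

```python
import math
import random

def knn_predict(train_X, train_y, x, k=3):
    """Majority label among the k training points nearest to x."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def cross_validated_f(X, y, folds=10, k=3, seed=0):
    """Predict every sample from a model that never saw it, then
    compute the F-measure on the pooled out-of-fold predictions."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    preds = [None] * len(X)
    for f in range(folds):
        test = idx[f::folds]  # every folds-th shuffled index
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        tX, ty = [X[i] for i in train], [y[i] for i in train]
        for i in test:
            preds[i] = knn_predict(tX, ty, X[i], k)
    return f_measure(y, preds)

# Hypothetical toy data: two well-separated groups of 20 samples.
X = [(i * 0.1, 0.0) for i in range(20)] + [(10 + i * 0.1, 0.0) for i in range(20)]
y = [0] * 20 + [1] * 20
score = cross_validated_f(X, y)
print(score)
```

In a feature selection study, the same routine would be run once per candidate feature subset so that the subsets can be compared on equal terms.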

References
[1] Ben Brahim A, Limam M (2017)
Ensemble feature selection for high dimensional data:
a new method and a comparative study. Adv Data Anal
Classif. DOI 10.1007/s11634-017-0285-y.
[2] Saeys Y, Abeel T, Van de Peer Y (2008) Robust feature
selection using ensemble feature selection techniques.
In: Proceedings of the European conference on
machine learning and knowledge discovery in
databases, Part II, ECML PKDD '08. Springer-Verlag,
Berlin, Heidelberg, pp 313-325.
[3] Abeel T, Helleputte T, Van de Peer Y, Dupont P, Saeys Y
(2010) Robust biomarker identification for cancer
diagnosis with ensemble feature selection methods.
Bioinformatics 26(3):392-398.

Cont...
[4] Kolde R, Laur S, Adler P, Vilo J (2012) Robust rank
aggregation for gene list integration and
meta-analysis. Bioinformatics 28(4):573-580.
[5] Peng H, Long F, Ding C (2005) Feature selection
based on mutual information: criteria of
max-dependency, max-relevance, and
min-redundancy. IEEE Trans Pattern Anal Mach
Intell 27:1226-1238.

Thank you!

