APRIL 2019
(Declared as Deemed-to-be University under Sec. 3 of the UGC Act, 1956)
KARUNYA NAGAR, COIMBATORE – 641 114
BONAFIDE CERTIFICATE
This is to certify that the project report entitled “ADHD signal diagnosis using Raspberry Pi”
is a bonafide record of work of the following candidates who carried out the project work under my
supervision during the academic year 2018-2019.
Submitted for the Full Semester/Half Semester Viva Voce examination held on ……………………….
…………………….. ……………………..
(Internal Examiner) (External Examiner)
ACKNOWLEDGEMENT
First and foremost, we would like to thank Almighty God for all the blessings He
has bestowed upon us to work thus far and finish this project.
Our sincere and special thanks to our guide, Dr. S. THOMAS GEORGE,
Assistant Professor, for his immense help and guidance. We would like to extend a
thankful heart for his constant support through the entire project.
Finally, we would like to extend our deepest appreciation to our family and friends
for all that they were to us during the project period.
ABSTRACT
TABLE OF CONTENTS
Chapter Page No
ACKNOWLEDGEMENT i
ABSTRACT ii
LIST OF TABLES v
LIST OF FIGURES vii
LIST OF NOMENCLATURE viii
1. INTRODUCTION 1
Statement of Problem 2
Image preprocessing 2
Types of classifiers 3
Naïve Bayes 3
Summary 4
2. LITERATURE SURVEY 5
3. IMAGE PRE-PROCESSING 9
Overall Block Diagram 9
Raspberry Pi 10
Independent Component Analysis 10
Ambiguities of ICA 12
Illustration of ICA 12
Summary 15
4. FEATURE EXTRACTION 16
Local Binary Pattern 16
Normalization 18
Energy 20
Entropy 20
Standard Deviation 21
Covariance 22
Summary 22
5. CLASSIFIERS 23
Naïve Bayes 29
K-Nearest Neighbor 30
Algorithm 31
Properties 32
Features of KNN 33
Support Vector Machine 35
Applications 37
Summary 37
6. RESULT AND INFERENCE 38
Result 38
Inference 39
7. CONCLUSION 40
REFERENCES 41
LIST OF TABLES
LIST OF FIGURES
LIST OF NOMENCLATURE
SYMBOLS DESCRIPTION
CHAPTER 1
INTRODUCTION
Image processing is a very important tool for differentiating the images of normal
patients from those of affected patients. In our project we focus on patients affected
by ADHD. ADHD is a neurodevelopmental, psychiatric and chronic mental disorder present
in children, most frequently seen in the preschool and early school years. It is a
multifactorial disorder with a strong genetic basis and a complex diagnosis. ADHD
manifests as hyperactivity, inattention and impulsivity, and it affects academic,
behavioural, emotional and social functioning. It impacts focus, self-control and other
skills important in daily life, and is attributed to differences in brain anatomy and
wiring. Because ADHD is a neurodevelopmental disorder, it cannot be observed directly
with the naked eye, and its effects can persist and grow as children become adults.
It is normal to show these symptoms occasionally, but in children with ADHD the
symptoms are severe, frequent and interfere with task completion. ADHD can be caused
by the mother’s intake of drugs, alcohol or tobacco during pregnancy, by birth
complications such as premature birth or very low birth weight, by exposure to lead or
other toxic substances, by extreme neglect, abuse or social deprivation, by food
additives such as artificial food colouring, or by brain injury. Children with ADHD
face many academic, social and relationship difficulties.
ADHD affects about 11% of school-age children. It has also been known as
minimal brain dysfunction or hyperkinetic reaction of childhood. ADHD is most
commonly seen in children born with a low birth weight, born premature, or
whose mothers had difficult pregnancies.
IMAGE PRE-PROCESSING
TYPES OF CLASSIFIERS
Three types of classifiers are used: Naïve Bayes, K-Nearest Neighbor (KNN) and Support Vector Machine (SVM).
SUMMARY
CHAPTER 2
LITERATURE SURVEY
The joint problems of EEG source segregation, identification, and localization are very
difficult, since the problem of determining brain electrical sources from potential
patterns recorded on the scalp surface is mathematically underdetermined. Recent
efforts to identify EEG sources have focused mostly on performing spatial segregation
and localization of source activity. By applying the ICA algorithm of Bell and
Sejnowski, we attempt to completely separate the twin problems of source
identification and source localization. The ICA algorithm derives independent sources
from highly correlated EEG signals statistically and without regard to the physical
location or configuration of the source generators. ICA appears to be a promising new
analysis tool for human EEG and ERP research. It can isolate a wide range of artifacts
to a few output channels while removing them from remaining channels. The algorithm
also appears to be useful for decomposing evoked response data into spatially distinct
subcomponents, while measures of non-stationarity in the ICA source solution may be
useful for observing brain state changes.
level of recognition performance for geometric features and local binary pattern (LBP)
features.
Signals from eye movements and blinks can be orders of magnitude larger than brain-
generated electrical potentials and are one of the main sources of artifacts in
electroencephalographic (EEG) data. Rejecting contaminated trials causes substantial
data loss, and restricting eye movements/blinks limits the experimental designs
possible and may impact the cognitive processes under investigation. This article
presents a method based on blind source separation (BSS) for automatic removal of
electroocular artifacts from EEG data. BSS is a signal-processing methodology that
includes independent component analysis (ICA). In contrast to previously explored
ICA-based methods for artifact removal, this method is automated.
In this work, we proposed a versatile signal processing and analysis framework for
Electroencephalogram (EEG). Within this framework the signals were decomposed
into the frequency sub-bands using DWT and a set of statistical features was extracted
from the sub-bands to represent the distribution of wavelet coefficients. Principal
components analysis (PCA), independent components analysis (ICA) and linear
discriminant analysis (LDA) were used to reduce the dimensionality of the data. Then these
features were used as an input to a support vector machine (SVM) with two discrete
outputs: epileptic seizure or not. The classification performance obtained with the
different methods is presented and compared to demonstrate the effectiveness of the
classification process.
KNN is a method used for classifying objects based on the closest training examples
in the feature space. It is the most basic type of instance-based, or lazy, learning.
It assumes all instances are points in an n-dimensional space, and a distance measure
is needed to determine the “closeness” of instances. KNN classifies an instance by
finding its nearest neighbors and picking the most popular class among them.
The SVM is a binary classifier, which can be extended by fusing several of its kind into
a multiclass classifier. In this paper, the SVM decisions are fused using the ECOC
approach, adopted from digital communication theory. In the ECOC approach, up to
2^(n−1) − 1 SVMs are trained (where n is the number of classes), each of them aimed at
separating a different combination of classes. For three classes (A, B, and C) we need
three classifiers: one SVM classifies A from B and C, a second SVM classifies B from
A and C, and a third SVM classifies C from A and B. The multiclass-classifier
output code for a pattern is a combination of the targets of all the separate SVMs. In our
example, vectors from classes A, B, and C have codes (1, −1, −1), (−1, 1, −1), and
(−1, −1, 1), respectively. If each of the separate SVMs classifies a pattern correctly, the
multiclass-classifier target code is met and the ECOC approach reports no error for that
pattern. However, if at least one of the SVMs misclassifies the pattern, the class
selected for the pattern is the one whose target code is closest to the actual output code
in the Hamming-distance sense, and this may be an erroneous decision.
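As a rough illustration of the Hamming-distance decoding described above, the sketch below hard-codes the three code words from the example and decodes the signs of three hypothetical binary SVM outputs; the classifier training itself is omitted and the numeric outputs are invented.

```python
import numpy as np

# Code words for classes A, B and C, as in the example above (one SVM per class).
CODE_WORDS = {
    "A": np.array([1, -1, -1]),
    "B": np.array([-1, 1, -1]),
    "C": np.array([-1, -1, 1]),
}

def ecoc_decode(svm_outputs):
    """Pick the class whose code word is closest, in Hamming distance,
    to the signs of the binary SVM outputs."""
    signs = np.sign(svm_outputs)
    distances = {label: np.sum(signs != code) for label, code in CODE_WORDS.items()}
    return min(distances, key=distances.get)

# Example: the "A vs rest" SVM votes positive, the other two vote negative.
print(ecoc_decode([0.7, -0.4, -1.2]))   # -> "A"
```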
Support Vector Machines (SVM) have been very popular as large-margin classifiers
due to their robust mathematical theory. They have many practical applications in a number
of fields, such as bioinformatics, medical diagnosis of diseases, engineering model
prediction and financial forecasting. SVM is widely used in medical science because of its
powerful learning ability in classification, and it can classify highly nonlinear data using
kernel functions. This paper proposes and analyses a diagnostic model to classify the most
common skin illnesses and also provides a useful insight into the SVM algorithm. In rural
areas, where people are generally treated by paramedical staff, skin patients are not subject
to proper diagnosis, resulting in mistreatment. We think SVM is a good tool for proper diagnosis.
CHAPTER 3
IMAGE PRE-PROCESSING
The overall block diagram shows that the EEG signal is taken and
image pre-processing is done using EEGLAB. ICA converts the signal into
linearly independent components, visualized as topoplots. These topoplots are further
used in LBP for image feature extraction. The features obtained are used to train
the classifiers and help in classifying components as artifact or non-artifact. The
rejected artifacts are removed and the remaining topoplots are used for
reconstruction of the denoised signal.
[Overall block diagram: maps → feature extraction (LBP) → classification → results (ADHD / non-ADHD)]
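As a minimal, runnable sketch of the flow in the block diagram, the code below uses synthetic stand-ins for the topoplot maps, a trivial placeholder feature extractor and a scikit-learn SVM; in the actual system the maps come from EEGLAB's ICA and the features are the LBP-based ones of Chapter 4.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def extract_features(topo_map):
    """Trivial stand-in for the LBP feature extraction of Chapter 4."""
    return np.array([topo_map.mean(), topo_map.std(), np.abs(topo_map).max()])

# Synthetic stand-ins for ICA topoplot maps: label 1 = artifact, 0 = non-artifact.
maps = [rng.normal(loc=label, size=(32, 32)) for label in (0, 1) for _ in range(20)]
labels = np.repeat([0, 1], 20)

X = np.array([extract_features(m) for m in maps])
clf = SVC(kernel="rbf").fit(X, labels)          # one of the three classifiers of Chapter 5

predictions = clf.predict(X)
kept = [m for m, p in zip(maps, predictions) if p == 0]   # reject artifact maps
print(f"{len(maps) - len(kept)} maps rejected as artifacts")
```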
RASPBERRY PI 3
The Raspberry Pi is a computer, very like the computers with which
you’re already familiar. It uses a different kind of processor, so you can’t
install Microsoft Windows on it. But you can install several versions of the
Linux operating system that look and feel very much like Windows. If you
want to, you can use the Raspberry Pi to surf the internet, send an email or
write a letter using a word processor. But you can also do so much more.
Easy to use but powerful, affordable and (as long as you’re careful)
difficult to break, the Raspberry Pi is the perfect tool for aspiring computer
scientists. What do we mean by computer science? We mean learning how
computers work so you can make them do what you want them to do, not
what someone else thinks you should do with them.
And who do we mean by computer scientists? We mean you. You may
finish this manual and decide you want to be the next Tim Berners-Lee, but even
if you don’t, we hope you have fun, learn something new and get a feel for
how computers work. Because no matter what you do in life, computers are
bound to be part of it.
Hardware
The Raspberry Pi hardware has evolved through several versions that
feature variations in memory capacity and peripheral-device support.
This block diagram describes Model B and B+; Model A, A+, and the
Pi Zero are similar, but lack the Ethernet and USB hub components. The
Ethernet adapter is internally connected to an additional USB port. In
Model A, A+, and the Pi Zero, the USB port is connected directly to
the system on a chip (SoC). On the Pi 1 Model B+ and later models the
USB/Ethernet chip contains a five-port USB hub, of which four ports are
available, while the Pi 1 Model B only provides two. On the Pi Zero, the USB
port is also connected directly to the SoC, but it uses a micro USB (OTG)
port.
PROCESSOR
The Broadcom BCM2835 SoC used in the first generation Raspberry
Pi includes a 700 MHz ARM1176JZF-S processor, a VideoCore IV graphics
processing unit (GPU), and RAM. It has a level 1 (L1) cache of 16 KB and a
level 2 (L2) cache of 128 KB. The level 2 cache is used primarily by the GPU.
The SoC is stacked underneath the RAM chip, so only its edge is visible. The
1176JZ(F)-S is the same CPU used in the original iPhone, although at a
higher clock rate, and mated with a much faster GPU.
The earlier V1.1 model of the Raspberry Pi 2 used a Broadcom
BCM2836 SoC with a 900 MHz 32-bit, quad-core ARM Cortex-
A7 processor, with 256 KB shared L2 cache. The Raspberry Pi 2 V1.2 was
upgraded to a Broadcom BCM2837 SoC with a 1.2 GHz 64-bit quad-
core ARM Cortex-A53 processor, the same SoC which is used on the
Raspberry Pi 3, but underclocked (by default) to the same 900 MHz CPU
clock speed as the V1.1. The BCM2836 SoC is no longer in production as of
late 2016.
The Raspberry Pi 3+ uses a Broadcom BCM2837B0 SoC with a
1.4 GHz 64-bit quad-core ARM Cortex-A53 processor, with 512 KB shared L2 cache.
PERFORMANCE
While operating at 700 MHz by default, the first generation
Raspberry Pi provided a real-world performance roughly equivalent to
0.041 GFLOPS. On the CPU level the performance is similar to a
300 MHz Pentium II of 1997–99. The GPU provides 1 Gpixel/s or
1.5 Gtexel/s of graphics processing or 24 GFLOPS of general purpose
computing performance. The graphical capabilities of the Raspberry Pi are
roughly equivalent to the performance of the Xbox of 2001.
Raspberry Pi 2 V1.1 included a quad-core Cortex-A7 CPU running at
900 MHz and 1 GB RAM. It was described as 4–6 times more powerful than
its predecessor. The GPU was identical to the original. In parallelised
benchmarks, the Raspberry Pi 2 V1.1 could be up to 14 times faster than a
Raspberry Pi 1 Model B+.
The Raspberry Pi 3, with a quad-core ARM Cortex-A53 processor, is
described as having ten times the performance of a Raspberry Pi 1. This was
suggested to be highly dependent upon task threading and instruction set use.
Benchmarks showed the Raspberry Pi 3 to be approximately 80% faster than
the Raspberry Pi 2 in parallelised tasks.
PYTHON
Python is a wonderful and powerful language, and together with the Raspberry Pi it
lets the project connect to the real world. Python syntax is very clean, with
an emphasis on readability, and uses standard English keywords. The easiest
introduction to Python is through the Integrated Development and Learning
Environment (IDLE), a Python development environment. Python has an ocean
of libraries: a Python library is a collection of functions and methods that
allows many actions to be performed and also decreases the code size.
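For example, a couple of lines typed into IDLE using the numpy library (assumed to be installed on the Pi; the file name is only a placeholder) already replace the explicit loops one would otherwise write:

```python
import numpy as np

# One library call replaces an explicit read-and-accumulate loop.
samples = np.loadtxt("eeg_channel.csv", delimiter=",")   # placeholder exported EEG channel
print("mean amplitude:", samples.mean())
print("standard deviation:", samples.std())
```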
We have now dropped the time index t; in the ICA model, we assume
that each mixture xj as well as each independent component sk is a random
variable, instead of a proper time signal. The observed values xj(t), e.g., the
microphone signals in the cocktail party problem, are then a sample of this
random variable. Without loss of generality, we can assume that both the
mixture variables and the independent components have zero mean: If this is
not true, then the observable variables xi can always be centered by subtracting
the sample mean, which makes the model zero-mean.
The starting point for ICA is the very simple assumption that the
components si are statistically independent. It will be seen below that we must
also assume that the independent components have non-Gaussian
distributions. However, in the basic model we do not assume these distributions
are known (if they are known, the problem is considerably simplified). For
simplicity, we also assume that the unknown mixing matrix is square, but
this assumption can sometimes be relaxed. Then, after estimating the matrix A,
we can compute its inverse, say W, and obtain the independent components
simply by:
s = Wx. (3.4)
ICA is very closely related to the method called blind source separation
(BSS) or blind signal separation. A “source” means here an original signal, i.e.
independent component, like the speaker in a cocktail party problem. “Blind”
means that we know very little, if anything, about the mixing matrix, and make
few assumptions about the source signals. ICA is one method, perhaps the most
widely used, for performing blind source separation.
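A hedged sketch of Eq. (3.4): scikit-learn's FastICA (used here for illustration instead of the Infomax algorithm mentioned in this chapter's summary) estimates an unmixing matrix W from synthetic mixtures; the sources and mixing matrix are made up for the example.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Two synthetic, independent, non-Gaussian, zero-mean sources.
t = np.linspace(0, 1, 2000)
s = np.c_[np.sign(np.sin(2 * np.pi * 5 * t)), rng.laplace(size=t.size)]

A = np.array([[2.0, 3.0],
              [2.0, 1.0]])        # example mixing matrix
x = s @ A.T                       # observed mixtures, x = A s

ica = FastICA(n_components=2, random_state=0)
s_est = ica.fit_transform(x)      # estimated independent components
W = ica.components_               # estimated unmixing matrix, s = W x (Eq. 3.4)
print(W)
```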
Illustration of ICA
p(si) = 1 / (2√3)  if |si| ≤ √3,  and p(si) = 0 otherwise.        (3.5)
The range of values for this uniform distribution was chosen so as to
make the mean zero and the variance equal to one, as was agreed in the
previous section. The joint density of s1 and s2 is then uniform on a square.
This follows from the basic definition that the joint density of two independent
variables is just the product of their marginal densities; we simply need to
compute the product. The joint density is illustrated in Figure (3.2) by showing
data points randomly drawn from this distribution.
Now let us mix these two independent components. Let us take the
following mixing matrix:
A0 = ( 2  3
       2  1 )        (3.6)
This gives us two mixed variables, x1 and x2. It is easily computed that
the mixed data has a uniform distribution on a parallelogram. Note that the
random variables x1 and x2 are not independent any more; an easy way to see
this is to consider whether it is possible to predict the value of one of them, say
x2, from the value of the other. Clearly, if x1 attains one of its maximum or
minimum values, then this completely determines the value of x2. They are
therefore not independent. (For variables s1 and s2 the situation is different:
from Figure (3.2) it can be seen that knowing the value of s1 does not in any
way help in guessing the value of s2.) The problem of estimating the data
model of ICA is now to estimate the mixing matrix A0 using only information
contained in the mixtures x1 and x2. Actually, you can see from Figure (3.3) an
intuitive way of estimating A: The edges of the parallelogram are in the
directions of the columns of A. This means that we could, in principle, estimate
the ICA model by first estimating the joint density of x1 and x2, and then
locating the edges. So, the problem seems to have a solution.
In reality, however, this would be a very poor method because it only
works with variables that have exactly uniform distributions. Moreover, it
would be computationally quite complicated. What we need is a method that
works for any distributions of the independent components, and works fast and
reliably. Next we shall consider the exact definition of independence before
starting to develop methods for estimation of the ICA model.
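The illustration above can be reproduced numerically; the short sketch below draws s1 and s2 uniformly on [−√3, √3], mixes them with A0 from Eq. (3.6) and shows that the mixtures, unlike the sources, are clearly correlated (hence not independent).

```python
import numpy as np

rng = np.random.default_rng(0)

# Independent sources, uniform on [-sqrt(3), sqrt(3)]: zero mean, unit variance (Eq. 3.5).
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(10000, 2))

A0 = np.array([[2.0, 3.0],
               [2.0, 1.0]])                 # mixing matrix of Eq. (3.6)
x = s @ A0.T                                # mixed variables x1, x2

print(np.corrcoef(s.T)[0, 1])               # ~0: the sources are uncorrelated
print(np.corrcoef(x.T)[0, 1])               # clearly nonzero: the mixtures are dependent
```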
Figure 3.2 The joint distribution of the independent components s1 and s2 with uniform distributions.
Horizontal axis: s1, vertical axis: s2.
Figure 3.3. The joint distribution of the observed mixtures x1 and x2. Horizontal axis:
x1, vertical axis: x2.
SUMMARY
In this chapter we have discussed EEGLAB and described the
ICA technique used. As seen above, EEGLAB is an interactive MATLAB toolbox
for processing continuous and event-related EEG data using independent
component analysis (ICA), time/frequency analysis, and other methods
including artifact rejection. Using the Infomax algorithm we obtained linearly
independent components. These ICs are further used for classification and
artifact rejection.
CHAPTER 4
FEATURE EXTRACTION
in which n is the number of different labels produced by the LBP operator, and
I{A} is 1 if A is true and 0 if A is false.
When the image patches whose histograms are to be compared have
different sizes, the histograms must be normalized to get a coherent description:

N_i = H_i / (Σ_{j=0}^{n−1} H_j)        (4.2)
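A brief sketch of computing an LBP label histogram and normalizing it as in Eq. (4.2), using scikit-image's LBP operator; the input image is synthetic and the (P, R) = (8, 1) neighbourhood is an assumption for illustration.

```python
import numpy as np
from skimage.feature import local_binary_pattern

rng = np.random.default_rng(0)
image = rng.random((64, 64))                 # synthetic grayscale topoplot stand-in

P, R = 8, 1                                  # neighbours and radius of the LBP operator
lbp = local_binary_pattern(image, P, R, method="uniform")

n_labels = P + 2                             # number of distinct labels for "uniform" LBP
H, _ = np.histogram(lbp, bins=n_labels, range=(0, n_labels))

N = H / H.sum()                              # Eq. (4.2): N_i = H_i / sum_j H_j
print(N)
```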
Normalization
After obtaining the LBP of the image, normalization is performed in order to
bring consistency to the dynamic range of the images. In image processing,
normalization is a process used to change the range of pixel intensity
values; it converts the image, or any type of signal, into a more familiar range.
After normalization, we obtain image features such as energy, entropy,
standard deviation and covariance of the images. Energy is defined on the
normalized image and shows how the gray levels are distributed, while entropy
describes how much randomness or uncertainty is present in the image.
Linear normalization transforms a grayscale image with intensity values in the
range (Min, Max) into a new image with intensity values in the range
(newMin, newMax). The linear normalization of a grayscale digital image is
performed according to the formula

I_N = (I − Min) × (newMax − newMin) / (Max − Min) + newMin        (4.3)
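A minimal sketch of the linear normalization above, applied to a synthetic image; the target range [0, 1] is chosen only for illustration.

```python
import numpy as np

def normalize(image, new_min=0.0, new_max=1.0):
    """Linearly map pixel intensities from (Min, Max) to (new_min, new_max)."""
    lo, hi = image.min(), image.max()
    return (image - lo) * (new_max - new_min) / (hi - lo) + new_min

img = np.random.default_rng(0).integers(30, 200, size=(4, 4)).astype(float)
out = normalize(img)
print(out.min(), out.max())   # -> 0.0 1.0
```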
Energy
Energy is defined on the normalized image and shows how the gray levels are
distributed. When the number of gray levels is low, the energy is high. The
energy of an image gives a measure of the information present in the image.
Entropy
Image entropy is a quantity used to describe the amount of information
which must be coded by a compression algorithm. An image that is perfectly
flat has an entropy of zero and can therefore be compressed to a relatively
small size. High-entropy images, which have a great deal of contrast from one
pixel to the next, cannot be compressed as much as low-entropy images.
Standard Deviation
σ = √( ( Σ_{(m,n)∈R} a²[m,n] − Λ·μa² ) / (Λ − 1) )        (4.4)

where R is the image region, Λ is the number of pixels in R and μa is their mean value.
Covariance
Covariance is a measure of how changes in one variable are associated
with changes in a second variable. Covariance measures the degree to which
two variables are linearly associated. The covariance matrix is used to capture
the spectral variability of a signal. C = cov(A) returns the covariance: if A is
a vector of observations, C is the scalar-valued variance; if A is a matrix whose
columns represent random variables and whose rows represent observations, C
is the covariance matrix with the corresponding column variances along the
diagonal.
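A hedged sketch of computing the four features of this chapter for a single image; the exact energy and entropy definitions used in the project are not spelled out above, so common gray-level histogram formulations are assumed here.

```python
import numpy as np

def image_features(image, bins=256):
    """Energy, entropy, standard deviation and covariance for a 2-D image."""
    hist, _ = np.histogram(image, bins=bins)
    p = hist / hist.sum()                              # gray-level probabilities
    p_nz = p[p > 0]

    energy  = np.sum(p ** 2)                           # assumed definition: sum of squared probabilities
    entropy = -np.sum(p_nz * np.log2(p_nz))            # Shannon entropy in bits
    std     = image.std(ddof=1)                        # sample standard deviation, cf. Eq. (4.4)
    cov     = np.cov(image, rowvar=False)              # columns as variables, rows as observations
    return energy, entropy, std, cov

img = np.random.default_rng(0).random((32, 32))
e, h, s, C = image_features(img)
print(e, h, s, C.shape)
```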
SUMMARY
CHAPTER 5
CLASSIFIERS
NAIVE BAYES
Algorithm
KNN is a highly effective inductive inference method for noisy training data
and complex target functions. The target function for the whole space may be
described as a combination of less complex local approximations. In KNN,
learning is very simple and classification is time consuming.
The test sample (green circle) should be classified either to the first
class of blue squares or to the second class of red triangles. If k = 3 (solid-line
circle) it is assigned to the second class because there are 2 triangles and only 1
square inside the inner circle. If k = 5 (dashed-line circle) it is assigned to the
first class (3 squares vs. 2 triangles inside the outer circle).
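A small sketch of the k = 3 versus k = 5 behaviour described above, using scikit-learn's KNN classifier on an invented 2-D data set whose point positions are chosen so that the predicted class flips with k.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Invented 2-D points: class 0 = "blue squares", class 1 = "red triangles".
X = np.array([[1.0, 0.0],    # triangle, distance 1.0 from the query
              [0.0, 1.1],    # square,   distance 1.1
              [-1.2, 0.0],   # triangle, distance 1.2
              [0.0, -1.5],   # square,   distance 1.5
              [1.6, 0.1]])   # square,   distance ~1.6
y = np.array([1, 0, 1, 0, 0])
query = np.array([[0.0, 0.0]])   # the "green circle" test sample

for k in (3, 5):
    pred = KNeighborsClassifier(n_neighbors=k).fit(X, y).predict(query)[0]
    print(f"k={k}: predicted class {pred}")   # k=3 -> 1 (triangles), k=5 -> 0 (squares)
```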
PROPERTIES
Dividing the training data into smaller subsets, building a model for each
subset and then applying voting to classify testing data can enhance the
classifier’s performance.
Features of KNN
The training algorithm of SVM maximizes the margin between the training
data and the class boundary, removing some meaningless data from the training
dataset. So, the resulting decision function depends only on the training data
closest to the decision boundary, called the support vectors. Thus SVM
maximizes the margin by minimizing the maximum loss, giving good
accuracy compared to classifiers that are based on minimizing the
mean squared error. It is also effective in high-dimensional spaces where the
number of dimensions is greater than the number of training samples. SVM can
separate classes which cannot be separated by a linear classifier. SVM is a
kernel-based method: it uses a kernel-induced feature space. Using a kernel
function, it transforms the data from the input space into a high-dimensional feature
space in which it searches for a separating hyperplane, so that nonlinear data
can also be separated by a hyperplane in the high-dimensional space. This would take a
lot of computational power, but SVM overcomes the problem using the kernel trick.
In SVM, kernel functions are defined in a reproducing kernel Hilbert space
(RKHS). A Hilbert space is a complete inner-product space, so the similarity between
training data points is measured by an inner product, which is computationally
inexpensive. Also, the kernels are Mercer kernels, i.e., positive semi-definite
kernels, and because of this the SVM training problem has a global optimum.
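A short sketch of the kernel trick described above: concentric-circle data that no straight line can separate in the input space is handled by an RBF-kernel SVM, while a linear-kernel SVM fails. The data set is synthetic, generated with scikit-learn.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric classes: not separable by any straight line in the input space.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

linear_acc = SVC(kernel="linear").fit(X, y).score(X, y)
rbf_acc    = SVC(kernel="rbf").fit(X, y).score(X, y)   # kernel-induced feature space

print(f"linear kernel accuracy: {linear_acc:.2f}")     # roughly chance level
print(f"RBF kernel accuracy:    {rbf_acc:.2f}")        # close to 1.0
```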
SVMs are built on developments in computational learning theory.
Because of their accuracy and ability to deal with a large number of predictors,
they have received much attention in biomedical applications. The majority of
previous classifiers separate classes using hyperplanes that split the classes,
using a flat plane, within the predictor space. SVMs broaden the concept of
hyperplane separation to data that cannot be separated linearly, by mapping the
predictors onto a new, higher-dimensional space in which they can be separated
linearly. The method’s name derives from the support vectors, which are lists
of the predictor values taken from cases that lie closest to the decision
boundary separating the classes. It is reasonable to assume that these cases have
the greatest impact on the location of the decision boundary; in fact, if they
were removed they could have large effects on its location. Computationally,
finding the best location for the decision plane is an optimization problem that
makes use of a kernel function to build linear boundaries through nonlinear
transformations, or mappings, of the predictors. The intelligent component of
the algorithm is that it locates a hyperplane in the predictor space which is
stated in terms of the input vectors and dot products in the feature space. The
dot product can then be used to find the distances between the vectors in this
higher-dimensional space. An SVM locates the hyperplane that divides the
support vectors without ever representing the space explicitly; instead,
a kernel function is used that plays the role of the dot product in the feature
space. The two classes can only be separated absolutely by a complex curve in
the original space of the predictors, and the best linear separator cannot totally
separate the two classes. On the other hand, if the original predictor values can
be projected into a more suitable feature space, it is possible to separate the
classes completely with a linear decision boundary. As a result, the problem
becomes one of finding the suitable transformation. The kernel function, which
is central to the SVM approach, is also one of the main problems, especially
with respect to the selection of its parameter values. It is also crucial to select
the magnitude of the penalty for violating the soft margin between the classes.
This means that successful construction of an SVM necessitates some decisions
that should be informed by the data to be classified. The basic support vector
classifier is very similar to the perceptron: both are linear classifiers, assuming
separable data. In perceptron learning, the iterative procedure is stopped when
all samples in the training set are classified correctly. For linearly separable
data, this means that the found perceptron is one solution arbitrarily selected
from an (in principle) infinite set of solutions. In contrast, the support vector
classifier chooses one particular solution: the classifier which separates the
classes with maximal margin. The margin is defined as the width of the largest
‘tube’ not containing samples that can be drawn around the decision boundary.
It can be proven that this particular solution has the highest generalization
ability.
High learning ability and good generalization in classification and
regression make SVM a very popular learning algorithm in many real-life
applications such as bioinformatics, electrical load forecasting, pattern
recognition, image processing and hydrology. SVM is used to predict
mechanical properties of hot-rolled plain carbon steel, to build credit
scoring models assessing the risk of default of clients, in fault diagnosis, and for
forecasting failures and reliability in engine systems. It is also used to evaluate
the level of the coal mine underground environment, in classification of drug and
non-drug problems, to diagnose diabetes and erythematous disease, in drug
design, and in qualitative and quantitative prediction from sensor data.
SVMs are among the best “off-the-shelf” supervised learning algorithms.
The SVM is a kernel-based supervised learning algorithm for binary classification
problems. It separates the two classes using a kernel function which is induced
from the training data set. The goal is to produce a classifier that will work well
on unseen examples, i.e. give good generalization.
The separating hyperplane is defined by w·x + b = 0, where w is the normal to the
hyperplane, known as the weight vector, and b is called the bias. We see that
yi(w·xi + b) > 0, i = 1, 2, 3, ..., m.
Training data (instances) on the margin are called the support vectors.
APPLICATIONS
SUMMARY
CHAPTER 6
RESULT
The features obtained from LBP are applied to the classifiers. The objective of the training
phase is to develop classifiers for distinguishing between ADHD maps and non-ADHD maps.
In order to analyze the output data obtained from the classifiers, the accuracy, precision
and sensitivity are calculated. First the data was applied to Naive Bayes; it is observed
that Naive Bayes gives an accuracy of 35% and a precision of 33.33%. Next the data was
applied to KNN and SVM, which give better accuracy than Naive Bayes. It is observed that
SVM has the best accuracy and precision of the three classifiers.
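For reference, a minimal sketch of how accuracy, precision and sensitivity can be computed from predicted and true labels; the label vectors below are placeholders, not the project's actual results.

```python
import numpy as np

def metrics(y_true, y_pred):
    """Accuracy, precision and sensitivity (recall) for a binary classification."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    accuracy    = (tp + tn) / len(y_true)
    precision   = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, sensitivity

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])   # placeholder ground truth
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])   # placeholder classifier output
print(metrics(y_true, y_pred))
```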
INFERENCE
The image features standard deviation, energy, entropy and covariance have been
extracted, after normalization, from the output of the Local Binary Pattern for the 15
brain maps obtained from Independent Component Analysis. These features are
used for training the Naive Bayes, KNN and SVM classifiers. From Table 6.1, it is
inferred that SVM gives a better result than the KNN and Naive Bayes classifiers.
CHAPTER 7
CONCLUSION
REFERENCES
1. Jung, T. P., Humphries, C., Lee, T. W., Makeig, S., McKeown, M. J.,
Iragui, V., & Sejnowski, T. J. (1998, August). Removing
electroencephalographic artifacts: comparison between ICA and PCA. In
Neural Networks for Signal Processing VIII, 1998. Proceedings of the
1998 IEEE Signal Processing Society Workshop (pp. 63-72). IEEE.
2. Khan, H. A., Al Helal, A., Ahmed, K. I., & Mostafa, R. (2016, September).
Abnormal mass classification in breast mammography using rotation
invariant LBP. In Electrical Engineering and Information Communication
Technology (ICEEICT), 2016 3rd International Conference on (pp. 1-5).
IEEE.
3. Radüntz, T., Scouten, J., Hochmuth, O., & Meffert, B. (2015). EEG artifact
elimination by extraction of ICA-component features using image
processing algorithms. Journal of neuroscience methods, 243, 84-93.
4. Saxena, K., Khan, Z., & Singh, S. (2014). Diagnosis of Diabetes Mellitus
using K Nearest Neighbor Algorithm. International Journal of Computer
Science Trends and Technology (IJCST).
6. Yoko, S., Akutagawa, M., Kaji, Y., Shichijo, F., Nagashino, H., &
Kinouchi, Y. (2007). Simulation study on artifact elimination in EEG
signals by artificial neural network. In World Congress on Medical Physics
and Biomedical Engineering 2006 (pp. 1164-1166). Springer Berlin
Heidelberg.
8. Subasi, A., & Gursoy, M. I. (2010). EEG signal classification using PCA,
ICA, LDA and support vector machines. Expert Systems with
Applications, 37(12), 8659-8666.
10. Makeig, S., Bell, A. J., Jung, T. P., & Sejnowski, T. J. (1996). Independent
component analysis of electroencephalographic data. Advances in neural
information processing systems, 145-151.
11. Ojala, T., Pietikainen, M., & Maenpaa, T. (2002). Multiresolution gray-
scale and rotation invariant texture classification with local binary patterns.
IEEE Transactions on pattern analysis and machine intelligence, 24(7),
971-987.
13. Bouzalmat, A., Kharroubi, J., & Zarghili, A. (2013). Face Recognition
Using SVM Based on LDA. International Journal of Computer Science
Issues, 10(4), 171-179.