1. INTRODUCTION:
A Radial Basis Function Network (RBFN) is a particular type of neural network. Generally,
the term Artificial Neural Network refers to the Multilayer Perceptron (MLP). Each neuron
in an MLP takes the weighted sum of its input values. That is, each input value is multiplied
by a coefficient, and the results are all summed together. A single MLP neuron is a simple
linear classifier, but complex non-linear classifiers can be built by combining these neurons
into a network.
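The weighted-sum behaviour of a single MLP neuron can be sketched in a few lines. This is a minimal illustration; the weights, bias, and threshold activation below are illustrative values, not taken from the text.

```python
# A single MLP-style neuron: weighted sum of inputs, then a hard threshold.
# Weights and bias are illustrative, not from the text.

def neuron(inputs, weights, bias):
    # Each input value is multiplied by a coefficient and the results summed.
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    # A hard threshold turns the neuron into a simple linear classifier.
    return 1 if s > 0 else 0

# Example: a neuron that fires when x0 + x1 exceeds 1.
print(neuron([0.8, 0.5], [1.0, 1.0], -1.0))  # sum = 0.3 > 0 -> 1
print(neuron([0.2, 0.3], [1.0, 1.0], -1.0))  # sum = -0.5 -> 0
```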
The RBFN approach is more intuitive than the MLP. An RBFN performs classification by
measuring the input's similarity to examples from the training set. Each RBFN neuron
stores a prototype, which is just one of the examples from the training set. When we want
to classify a new input, each neuron computes the Euclidean distance between the input and
its prototype. Roughly speaking, if the input more closely resembles the class A prototypes
than the class B prototypes, it is classified as class A.
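The prototype-distance idea above can be sketched as follows. This is a minimal sketch assuming a Gaussian activation over Euclidean distance and summed per-class responses; the prototypes and the `beta` width parameter are illustrative, not from the text.

```python
import math

# Hedged sketch of RBFN classification: each neuron stores a training
# example as its prototype and responds with a Gaussian of the Euclidean
# distance to the input. The class whose prototypes respond most strongly
# wins. Prototypes and beta are illustrative values.

def rbf_activation(x, prototype, beta=1.0):
    dist_sq = sum((a - b) ** 2 for a, b in zip(x, prototype))
    return math.exp(-beta * dist_sq)

def classify(x, prototypes):
    # prototypes: list of (vector, class_label) training examples
    scores = {}
    for proto, label in prototypes:
        scores[label] = scores.get(label, 0.0) + rbf_activation(x, proto)
    return max(scores, key=scores.get)

protos = [((0.0, 0.0), "A"), ((0.2, 0.1), "A"), ((1.0, 1.0), "B")]
print(classify((0.1, 0.1), protos))  # closer to class A prototypes -> "A"
```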
2. RBF NETWORK ARCHITECTURE:
The above illustration shows the typical architecture of an RBF Network. It consists of an
input vector, a layer of RBF neurons, and an output layer with one node per category or
class of data.
2.1 The Input Vector:
The input vector is the n-dimensional vector that you are trying to classify. The entire input
vector is shown to each of the RBF neurons.
It should also be noted that all three distance measures are only valid for continuous
variables. For categorical variables, the Hamming distance must be used.
Choosing the optimal value for K is best done by first inspecting the data. In general, a large
K value is more precise as it reduces the overall noise, but there is no guarantee. Cross-validation is another way to retrospectively determine a good K value, by using an
independent dataset to validate it.
Example:
Consider the following data concerning credit default. Age and Loan are two numerical
variables (predictors) and Default is the target.
We can now use the training set to classify an unknown case (Age=48 and Loan=$142,000)
using Euclidean distance. If K=1 then the nearest neighbor is the last case in the training set
with Default=Y.
D = Sqrt[(48-33)^2 + (142000-150000)^2] = 8000.01 >> Default=Y
With K=3, there are two Default=Y and one Default=N out of three closest neighbors. The
prediction for the unknown case is again Default=Y.
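The worked example above can be reproduced in code. Note that the full training table is not shown in this excerpt, so the dataset below is illustrative: it includes the last case mentioned in the text (Age=33, Loan=$150,000, Default=Y) and is chosen so the K=1 and K=3 outcomes match the example.

```python
import math

# Illustrative training data (the original table is not reproduced in the
# text); the last case (33, 150000, "Y") matches the worked example.
train = [
    ((60, 100000), "Y"),
    ((35, 120000), "N"),
    ((33, 150000), "Y"),
    ((45, 200000), "N"),
]
query = (48, 142000)  # the unknown case: Age=48, Loan=$142,000

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Sort training cases by distance to the query.
neighbors = sorted(train, key=lambda t: euclidean(query, t[0]))

# K=1: the single nearest case, at distance ~8000.01, has Default=Y.
print(round(euclidean(query, neighbors[0][0]), 2))  # 8000.01
print(neighbors[0][1])                              # Y

# K=3: majority vote over the three closest neighbors (two Y, one N).
k3 = [label for _, label in neighbors[:3]]
print(max(set(k3), key=k3.count))                   # Y
```

Because the Loan values are orders of magnitude larger than Age, the distance is dominated by Loan; in practice the predictors would be normalized first.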
In the above picture you can see that there exist multiple lines that offer a solution to the
problem. Is any of them better than the others? We can intuitively define a criterion to
estimate the worth of the lines:
A line is bad if it passes too close to the points, because it will be sensitive to noise and will
not generalize correctly. Therefore, our goal should be to find the line passing as far as
possible from all points.
The operation of the SVM algorithm is therefore based on finding the hyperplane that gives
the largest minimum distance to the training examples. Twice this distance receives the
important name of margin within SVM theory. Therefore, the optimal separating
hyperplane maximizes the margin of the training data.
The decision boundary should be as far away from the data of both classes as possible.
The decision boundary can be found by solving the following constrained optimization
problem:
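The specific equation is not reproduced in this excerpt; the standard hard-margin formulation, consistent with the margin discussion above, is:

```latex
% Standard hard-margin SVM primal problem (the equation referenced in the
% text is not shown in this excerpt; this is the conventional form).
\min_{\mathbf{w},\, b} \; \frac{1}{2}\|\mathbf{w}\|^2
\quad \text{subject to} \quad
y_i\,(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1, \qquad i = 1, \dots, N
```

Since the margin equals $2/\|\mathbf{w}\|$, minimizing $\|\mathbf{w}\|^2$ subject to correct classification of every training example maximizes the margin.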
DECISION TREES
1. INTRODUCTION:
A decision tree is a simple representation for classifying examples. Decision tree learning is
one of the most successful techniques for supervised classification learning. Each element of
the domain of the classification is called a class.
A decision tree or a classification tree is a tree in which each internal (non-leaf) node is
labeled with an input feature. The arcs coming from a node labeled with a feature are
labeled with each of the possible values of the feature. Each leaf of the tree is labeled with a
class or a probability distribution over the classes.
To classify an example, filter it down the tree, as follows:
Example:
The figure above shows two possible decision trees. Each decision tree can be used to classify
according to the user's action. To classify a new example using the tree on the left, first
determine the length. If it is long, predict skips. Otherwise, check the thread. If the thread is
new, predict reads. Otherwise, check the author and predict read only if the author is
known.
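The walkthrough of the left tree maps directly onto nested if/else tests. This sketch follows the text's description; the string feature values ("long", "new", "known") are illustrative encodings.

```python
# Hedged sketch of the left-hand tree described above, written as nested
# conditionals. Feature values are illustrative string encodings.

def classify(length, thread, author):
    if length == "long":
        return "skips"   # long articles are skipped
    if thread == "new":
        return "reads"   # short + new thread -> reads
    if author == "known":
        return "reads"   # short + old thread + known author -> reads
    return "skips"       # otherwise, skip

print(classify("long", "old", "unknown"))   # skips
print(classify("short", "new", "unknown"))  # reads
print(classify("short", "old", "known"))    # reads
print(classify("short", "old", "unknown"))  # skips
```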
The tree on the right makes probabilistic predictions when the length is short. In this case, it
predicts reads with probability 0.82 and so skips with probability 0.18.
A deterministic decision tree, in which all of the leaves are classes, can be mapped into a set
of rules, with each leaf of the tree corresponding to a rule. An example has the
classification at the leaf if all of the conditions on the path from the root to the leaf are true.
For instance, one rule from the left tree above is: if the length is short and the thread is new,
then predict reads.
2. ISSUES IN LEARNING DECISION TREES: