

Machine Learning:
General Methodology

by
Pascual Campoy
Computer Vision Group
Technical University Madrid


contents

Objectives
Supervised and unsupervised learning
Learning challenges
Building machine learning models
Errors and validation



Learning objectives
Find a model that predicts the right output for a new input, given previously observed inputs and outputs



Supervised learning
Supervised learning concept

Working structure: the inputs (x1, ..., xn) are mapped by the model to the outputs (y1, ..., ym), which are compared with the desired outputs (yd1, ..., ydm) in the feature space.
Rn → Rm function generalization
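As an illustration (not from the slides), a minimal MATLAB sketch of this idea for a one-dimensional case: a model is fitted to known input/output pairs and then used to predict the output for a new, unseen input. The sine data and the polynomial model are invented for the example.

% Supervised learning as function generalization (toy 1-D example)
x  = linspace(0, 2*pi, 30);           % known inputs x1..xn
yd = sin(x) + 0.1*randn(size(x));     % desired (noisy) outputs yd1..ydn
p  = polyfit(x, yd, 5);               % fit the model parameters to the samples
x_new = 1.3;                          % a new, unseen input
y_new = polyval(p, x_new);            % model output predicted for the new input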



Unsupervised learning
Unsupervised learning concept

Working structure: the inputs (x1, ..., xn) are mapped to the outputs (y1, ..., ym) without any desired outputs; the model groups the samples in the feature space.
Clustering
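As a hedged illustration (not from the slides), a minimal MATLAB sketch of the clustering idea, assuming the Statistics Toolbox function kmeans is available; the two-group data are invented for the example.

% Unsupervised learning: group unlabelled samples in feature space
X = [randn(50, 2); randn(50, 2) + 4];   % unlabelled samples (two hidden groups)
k = 2;                                  % number of clusters to search for
[idx, centres] = kmeans(X, k);          % idx(i): cluster assigned to sample i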

Learning challenges
Lack of learning samples
Presence of noise
(Figure: training samples in the x1-x2 feature space)



Learning challenges
Risk that the training data (with density p(x)) are uncorrelated with their relevance for the desired output y(x)
(Figure: sample density p(x) and desired output y(x) over the input x; samples in the x1-x2 feature space)


contents

Objectives
Supervised and unsupervised learning
Learning challenges
Building machine learning models
Errors and validation



Levels in building machine learning models
Model type selection (manual)
Model structure tuning (manual/automatic)
Parameter fitting (automatic)
(Diagram: the training samples and the training error drive the automatic parameter fitting, which produces the model)


Parameter fitting
Index optimization
Mean square error:
E = Σ_n Σ_k ( y_kn − yd_kn )²
Cost function:
C_j(x) = Σ_i C(w_j | w_i) · P(w_i | x)
Trade-off:
optimization degree vs. computing time
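A minimal MATLAB sketch of the mean square error index above, assuming the model outputs are stored in a matrix ym and the desired outputs in yd, both of size (number of outputs k) x (number of samples n); these variable names are illustrative.

% Mean square error over all outputs k and all samples n
% ym: model outputs (k x n), yd: desired outputs (k x n)
E = sum(sum((ym - yd).^2));       % index to be minimised during parameter fitting
E_mean = E / size(yd, 2);         % optional normalisation by the number of samples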



Model structure
Model type selection (manual)
Model structure tuning (manual/automatic)
Parameter fitting (automatic)
(Diagram: the cross samples and the cross-validation error drive the model structure tuning, which produces the model)


Model structure evaluation
(Diagram: the input universe is split into a training set, used to build the model, and a cross set or validation set, used to evaluate it)
Cross validation:
Choose V samples for training and repeat it to obtain up to N-choose-V different models that are validated with the remaining N−V samples
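A possible MATLAB sketch of this cross-validation scheme (an assumption, not the course code): samples stored column-wise in X with labels in y are repeatedly split into V training samples and N−V validation samples, and the validation error is averaged over the splits. The variable names are invented; classify is used as in the assignment below.

% Repeated random train/validation splits (cross-validation sketch)
% X: d x N feature matrix, y: 1 x N class labels (assumed names)
N = size(X, 2);  V = 200;  n_rep = 20;
for r = 1:n_rep
    ind   = randperm(N);
    i_tr  = ind(1:V);                    % V samples used for training
    i_val = ind(V+1:end);                % remaining N-V samples used for validation
    y_hat = classify(X(:, i_val)', X(:, i_tr)', y(i_tr)');
    err(r) = mean(y_hat' ~= y(i_val));   % validation error of this split
end
cv_error = mean(err);                    % averaged cross-validation error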


Model types
Model type selection (manual)
Model structure tuning (manual/automatic)
Parameter fitting (automatic)
(Diagram: the test error guides the manual model type selection, which produces the final MODEL)


Model selection
Mathematical models for data fitting:
polynomials, splines, B-splines, ..., PCA
Statistical models for data prediction:
ARX, ARMAX, Markov, Box-Jenkins, ..., Bayes classifier
Neural network models:
MLP, RBF, SOM, ART, ...


contents

Objectives
Supervised and unsupervised learning
Learning challenges
Building machine learning models
Errors and validation


Influence of the number of training samples
Training error vs. test error
(Figure: error vs. number of training samples; as the number of samples grows, the test error decreases and the training error increases, both converging towards the Bayesian error)




Influence of the d.o.f. of the model
Undertraining
(Figure: a model with too few degrees of freedom underfits the samples in the x1-x2 feature space; the test error stays high)


Influence of the d.o.f. of the model
Overtraining
(Figure: a model with too many degrees of freedom overfits the training samples; the test error rises again)



Influence of the d.o.f. of the model
Right training
(Figure: an intermediate number of degrees of freedom gives the minimum test error)
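As an illustration of these three regimes (not part of the slides), a MATLAB sketch that sweeps the degrees of freedom of a polynomial model and compares training and test error; the sine data and the degree range are assumptions.

% Effect of the model d.o.f.: sweep the polynomial degree
x_tr = linspace(0, 1, 20);   y_tr = sin(2*pi*x_tr) + 0.2*randn(1, 20);    % training set
x_te = linspace(0, 1, 200);  y_te = sin(2*pi*x_te) + 0.2*randn(1, 200);   % test set
for d = 1:12                                        % model degrees of freedom
    p = polyfit(x_tr, y_tr, d);                     % parameter fitting
    e_tr(d) = mean((polyval(p, x_tr) - y_tr).^2);   % training error
    e_te(d) = mean((polyval(p, x_te) - y_te).^2);   % test error
end
% low d: undertraining (both errors high); high d: overtraining (test error rises again)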



Assignment 4.1
Complete the following program in order to evaluate the classification
errors (both training and test) using a Bayesian classifier for an
increasing number of training samples. Draw the resulting figure.
clear; clf; load data_D2_C2;                % p: training data, t: test data
ns = 5:5:250;                               % number of training samples
for i = ns
  for j = 1:50                              % number of tries to average
    ind_rand = randperm(300); ind = ind_rand(1:i);            % pick i random training samples
    bayclass_train = classify(XXXX, XXXX, XXXX);              % classify the training samples
    error_train(j) = length(find(bayclass_train' ~= XXXX));   % training misclassifications
    bayclass_test = classify(XXXX, XXXX, XXXX);               % classify the test samples
    error_test(j) = length(find(bayclass_test' ~= XXXX));     % test misclassifications
  end
  error_train_v(i) = mean(error_train); error_test_v(i) = mean(error_test);
end


Result for assignment 4.1


plot(ns,error_train_v(ns),'b'); hold on; plot(ns,error_test_v(ns),'g');
hold off; legend('train error','test error'); xlabel('# of training samples');



Assignment 4.1 (solution)
clear; clf; load data_D2_C2;                % p: training data, t: test data
ns = 5:5:250;                               % number of training samples
for i = ns
  for j = 1:50                              % number of tries to average
    ind_rand = randperm(300); ind = ind_rand(1:i);                                % pick i random training samples
    bayclass_train = classify(p.value(:,ind)', p.value(:,ind)', p.class(:,ind));  % classify the training samples
    error_train(j) = length(find(bayclass_train' ~= p.class(:,ind)));             % training misclassifications
    bayclass_test = classify(t.value', p.value(:,ind)', p.class(:,ind));          % classify the test samples
    error_test(j) = length(find(bayclass_test' ~= t.class));                      % test misclassifications
  end
  error_train_v(i) = mean(error_train); error_test_v(i) = mean(error_test);
end
