Goal of COLT (Computational Learning Theory)
Inductively learn a target function, given:
only training examples of the target function
a space of candidate hypotheses
Error of a Hypothesis
The true error of a hypothesis h
with respect to target concept c and distribution D
is the probability that h will misclassify an instance drawn at
random according to D:
errorD(h) ≡ Pr x∈D [ c(x) ≠ h(x) ]
[Figure: instance space X, with the regions where c and h disagree shaded; the true error is the probability mass of those regions under D]
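The definition above can be approximated empirically when we can sample instances from D. A minimal sketch (the names `sample_D`, `h`, `c` and the toy threshold concepts are illustrative, not from the slides):

```python
import random

def estimate_true_error(h, c, sample_D, n_samples=100_000):
    """Monte Carlo estimate of errorD(h) = Pr x~D [c(x) != h(x)]."""
    mistakes = 0
    for _ in range(n_samples):
        x = sample_D()       # draw one instance according to D
        if h(x) != c(x):     # h misclassifies x
            mistakes += 1
    return mistakes / n_samples

# Toy example: instances are integers drawn uniformly from 0..99,
# the target concept labels x positive iff x < 50, and h uses the
# threshold 40, so c and h disagree on exactly 10 of 100 instances
# (true error 0.10).
random.seed(0)
c = lambda x: x < 50
h = lambda x: x < 40
est = estimate_true_error(h, c, lambda: random.randrange(100))
```

With enough samples the estimate concentrates around the true disagreement probability.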
PAC Learnable
Consider a concept class C defined over
a set of instances X of length n and
a learner L using hypothesis space H
C is PAC-learnable by L using H if
for all c ∈ C,
distributions D over X,
and ε, δ such that 0 < ε < 1/2 and 0 < δ < 1/2,
learner L will with probability at least (1 − δ)
output a hypothesis h ∈ H
such that errorD(h) ≤ ε,
in time that is polynomial in 1/ε, 1/δ, n, and size(c)
Do so efficiently
In time that grows at most polynomially with 1/ε and 1/δ,
which define the strength of our demands on the output
hypothesis,
and with n and size(c), which define the inherent complexity of the
underlying instance space X and concept class C
n is the size of instances in X
If instances are conjunctions of k Boolean variables, n = k
size(c) is the encoding length of c in C, e.g., the number of Boolean
features actually used to describe c
Version Space
Contains all plausible versions of the target concept
[Figure: hypothesis space H, with the version space shown as the subset of hypotheses h consistent with the training examples]
Version Space
A hypothesis h is consistent with training examples D iff
h(x) = c(x) for each example ⟨x, c(x)⟩ in D:
Consistent(h, D) ≡ (∀⟨x, c(x)⟩ ∈ D) h(x) = c(x)
The version space VSH,D is the subset of hypotheses from H
consistent with the training examples in D:
VSH,D ≡ {h ∈ H | Consistent(h, D)}
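For a finite hypothesis space, VSH,D can be computed directly by filtering H. A small sketch; enumerating H as truth tables over a 2-bit instance space is an illustrative construction, not from the slides:

```python
from itertools import product

def version_space(H, D):
    """VS_{H,D}: all hypotheses in H consistent with every labeled
    training example <x, c(x)> in D."""
    return [h for h in H if all(h(x) == label for x, label in D)]

# Instance space: the four 2-bit tuples.
X = list(product([0, 1], repeat=2))

# H: every Boolean function over X, each represented by its truth table.
H = [(lambda table: (lambda x: table[X.index(x)]))(tbl)
     for tbl in product([False, True], repeat=len(X))]

# Training data: the target labels an instance positive iff both bits are 1.
D = [((1, 1), True), ((0, 1), False)]
VS = version_space(H, D)
# Two of the four instances remain unconstrained, so |VS| = 2**2 = 4.
```

Each training example halves the candidate set here, illustrating how data shrinks the version space toward the target concept.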
Version Space
[Figure: hypothesis space H containing the version space VSH,D. Each hypothesis is annotated with its true error and its training error r: hypotheses inside VSH,D (error=.2, r=0 and error=.1, r=0) are consistent with the training data but may still have nonzero true error; hypotheses outside (error=.3, r=.1; error=.1, r=.2; error=.3, r=.4; error=.2, r=.3) misclassify at least one training example.]
ε-Exhausted Version Space
VSH,D is ε-exhausted with respect to c and D if every hypothesis h in VSH,D has errorD(h) < ε.
[Figure: the same hypothesis space with ε = 0.21. Every hypothesis in VSH,D (error=.2, r=0 and error=.1, r=0) has true error below ε, so the version space is ε-exhausted.]
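The ε-exhaustion condition can be checked mechanically over the (true error, training error) annotations in the figure. A sketch (the function name is illustrative):

```python
def eps_exhausted(hypotheses, eps):
    """VS_{H,D} is eps-exhausted iff every hypothesis consistent with
    the training data (training error r == 0) has true error below eps.
    hypotheses: list of (true_error, training_error) pairs."""
    return all(err < eps for err, r in hypotheses if r == 0)

# (true error, training error r) pairs from the figure:
annotated = [(.3, .1), (.1, .2), (.2, 0), (.3, .4), (.1, 0), (.2, .3)]
ok = eps_exhausted(annotated, 0.21)      # both r == 0 hypotheses have error < 0.21
not_ok = eps_exhausted(annotated, 0.15)  # error=.2, r=0 violates eps = 0.15
```

Only the consistent hypotheses matter: the ones with r > 0 are already excluded from the version space.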
How many examples suffice to ε-exhaust the version space?
If H is finite and D is a sequence of m ≥ 1 independent randomly drawn examples of target concept c, then for any 0 ≤ ε ≤ 1, the probability that VSH,D is not ε-exhausted is at most
|H| e^(−εm)
To ensure this probability of failure
is below some desired level δ:
|H| e^(−εm) ≤ δ
Rearranging:
m ≥ (1/ε) (ln|H| + ln(1/δ))
If the target concept is not necessarily in H and the learner outputs the hypothesis with minimum training error, the corresponding bound is
m ≥ (1/(2ε²)) (ln|H| + ln(1/δ))
so m grows as the square of 1/ε rather than linearly
Called agnostic learning
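The two bounds above can be evaluated directly. A sketch; the numeric values |H| = 973, ε = 0.1, δ = 0.05 are illustrative, not from the slides:

```python
from math import ceil, log

def m_realizable(H_size, eps, delta):
    """Consistent-learner bound: m >= (1/eps)(ln|H| + ln(1/delta))."""
    return ceil((log(H_size) + log(1 / delta)) / eps)

def m_agnostic(H_size, eps, delta):
    """Agnostic bound: m >= (1/(2 eps^2))(ln|H| + ln(1/delta))."""
    return ceil((log(H_size) + log(1 / delta)) / (2 * eps ** 2))

m1 = m_realizable(973, eps=0.1, delta=0.05)
m2 = m_agnostic(973, eps=0.1, delta=0.05)
# The agnostic bound exceeds the realizable one by a factor of 1/(2*eps).
```

Both bounds depend on |H| only through ln|H|, so even large finite hypothesis spaces need modestly many examples.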
Example: conjunctions of Boolean literals are PAC-learnable. With n Boolean variables, |H| = 3^n (each variable appears positively, negatively, or not at all), so
m ≥ (1/ε) (n ln 3 + ln(1/δ))
Example: k-term DNF formulas over n variables have |H| ≤ 3^(nk), giving
m ≥ (1/ε) (nk ln 3 + ln(1/δ))
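Plugging |H| = 3^n and |H| ≤ 3^(nk) into the general bound gives concrete sample sizes. A sketch with illustrative values n = 10, k = 3, ε = 0.1, δ = 0.05:

```python
from math import ceil, log

def m_conjunctions(n, eps, delta):
    """m >= (1/eps)(n ln 3 + ln(1/delta)), from |H| = 3^n."""
    return ceil((n * log(3) + log(1 / delta)) / eps)

def m_k_term_dnf(n, k, eps, delta):
    """m >= (1/eps)(n k ln 3 + ln(1/delta)), from |H| <= 3^(nk)."""
    return ceil((n * k * log(3) + log(1 / delta)) / eps)

m_conj = m_conjunctions(10, eps=0.1, delta=0.05)
m_dnf = m_k_term_dnf(10, 3, eps=0.1, delta=0.05)
```

The k-term DNF bound is only a factor of roughly k larger, because ln 3^(nk) = nk ln 3 grows linearly in k.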