Chapter 11
Machine Learning

What is learning?
That is what learning is. You suddenly understand something you've understood all your life, but in a new way.
(Doris Lessing, 2007 Nobel Prize in Literature)
Machine Learning
How to construct programs that automatically improve with experience.
Learning problem:
Task T
Performance measure P
Training experience E
Machine Learning
Chess game:
Task T: playing chess games
Performance measure P: percent of games won against opponents
Machine Learning
Handwriting recognition:
Task T: recognizing and classifying handwritten words
[Figure: the final design of the chess learning system. The Experiment Generator proposes a new problem (an initial board) to the Performance System; the Performance System produces a solution trace (the game history); the Critic extracts training examples {(b1, V1), (b2, V2), ...}; the Generalizer outputs a hypothesis V^ that is fed back to the Experiment Generator.]
Example
Experience

Example | Sky   | AirTemp | Humidity | Wind   | Water | Forecast | EnjoySport
1       | Sunny | Warm    | Normal   | Strong | Warm  | Same     | Yes
2       | Sunny | Warm    | High     | Strong | Warm  | Same     | Yes
3       | Rainy | Cold    | High     | Strong | Warm  | Change   | No
4       | Sunny | Warm    | High     | Strong | Cool  | Change   | Yes

Prediction (new instances):
5       | Rainy | Cold    | High     | Strong | Warm  | Change   | ?
6       | Sunny | Warm    | Normal   | Strong | Warm  | Same     | ?
7       | Sunny | Warm    | Low      | Strong | Cool  | Same     | ?
Example
Learning problem:
Task T: classifying days on which my friend enjoys water sport
Concept Learning
Inferring a boolean-valued function from training examples of its input (instances) and output (classifications).
Concept Learning
Learning problem:
Target concept: a subset of the set of instances X
c: X → {0, 1}
Target function:
Sky × AirTemp × Humidity × Wind × Water × Forecast → {Yes, No}
Hypothesis:
Characteristics of all instances of the concept to be learned
Constraints on instance attributes
h: X → {0, 1}
Concept Learning
Satisfaction:
h(x) = 1 if x satisfies all the constraints of h
h(x) = 0 otherwise
Consistency:
h(x) = c(x) for every instance x of the training examples
Correctness:
h(x) = c(x) for every instance x of X
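These two tests are mechanical. Below is a minimal Python sketch (not from the slides) for conjunctive hypotheses such as <Sunny, Warm, ?, Strong, ?, ?>; the tuple encoding and the names satisfies/consistent are our choices.

EMPTY = "∅"   # the "no value is acceptable" constraint

def satisfies(h, x):
    # h(x) = 1 iff every attribute constraint of h is met by instance x:
    # "?" accepts anything, EMPTY accepts nothing, a value accepts only itself.
    return all(c == "?" or (c != EMPTY and c == v) for c, v in zip(h, x))

def consistent(h, examples):
    # h is consistent iff h(x) = c(x) for every training example (x, c(x)).
    return all(satisfies(h, x) == label for x, label in examples)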
Concept Learning
How to represent a hypothesis function?
Concept Learning
Hypothesis representation (constraints on instance attributes):
<Sky, AirTemp, Humidity, Wind, Water, Forecast>
Each attribute constraint is either a specific value (e.g., Warm), "?" (any value is acceptable), or "∅" (no value is acceptable).
Concept Learning
General-to-specific ordering of hypotheses:
hj ≥g hk iff ∀x ∈ X: hk(x) = 1 ⇒ hj(x) = 1
This ordering gives H the structure of a lattice (partial order), from specific to general. For example:
h1 = <Sunny, ?, ?, Strong, ?, ?>
h2 = <Sunny, ?, ?, ?, ?, ?>
h3 = <Sunny, ?, ?, ?, Cool, ?>
h2 is more general than both h1 and h3, while h1 and h3 are incomparable.
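The ordering can be tested attribute by attribute: hj ≥g hk exactly when every constraint of hj accepts at least the values that the corresponding constraint of hk accepts. A sketch, continuing the Python encoding above (it reports h2 ≥g h1 and h2 ≥g h3, but neither h1 ≥g h3 nor h3 ≥g h1):

def more_general_or_equal(hj, hk):
    # hj >=g hk iff every instance that satisfies hk also satisfies hj.
    def covers(cj, ck):
        return cj == "?" or ck == EMPTY or cj == ck != EMPTY

    return all(covers(cj, ck) for cj, ck in zip(hj, hk))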
FIND-S

Example | Sky   | AirTemp | Humidity | Wind   | Water | Forecast | EnjoySport
1       | Sunny | Warm    | Normal   | Strong | Warm  | Same     | Yes
2       | Sunny | Warm    | High     | Strong | Warm  | Same     | Yes
3       | Rainy | Cold    | High     | Strong | Warm  | Change   | No
4       | Sunny | Warm    | High     | Strong | Cool  | Change   | Yes

h = <Sunny, Warm, ?, Strong, Warm, Same> (after the first two positive examples)
FIND-S
Initialize h to the most specific hypothesis in H: h ← <∅, ∅, ∅, ∅, ∅, ∅>
For each positive training instance x:
    For each attribute constraint ai in h:
        If the constraint is not satisfied by x,
        then replace ai by the next more general constraint satisfied by x
Output hypothesis h
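A runnable sketch of FIND-S under the encoding above (satisfies and EMPTY come from the earlier snippets; the data is the EnjoySport table):

def find_s(examples, n_attrs):
    h = [EMPTY] * n_attrs                # most specific hypothesis in H
    for x, label in examples:
        if not label:                    # FIND-S ignores negative examples
            continue
        for i, v in enumerate(x):
            if h[i] == EMPTY:
                h[i] = v                 # first positive example: adopt its values
            elif h[i] != v:
                h[i] = "?"               # generalize a violated constraint
    return tuple(h)

train = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
print(find_s(train, 6))    # ('Sunny', 'Warm', '?', 'Strong', '?', '?')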
FIND-S

Example | Sky   | AirTemp | Humidity | Wind   | Water | Forecast | EnjoySport
1       | Sunny | Warm    | Normal   | Strong | Warm  | Same     | Yes
2       | Sunny | Warm    | High     | Strong | Warm  | Same     | Yes
3       | Rainy | Cold    | High     | Strong | Warm  | Change   | No
4       | Sunny | Warm    | High     | Strong | Cool  | Change   | Yes

h = <Sunny, Warm, ?, Strong, ?, ?>

Prediction (using h):
5       | Rainy | Cold    | High     | Strong | Warm  | Change   | No
6       | Sunny | Warm    | Normal   | Strong | Warm  | Same     | Yes
7       | Sunny | Warm    | Low      | Strong | Cool  | Same     | Yes
FIND-S
The output hypothesis is the most specific one that satisfies all positive training examples.
FIND-S
The result is consistent with the positive training examples.
FIND-S
Is the result consistent with the negative training examples?
FIND-S

Example | Sky   | AirTemp | Humidity | Wind   | Water | Forecast | EnjoySport
1       | Sunny | Warm    | Normal   | Strong | Warm  | Same     | Yes
2       | Sunny | Warm    | High     | Strong | Warm  | Same     | Yes
3       | Rainy | Cold    | High     | Strong | Warm  | Change   | No
4       | Sunny | Warm    | High     | Strong | Cool  | Change   | Yes
5       | Sunny | Warm    | Normal   | Strong | Cool  | Change   | No

h = <Sunny, Warm, ?, Strong, ?, ?> classifies example 5 as positive even though it is labeled negative, so it is not consistent with this negative example.
FIND-S
The result is consistent with the negative training examples if the target concept is contained in H (and the training examples are correct).
FIND-S
Sizes of the spaces:
Size of the instance space: |X| = 3·2·2·2·2·2 = 96
Size of the concept space: |C| = 2^|X| = 2^96
Size of the hypothesis space: |H| = (4·3·3·3·3·3) + 1 = 973 ≪ 2^96
The target concept (in C) may not be contained in H.
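The counts follow directly from the attribute domains (Sky has 3 values, the other five attributes 2 each; a hypothesis constraint adds "?" and "∅" as options, and every hypothesis containing "∅" classifies all instances negative, hence the single +1):

x_size = 3 * 2 * 2 * 2 * 2 * 2        # |X| = 96 instances
c_size = 2 ** x_size                  # |C| = 2**96 concepts (subsets of X)
h_size = 4 * 3 * 3 * 3 * 3 * 3 + 1    # |H| = 973 semantically distinct hypotheses
print(x_size, h_size)                 # 96 973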
FIND-S
Questions:
Has the learner converged to the target concept, as there can be several hypotheses consistent with both the positive and negative training examples?
Why is the most specific hypothesis preferred?
What if there are several maximally specific consistent hypotheses?
What if the training examples are not correct?
List-then-Eliminate Algorithm
Version space: the set of all hypotheses that are consistent with the training examples.
Algorithm:
Initial version space = set containing every hypothesis in H
For each training example <x, c(x)>, remove from the version space any hypothesis h for which h(x) ≠ c(x)
Output the hypotheses in the version space
List-then-Eliminate Algorithm
Requires an exhaustive enumeration of all hypotheses in H.
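A sketch of that enumeration for EnjoySport, reusing consistent, EMPTY, and train from the snippets above. The attribute value sets are an assumption where the slides do not list them (Sky is the 3-valued attribute):

from itertools import product

DOMAINS = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"), ("Normal", "High"),
           ("Strong", "Weak"), ("Warm", "Cool"), ("Same", "Change")]

def all_hypotheses():
    # 4*3*3*3*3*3 = 972 hypotheses without EMPTY, plus the single all-EMPTY one.
    yield tuple([EMPTY] * len(DOMAINS))
    yield from product(*[("?",) + d for d in DOMAINS])

def list_then_eliminate(examples):
    # Keep every hypothesis that agrees with all training examples.
    return [h for h in all_hypotheses() if consistent(h, examples)]

print(len(list_then_eliminate(train)))    # 6 hypotheses survive the 4 examples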
Compact Representation of Version Space
G (the general boundary): the set of the most general hypotheses of H consistent with the training data D:
G = {g ∈ H | consistent(g, D) ∧ ¬∃g′ ∈ H: g′ >g g ∧ consistent(g′, D)}
S (the specific boundary): the set of the most specific hypotheses of H consistent with the training data D:
S = {s ∈ H | consistent(s, D) ∧ ¬∃s′ ∈ H: s >g s′ ∧ consistent(s′, D)}
Version space = <G, S> = {h ∈ H | ∃g ∈ G, ∃s ∈ S: g ≥g h ≥g s}
[Figure: the boundary sets G (top) and S (bottom), with the version space lying between them.]
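Membership in the version space can then be decided from the two boundary sets alone, per the formula above (a sketch reusing more_general_or_equal):

def in_version_space(h, G, S):
    # h is in the version space iff some g in G is at least as general as h
    # and h is at least as general as some s in S.
    return (any(more_general_or_equal(g, h) for g in G) and
            any(more_general_or_equal(h, s) for s in S))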
Candidate-Elimination Algorithm

Example | Sky   | AirTemp | Humidity | Wind   | Water | Forecast | EnjoySport
1       | Sunny | Warm    | Normal   | Strong | Warm  | Same     | Yes
2       | Sunny | Warm    | High     | Strong | Warm  | Same     | Yes
3       | Rainy | Cold    | High     | Strong | Warm  | Change   | No
4       | Sunny | Warm    | High     | Strong | Cool  | Change   | Yes

S0 = {<∅, ∅, ∅, ∅, ∅, ∅>}
G0 = {<?, ?, ?, ?, ?, ?>}
Candidate-Elimination Algorithm
S4 = {<Sunny, Warm, ?, Strong, ?, ?>}
G4 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}
Candidate-Elimination Algorithm
Initialize G to the set of maximally general hypotheses in H
Initialize S to the set of maximally specific hypotheses in H
Candidate-Elimination Algorithm
For each positive example d:
    Remove from G any hypothesis inconsistent with d
    For each s in S that is inconsistent with d:
        Remove s from S
        Add to S all least generalizations h of s, such that h is consistent with d and some hypothesis in G is more general than h
    Remove from S any hypothesis that is more general than another hypothesis in S
Candidate-Elimination Algorithm
For each negative example d:
    Remove from S any hypothesis inconsistent with d
    For each g in G that is inconsistent with d:
        Remove g from G
        Add to G all least specializations h of g, such that h is consistent with d and some hypothesis in S is more specific than h
    Remove from G any hypothesis that is more specific than another hypothesis in G
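A compact runnable sketch of both loops for the conjunctive language, reusing EMPTY, satisfies, more_general_or_equal, DOMAINS, and train from the earlier snippets. In this language S stays a singleton, so the final "remove more general than another in S" step is omitted:

def min_generalization(s, x):
    # least generalization of s that covers the positive instance x
    return tuple(v if c == EMPTY else (c if c == v else "?")
                 for c, v in zip(s, x))

def min_specializations(g, x, domains):
    # all least specializations of g that exclude the negative instance x
    return [g[:i] + (v,) + g[i + 1:]
            for i, c in enumerate(g) if c == "?"
            for v in domains[i] if v != x[i]]

def candidate_elimination(examples, domains):
    S = [tuple([EMPTY] * len(domains))]
    G = [tuple(["?"] * len(domains))]
    for x, label in examples:
        if label:                                  # positive example d
            G = [g for g in G if satisfies(g, x)]
            S = [s if satisfies(s, x) else min_generalization(s, x) for s in S]
            S = [s for s in S if any(more_general_or_equal(g, s) for g in G)]
        else:                                      # negative example d
            S = [s for s in S if not satisfies(s, x)]
            newG = []
            for g in G:
                if not satisfies(g, x):
                    newG.append(g)
                else:
                    newG += [h for h in min_specializations(g, x, domains)
                             if any(more_general_or_equal(h, s) for s in S)]
            # drop any member more specific than another member of G
            G = [g for g in newG
                 if not any(h != g and more_general_or_equal(h, g) for h in newG)]
    return S, G

S, G = candidate_elimination(train, DOMAINS)
print(S)   # [('Sunny', 'Warm', '?', 'Strong', '?', '?')]
print(G)   # [('Sunny', '?', '?', '?', '?', '?'), ('?', 'Warm', '?', '?', '?', '?')]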
Candidate-Elimination Algorithm
The version space will converge toward the correct target concept if:
H contains the correct target concept
There are no errors in the training examples
Candidate-Elimination Algorithm
A partially learned concept can be used to classify new instances using the majority rule: let the hypotheses in the version space vote.
S4 = {<Sunny, Warm, ?, Strong, ?, ?>}
G4 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}
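A sketch of the majority rule over the enumerated version space (reusing list_then_eliminate, satisfies, and train from above; the query instance is illustrative):

def classify_by_vote(x, version_space):
    votes = sum(satisfies(h, x) for h in version_space)
    if 2 * votes == len(version_space):
        return None                    # exact tie: leave x unclassified
    return 2 * votes > len(version_space)

vs = list_then_eliminate(train)
# all 6 hypotheses vote positive here, so the answer is unanimous:
print(classify_by_vote(("Sunny", "Warm", "Normal", "Strong", "Cool", "Change"), vs))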
Inductive Bias
Size of the instance space: |X| = 3·2·2·2·2·2 = 96
Number of possible concepts: |C| = 2^|X| = 2^96
Size of the hypothesis space: |H| = (4·3·3·3·3·3) + 1 = 973 ≪ 2^96
⇒ a biased hypothesis space
Inductive Bias
An unbiased hypothesis space H that can represent every subset of the instance space X: propositional logic sentences over the instances.
Positive examples: x1, x2, x3
Negative examples: x4, x5
h(x) ≡ (x = x1) ∨ (x = x2) ∨ (x = x3), written x1 ∨ x2 ∨ x3
h(x) ≡ (x ≠ x4) ∧ (x ≠ x5), written ¬x4 ∧ ¬x5
Inductive Bias
Both x1 ∨ x2 ∨ x3 and x1 ∨ x2 ∨ x3 ∨ x6 are consistent with the training examples (positives x1, x2, x3; negatives x4, x5), yet they disagree on the new instance x6.
Any new instance x is classified positive by half of the version space, and negative by the other half
⇒ not classifiable
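This can be checked exhaustively for a six-instance X: with every subset of X allowed as a hypothesis, the five labeled examples leave exactly two consistent hypotheses, and they split evenly on x6 (instance names as on the slide):

from itertools import combinations

X = ["x1", "x2", "x3", "x4", "x5", "x6"]
positives, negatives = {"x1", "x2", "x3"}, {"x4", "x5"}

# every subset of X is a candidate concept in the unbiased H
subsets = [set(c) for r in range(len(X) + 1) for c in combinations(X, r)]
vs = [h for h in subsets if positives <= h and not (h & negatives)]

print(len(vs))                        # 2 consistent hypotheses remain
print(sum("x6" in h for h in vs))     # 1 of them contains x6: a 50/50 split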
Inductive Bias

Example | Day | Actor    | Price     | EasyTicket
1       | Mon | Famous   | Expensive | Yes
2       | Sat | Famous   | Moderate  | No
3       | Sun | Infamous | Cheap     | No
4       | Wed | Infamous | Moderate  | Yes
5       | Sun | Famous   | Expensive | No
6       | Thu | Infamous | Cheap     | Yes
7       | Tue | Famous   | Expensive | Yes
8       | Sat | Famous   | Cheap     | No
9       | Wed | Famous   | Cheap     | ?
Inductive Bias

Example | Quality | Price | Buy
1       | Good    | Low   | Yes
2       | Bad     | High  | No
3       | Good    | High  | ?
4       | Bad     | Low   | ?
Inductive Bias
A learner that makes no prior assumptions regarding the identity of the target concept cannot classify any unseen instances.
Homework
Exercises