Lecture slides for textbook Machine Learning, T. Mitchell, McGraw Hill, 1997
Learning Disjunctive Sets of Rules
Sequential Covering Algorithm
Sequential-covering(Target_attribute, Attributes, Examples, Threshold)
  Learned_rules ← {}
  Rule ← learn-one-rule(Target_attribute, Attributes, Examples)
  while performance(Rule, Examples) > Threshold, do
    - Learned_rules ← Learned_rules + Rule
    - Examples ← Examples − {examples correctly classified by Rule}
    - Rule ← learn-one-rule(Target_attribute, Attributes, Examples)
  Learned_rules ← sort Learned_rules according to performance over Examples
  return Learned_rules
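The loop above can be sketched concretely. This toy version, with a dict-based Rule class and a one-step greedy learn-one-rule, is an illustrative assumption rather than Mitchell's exact procedure:

```python
class Rule:
    def __init__(self, conditions, label):
        self.conditions = conditions  # dict: attribute -> required value
        self.label = label            # predicted class value

    def covers(self, example):
        return all(example.get(a) == v for a, v in self.conditions.items())

def performance(rule, examples, target):
    """Accuracy of the rule over the examples it covers."""
    covered = [e for e in examples if rule.covers(e)]
    if not covered:
        return 0.0
    return sum(e[target] == rule.label for e in covered) / len(covered)

def learn_one_rule(target, attributes, examples, label):
    """Greedy one-step specialization: pick the single best attribute test."""
    best = Rule({}, label)
    for attr in attributes:
        for value in {e[attr] for e in examples}:
            cand = Rule({attr: value}, label)
            if performance(cand, examples, target) > performance(best, examples, target):
                best = cand
    return best

def sequential_covering(target, attributes, examples, threshold, label):
    learned = []
    rule = learn_one_rule(target, attributes, examples, label)
    while examples and performance(rule, examples, target) > threshold:
        learned.append(rule)
        # Remove the examples the new rule classifies correctly.
        examples = [e for e in examples
                    if not (rule.covers(e) and e[target] == rule.label)]
        if not examples:
            break
        rule = learn_one_rule(target, attributes, examples, label)
    return learned
```

On a tiny PlayTennis-style dataset this learns a single rule (Humidity=normal → yes) and stops once the remaining examples offer no rule above threshold.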
Learn-One-Rule
[Figure: greedy general-to-specific search in Learn-One-Rule. The search starts from the most general rule and specializes it one precondition at a time:
  IF THEN PlayTennis=yes
  IF Wind=weak THEN PlayTennis=yes
  IF Wind=strong THEN PlayTennis=no
  IF Humidity=high THEN PlayTennis=no
  IF Humidity=normal THEN PlayTennis=yes
  IF Humidity=normal ∧ Wind=weak THEN PlayTennis=yes
  IF Humidity=normal ∧ Wind=strong THEN PlayTennis=yes
  IF Humidity=normal ∧ Outlook=sunny THEN PlayTennis=yes
  IF Humidity=normal ∧ Outlook=rain THEN PlayTennis=yes]
Learn-One-Rule
Pos ← positive Examples
Neg ← negative Examples
while Pos, do
  Learn a NewRule
  - NewRule ← most general rule possible
  - NewRuleNeg ← Neg
  - while NewRuleNeg, do
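The inner loop (specializing NewRule until it covers no negative examples) can be sketched as follows; the `covers` helper and the candidate-test generation over attribute-value pairs are assumptions for illustration:

```python
def specialize_until_pure(conditions, pos, neg, attributes):
    """Greedily add one (attribute, value) test at a time to the rule's
    preconditions until it covers no remaining negative examples."""
    def covers(conds, e):
        return all(e.get(a) == v for a, v in conds.items())

    neg_covered = [e for e in neg if covers(conditions, e)]
    while neg_covered:
        best, best_score = None, -1.0
        for attr in attributes:
            if attr in conditions:
                continue  # each attribute tested at most once
            for value in {e[attr] for e in pos}:
                cand = {**conditions, attr: value}
                p = sum(covers(cand, e) for e in pos)
                n = sum(covers(cand, e) for e in neg_covered)
                score = p / (p + n) if p + n else 0.0
                if score > best_score:
                    best, best_score = cand, score
        if best is None:
            break  # no further specialization possible
        conditions = best
        neg_covered = [e for e in neg_covered if covers(conditions, e)]
    return conditions
```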
Subtleties: Learn One Rule
Variants of Rule Learning Programs
Learning First Order Rules
Why do that?
Can learn sets of rules such as
  Ancestor(x, y) ← Parent(x, y)
  Ancestor(x, y) ← Parent(x, z) ∧ Ancestor(z, y)
General-purpose programming language
Prolog: programs are sets of such rules
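As a sanity check, the two Ancestor rules can be evaluated by naive forward chaining to a fixpoint; the Parent facts below are invented for illustration:

```python
def ancestors(parent_pairs):
    """Least fixpoint of:  Ancestor(x,y) <- Parent(x,y)
                           Ancestor(x,y) <- Parent(x,z) ∧ Ancestor(z,y)"""
    anc = set(parent_pairs)          # base case: every parent is an ancestor
    changed = True
    while changed:
        changed = False
        for (x, z) in parent_pairs:
            for (z2, y) in list(anc):
                if z == z2 and (x, y) not in anc:
                    anc.add((x, y))  # recursive case fires
                    changed = True
    return anc
```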
First Order Rule for Classifying Web Pages
[Slattery, 1997]
course(A) ←
  has-word(A, instructor),
  NOT has-word(A, good),
  link-from(A, B),
  has-word(B, assign),
  NOT link-from(B, C)
Train: 31/31, Test: 31/34
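The rule can be checked mechanically against a toy page graph. The dict-based page representation below (word sets plus outgoing links) is an assumption for illustration, not Slattery's actual data format:

```python
def is_course(a, pages):
    """course(A) <- has-word(A,instructor), NOT has-word(A,good),
       link-from(A,B), has-word(B,assign), NOT link-from(B,C)."""
    if "instructor" not in pages[a]["words"] or "good" in pages[a]["words"]:
        return False
    for b in pages[a]["links"]:
        # B must mention "assign" and have no outgoing links at all
        # (NOT link-from(B, C) with C unbound means B links nowhere).
        if "assign" in pages[b]["words"] and not pages[b]["links"]:
            return True
    return False
```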
FOIL(Target_predicate, Predicates, Examples)
  Pos ← positive Examples
  Neg ← negative Examples
  while Pos, do
    Learn a NewRule
    - NewRule ← most general rule possible
    - NewRuleNeg ← Neg
    - while NewRuleNeg, do
Specializing Rules in FOIL
Information Gain in FOIL
Foil_Gain(L, R) ≡ t ( log2(p1 / (p1 + n1)) − log2(p0 / (p0 + n0)) )

Where
  L is the candidate literal to add to rule R
  p0 = number of positive bindings of R
  n0 = number of negative bindings of R
  p1 = number of positive bindings of R + L
  n1 = number of negative bindings of R + L
  t is the number of positive bindings of R also covered by R + L

Note that −log2(p0 / (p0 + n0)) is the optimal number of bits to indicate the class of a positive binding covered by R.
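The gain formula translates directly into code. Treating the degenerate cases p0 = 0 or p1 = 0 as zero gain is an assumption made here to keep the sketch total:

```python
import math

def foil_gain(t, p0, n0, p1, n1):
    """Foil_Gain(L, R) = t * (log2(p1/(p1+n1)) - log2(p0/(p0+n0))).
    t: positive bindings of R still covered after adding literal L."""
    if p0 == 0 or p1 == 0:
        return 0.0  # assumption: gain undefined -> treat as 0
    return t * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))
```

For example, if a literal removes all 4 negative bindings while keeping all 4 positive ones, the gain is 4 bits.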
Induction as Inverted Deduction
"Induction is, in fact, the inverse operation of deduction, and cannot be conceived to exist without the corresponding operation, so that the question of relative importance cannot arise. Who thinks of asking whether addition or subtraction is the more important process in arithmetic? But at the same time much difference in difficulty may exist between a direct and inverse operation; ... it must be allowed that inductive investigations are of a far higher degree of difficulty and complexity than any questions of deduction ..."
(Jevons 1874)
Induction as Inverted Deduction
Positives:
Subsumes earlier idea of finding h that "fits" training data
Domain theory B helps define meaning of "fitting" the data:
  B ∧ h ∧ xi ⊢ f(xi)
Suggests algorithms that search H guided by B
Negatives:
Doesn't allow for noisy data. Consider
  (∀⟨xi, f(xi)⟩ ∈ D) (B ∧ h ∧ xi) ⊢ f(xi)
First-order logic gives a huge hypothesis space H
→ overfitting...
→ intractability of calculating all acceptable h's
Deduction: Resolution Rule
  P ∨ L
  ¬L ∨ R
  ──────
  P ∨ R

1. Given initial clauses C1 and C2, find a literal L from clause C1 such that ¬L occurs in clause C2.
2. Form the resolvent C by including all literals from C1 and C2, except for L and ¬L. More precisely, the set of literals occurring in the conclusion C is
     C = (C1 − {L}) ∪ (C2 − {¬L})
   where ∪ denotes set union, and "−" denotes set difference.
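Step 2's set construction implements directly. Clauses here are frozensets of string literals with "~" marking negation (a representation choice, not from the text), and KnowMaterial is an assumed intermediate literal for the PassExam/Study example that follows:

```python
def negate(lit):
    """Flip the '~' negation marker on a string literal."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Return all resolvents C = (C1 - {L}) ∪ (C2 - {¬L})
    for each literal L in C1 whose negation occurs in C2."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return out
```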
Inverting Resolution
[Figure: a resolution step and its inversion, with resolvent C: PassExam ∨ ¬Study at the root of each tree.]
Inverted Resolution (Propositional)
First-order resolution
Inverting first-order resolution
Cigol
[Figure: Cigol inverse resolution trace, applying the unifying substitutions {Bob/y, Tom/z} and {Shannon/x} at successive steps.]
Progol