
Deriving Classification Rules Using the Covering Approach

Chapter 4 - Part 2

Dr Fadi Fayez

Why Rules?
Trees
Big and busy; no part can be understood without reference to the whole, which can confuse users.

Rules
Small, independent chunks of knowledge; can be easier to explain.
e.g. if THIS then THAT; if ANTECEDENT then CONSEQUENCE; if LHS then RHS

Covering Algorithm
For building classification rules.
Separate-and-conquer approach: iteratively keep adding new rules, each covering as many instances of the class of interest (positive instances) as possible, while trying to use as few rules as possible.
Choose the attribute test that maximizes the probability of the desired classification, i.e. that gives the highest accuracy.
Accuracy is measured by p/t, where
p = number of positive examples of the class covered by the new rule
t = total number of instances covered by the new rule (the test being added)

Covering Algorithm (contd.)


Choose the attribute test with maximum p/t.
If two or more tests have equal p/t, choose the one with higher coverage, i.e. the one with greater p.
Instances covered by a newly accepted rule do not need to be considered in the next iteration.
Example: the Prism algorithm (handles no numeric attributes).
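The selection step can be sketched in a few lines of Python (an illustration only; the dictionary-based instance format, the "class" key, and the helper names accuracy and best_test are assumptions, not part of the Prism specification):

def accuracy(instances, attribute, value, target_class):
    """Return (p, t) for the candidate test attribute = value."""
    covered = [inst for inst in instances if inst[attribute] == value]
    t = len(covered)
    p = sum(1 for inst in covered if inst["class"] == target_class)
    return p, t

def best_test(instances, candidate_tests, target_class):
    """Pick the test with maximum p/t, breaking ties by larger p (coverage)."""
    best, best_key = None, (-1.0, -1)
    for attribute, value in candidate_tests:
        p, t = accuracy(instances, attribute, value, target_class)
        if t == 0:
            continue  # test covers no instances at all
        key = (p / t, p)  # accuracy first, coverage as tie-breaker
        if key > best_key:
            best, best_key = (attribute, value), key
    return best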

An Example of the Covering Algorithm

Step 1: If x > 1.2 then class = a.
Step 2: If x > 1.2 and y > 2.6 then class = a.

Continue to Derive More Comprehensive Rules


The rule "if x > 1.2 and y > 2.6, then class = a" covers all a's but one. A new rule "if x > 1.4 and y < 2.4, then class = a" may be added so that all a's are covered.
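As an illustration of how such a rule list classifies instances (the (x, y) points below are hypothetical; the actual instances come from a figure in the textbook that is not reproduced here), a minimal sketch in Python:

def classify(x, y):
    # First rule derived in step 2
    if x > 1.2 and y > 2.6:
        return "a"
    # Additional rule covering the remaining 'a'
    if x > 1.4 and y < 2.4:
        return "a"
    # Everything not covered by either rule
    return "b"

print(classify(1.5, 3.0))  # "a": covered by the first rule
print(classify(1.5, 2.0))  # "a": covered by the second rule
print(classify(1.0, 2.0))  # "b": covered by neither rule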

Prism Algorithm
A simple covering algorithm developed by Cendrowska in 1987. Available in WEKA as weka.classifiers.rules.Prism.

Prism Pseudocode
For each class C
  Initialize E to the instance set
  While E contains instances in class C
    Create a rule R with an empty left-hand side that predicts class C
    Until R is perfect (or there are no more attributes to use) do
      For each attribute A not mentioned in R, and each value v,
        consider adding the condition A = v to the left-hand side of R
      Select A and v to maximize the accuracy p/t
        (break ties by choosing the condition with the largest p)
      Add A = v to R
    Remove the instances covered by R from E
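A minimal runnable rendering of this pseudocode in Python is given below. This is a sketch for illustration, not WEKA's weka.classifiers.rules.Prism; it assumes nominal attributes and instances represented as dictionaries with the class label stored under the key "class".

def prism(instances, attributes, classes):
    rules = []  # list of (conditions, class) pairs
    for c in classes:
        E = list(instances)  # initialize E to the instance set
        while any(inst["class"] == c for inst in E):
            conditions = {}  # rule R with an empty left-hand side
            covered = list(E)
            # refine R until it is perfect or no more attributes can be used
            while (any(inst["class"] != c for inst in covered)
                   and len(conditions) < len(attributes)):
                best, best_key = None, (-1.0, -1)
                for a in attributes:
                    if a in conditions:  # attribute already mentioned in R
                        continue
                    for v in {inst[a] for inst in covered}:
                        subset = [inst for inst in covered if inst[a] == v]
                        p = sum(1 for inst in subset if inst["class"] == c)
                        t = len(subset)
                        key = (p / t, p)  # maximize p/t, break ties on larger p
                        if key > best_key:
                            best, best_key = (a, v), key
                if best is None:
                    break
                a, v = best
                conditions[a] = v  # add A = v to R
                covered = [inst for inst in covered if inst[a] == v]
            rules.append((dict(conditions), c))
            # remove the instances covered by R from E
            E = [inst for inst in E
                 if not all(inst[a] == v for a, v in conditions.items())]
    return rules

The tuple key (p/t, p) implements the tie-breaking rule from the pseudocode: accuracy first, then coverage.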

Prism: Separate and Conquer Approach


Methods like PRISM (for dealing with one class) are separate-and-conquer algorithms:
First, a rule is identified.
Then, all instances covered by the rule are separated out.
Finally, the remaining instances are "conquered".

Difference from a decision tree's divide-and-conquer method: the subset covered by a rule doesn't need to be explored any further.

A More Comprehensive Example and the Prism Algorithm


Assume we want to derive a rule for recommendation = hard based on the following dataset.

[Table 1.1 on page 4 of the textbook goes here.]

The Candidate Tests and Their Accuracies


age = young: 2/8
age = pre-presbyopic: 1/8
age = presbyopic: 1/8
spectacle prescription = myope: 3/12
spectacle prescription = hypermetrope: 1/12
astigmatism = no: 0/12
astigmatism = yes: 4/12
tear production rate = reduced: 0/12
tear production rate = normal: 4/12

Among the 9 candidates, the following two have the highest accuracy:
astigmatism = yes: 4/12
tear production rate = normal: 4/12
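For reference, these accuracies could be computed as follows (a sketch; the 24 contact lens instances are assumed to be loaded as dictionaries with the class label under the key "recommendation", and are not reproduced here):

candidate_tests = [
    ("age", "young"), ("age", "pre-presbyopic"), ("age", "presbyopic"),
    ("spectacle prescription", "myope"),
    ("spectacle prescription", "hypermetrope"),
    ("astigmatism", "no"), ("astigmatism", "yes"),
    ("tear production rate", "reduced"), ("tear production rate", "normal"),
]

def rate(instances, attribute, value, target="hard"):
    """Return (p, t) for the test attribute = value against recommendation = hard."""
    covered = [i for i in instances if i[attribute] == value]
    p = sum(1 for i in covered if i["recommendation"] == target)
    return p, len(covered)

# With the dataset loaded into `instances`, printing rate(instances, a, v)
# for each (a, v) in candidate_tests reproduces the nine fractions above.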

The First Intermediate Rule


Assume that we break the tie by picking astigmatism = yes at random. Then, we have the first intermediate rule:
If astigmatism = yes, then recommendation = hard.

Now, consider the remaining possible tests in order to refine the rule.

Tests to Refine the Intermediate Rule


age = young: 2/4
age = pre-presbyopic: 1/4
age = presbyopic: 1/4
spectacle prescription = myope: 3/6
spectacle prescription = hypermetrope: 1/6
tear production rate = reduced: 0/6
tear production rate = normal: 4/6

The test tear production rate = normal is the apparent winner. Hence, the intermediate rule becomes
If astigmatism = yes and tear production rate = normal, then recommendation = hard.

[Table 4.9 on page 102 of the textbook goes here.]

More Tests to Get the Perfect Rule

age = young: 2/2
age = pre-presbyopic: 1/2
age = presbyopic: 1/2
spectacle prescription = myope: 3/3
spectacle prescription = hypermetrope: 1/3

Both age = young (2/2) and spectacle prescription = myope (3/3) would make the rule perfect; we include the test spectacle prescription = myope because it has the greater coverage. The rule now is
If astigmatism = yes and tear production rate = normal and spectacle prescription = myope, then recommendation = hard.

Deriving More Rules to Get 100% Coverage


The rule that we just derived covers 3 out of 4 instances that have recommendation = hard. Therefore, we delete these 3 instances and start the process over again.

The Complete Rules List for Recommendation = Hard


Eventually, we will get the following list of rules:
If astigmatism = yes and tear production rate = normal and spectacle prescription = myope, then recommendation = hard.
If age = young and astigmatism = yes and tear production rate = normal, then recommendation = hard.
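Expressed as a simple ordered rule list (the rules themselves are from the slide above; the function name and the dictionary-based instance format are illustrative assumptions):

def recommend_hard(instance):
    """Apply the two derived rules; return True if recommendation = hard is predicted."""
    if (instance["astigmatism"] == "yes"
            and instance["tear production rate"] == "normal"
            and instance["spectacle prescription"] == "myope"):
        return True
    if (instance["age"] == "young"
            and instance["astigmatism"] == "yes"
            and instance["tear production rate"] == "normal"):
        return True
    return False

print(recommend_hard({"age": "young",
                      "spectacle prescription": "hypermetrope",
                      "astigmatism": "yes",
                      "tear production rate": "normal"}))  # True (second rule fires)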

Prism Overfitting Avoidance


Standard PRISM has no overfitting-avoidance strategy.

Prism Limitations
The PRISM algorithm is silent on:
the order in which classes are explored (usually, the majority class first);
the order in which attributes are explored (one could, for example, pre-sort attributes by their correlation with the class).
Standard PRISM also demands that conditions keep being added until the rule is perfect. This is a bad idea: why not prune with information gain, or use an early stopping criterion?
Standard PRISM has no support-based pruning: why not stop learning when the support of the selected rule falls too low?
Currently, these options are unexplored. One possible support-based stopping check is sketched below.
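The following is an exploratory, unverified sketch of such a check, not part of standard PRISM or WEKA; the function name, thresholds, and the (p/t, p) key format are assumptions carried over from the prism() sketch earlier.

def keep_refining(best_key, min_support=2, min_accuracy=1.0):
    """Decide whether to keep adding conditions to the current rule.

    best_key is the (p/t, p) pair of the refinement just selected;
    min_support is a user-chosen minimum number of covered positives.
    """
    accuracy, p = best_key
    if p < min_support:
        return False  # refinement covers too few positives: stop growing the rule
    return accuracy < min_accuracy  # otherwise refine until the rule is perfect

In the prism() sketch above, this check would replace the purity test in the inner loop (with a small restructuring so that the best refinement is computed before deciding whether to continue).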
