Market-Basket transactions
Example of Association Rules
{Diaper} → {Beer}
{Milk, Bread} → {Eggs, Coke}
{Beer, Bread} → {Milk}
Itemset
A collection of one or more items
Example: {Milk, Bread, Diaper}
k-itemset
An itemset that contains k items
Support count (σ)
Frequency of occurrence of an itemset
E.g. σ({Milk, Bread, Diaper}) = 2
Support
Fraction of transactions that contain an itemset
E.g. s({Milk, Bread, Diaper}) = 2/5
Frequent Itemset
An itemset whose support is greater than or equal to a minsup threshold
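The definitions above can be checked with a short sketch. The five transactions below are an assumption: the transaction table itself is not reproduced in this text, so the data was chosen to agree with the quoted values (support count 2, support 2/5). The function names are illustrative only.

```python
# Hypothetical market-basket data (assumed; not given in the text).
TRANSACTIONS = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset, transactions):
    """Support count: number of transactions that contain every item."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def support(itemset, transactions):
    """Support: fraction of transactions that contain the itemset."""
    return support_count(itemset, transactions) / len(transactions)


def is_frequent(itemset, transactions, minsup):
    """An itemset is frequent when its support is at least minsup."""
    return support(itemset, transactions) >= minsup


print(support_count({"Milk", "Bread", "Diaper"}, TRANSACTIONS))  # 2
print(support({"Milk", "Bread", "Diaper"}, TRANSACTIONS))        # 0.4
```

With minsup = 0.4 the itemset {Milk, Bread, Diaper} counts as frequent on this assumed data; with minsup = 0.5 it does not.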
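Association Rule
An implication expression of the form X → Y, where X and Y are itemsets
Example: {Milk, Diaper} → {Beer}
Support (s): fraction of transactions that contain both X and Y
Confidence (c): how often items in Y appear in transactions that contain X
s(X → Y) = σ(X ∪ Y) / |T|
c(X → Y) = σ(X ∪ Y) / σ(X)
where σ is the support count and |T| is the number of transactions
For {Milk, Diaper} → {Beer}: s = 2/5 = 0.4 and c = 2/3 = 0.67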
Brute-force approach:
List all possible association rules
Compute the support and confidence for each rule
Prune rules that fail the minsup and minconf thresholds
Computationally prohibitive!
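To give a sense of scale, the sketch below counts how many rules X → Y (X and Y non-empty and disjoint) exist over d items. The closed form 3^d − 2^(d+1) + 1 is a standard counting result and is not stated in the text; the enumeration is included only to cross-check it for small d.

```python
from itertools import combinations


def count_rules_by_enumeration(d):
    """Count rules X -> Y with X, Y non-empty and disjoint over d items."""
    total = 0
    for k in range(2, d + 1):                   # k = size of the itemset X ∪ Y
        for _ in combinations(range(d), k):     # every k-itemset
            total += 2 ** k - 2                 # non-empty proper subsets usable as X
    return total


def count_rules_closed_form(d):
    """Closed form for the same count (assumed standard result)."""
    return 3 ** d - 2 ** (d + 1) + 1


for d in (3, 6, 10):
    print(d, count_rules_by_enumeration(d), count_rules_closed_form(d))
```

Even for six items there are 602 possible rules, and the count grows roughly as 3^d, which is why listing every rule is impractical.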
Example of Rules:
{Milk, Diaper} → {Beer} (s=0.4, c=0.67)
{Milk, Beer} → {Diaper} (s=0.4, c=1.0)
{Diaper, Beer} → {Milk} (s=0.4, c=0.67)
{Beer} → {Milk, Diaper} (s=0.4, c=0.67)
{Diaper} → {Milk, Beer} (s=0.4, c=0.5)
{Milk} → {Diaper, Beer} (s=0.4, c=0.5)
Observations:
All the above rules are binary partitions of the same itemset: {Milk, Diaper, Beer}
Rules originating from the same itemset have identical support but can have different confidence
Thus, we may decouple the support and confidence requirements
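The six rules can be reproduced by enumerating every binary partition of {Milk, Diaper, Beer}. This is a minimal sketch: the five transactions are an assumption (the table is not reproduced in this text) chosen to agree with the quoted support and confidence values, and `rules_from_itemset` is a hypothetical helper name.

```python
from itertools import combinations

# Hypothetical market-basket data (assumed; not given in the text).
TRANSACTIONS = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)


def rules_from_itemset(itemset, transactions):
    """Yield (lhs, rhs, support, confidence) for each binary partition."""
    itemset = frozenset(itemset)
    n = len(transactions)
    count_all = support_count(itemset, transactions)
    for size in range(1, len(itemset)):
        for lhs in combinations(sorted(itemset), size):
            lhs = frozenset(lhs)
            lhs_count = support_count(lhs, transactions)
            if lhs_count == 0:
                continue  # confidence is undefined when the LHS never occurs
            yield lhs, itemset - lhs, count_all / n, count_all / lhs_count


for lhs, rhs, s, c in rules_from_itemset({"Milk", "Diaper", "Beer"}, TRANSACTIONS):
    print(f"{sorted(lhs)} -> {sorted(rhs)}  (s={s:.1f}, c={c:.2f})")
```

The support is computed once per itemset and shared by all six rules; only the denominator of the confidence changes from rule to rule.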
Two-step approach:
1. Frequent Itemset Generation
Generate all itemsets whose support ≥ minsup
2. Rule Generation
Generate high-confidence rules from each frequent itemset, where each rule is a binary partitioning of a frequent itemset
Brute-force approach:
Each itemset in the lattice is a candidate frequent
itemset
Count the support of each candidate by scanning the
database
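A minimal sketch of the brute-force approach follows. Every non-empty subset of the item universe is treated as a candidate and counted with one pass over the database. The four transactions are illustrative; in particular the last one is an assumption, since its items are not given in the text.

```python
from itertools import combinations

# Illustrative data; the fourth transaction is assumed, not taken from the text.
TRANSACTIONS = [
    {"Bread", "Cheese"},
    {"Bread", "Cheese", "Juice"},
    {"Bread", "Milk"},
    {"Cheese", "Juice", "Milk"},
]


def brute_force_frequent_itemsets(transactions, minsup):
    """Enumerate the whole lattice and keep itemsets with support >= minsup."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    frequent = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            cand = frozenset(candidate)
            # One scan of the database per candidate itemset.
            count = sum(1 for t in transactions if cand <= t)
            if count / n >= minsup:
                frequent[cand] = count
    return frequent


for itemset, count in brute_force_frequent_itemsets(TRANSACTIONS, 0.5).items():
    print(sorted(itemset), count)
```

With d distinct items the loop visits 2^d − 1 candidates (15 here), each requiring a full scan, which is what makes the approach expensive on realistic data.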
Transaction ID | Items
100 | Bread, Cheese
200 | Bread, Cheese, Juice
300 | Bread, Milk
400 |
Itemsets
Frequency
Bread
Cheese
Juice
Milk
(Bread, Cheese)
(Bread, Juice)
(Bread, Milk)
(Cheese, Juice)
(Cheese, Milk)
(Juice, Milk)
Itemsets
Frequency
Bread
Cheese
Juice
Milk
(Bread, Cheese)
(Cheese, Juice)
Find out all possible combinations
Transaction ID | Items | Combinations
100 | Bread, Cheese | {Bread, Cheese}
200 | Bread, Cheese, Juice |
300 | Bread, Milk | {Bread, Milk}
400 | |
Method:
Let k=1
Generate frequent itemsets of length 1
Repeat until no new frequent itemsets are identified
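The method can be sketched as below. The body of the repeat loop is not spelled out in the text, so this sketch assumes the usual Apriori steps: generate length-(k+1) candidates from the frequent k-itemsets, prune candidates that have an infrequent k-subset, count the survivors, and keep the frequent ones. The join used here is a simple union of pairs rather than a prefix-based join, and the fourth transaction in the sample data is an assumption.

```python
from itertools import combinations


def apriori(transactions, minsup):
    """Return {itemset: support count} for all frequent itemsets."""
    n = len(transactions)

    def count_and_filter(candidates):
        # Scan the database once per candidate and keep the frequent ones.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: k for c, k in counts.items() if k / n >= minsup}

    # k = 1: frequent itemsets of length 1.
    items = set().union(*transactions)
    current = count_and_filter({frozenset([i]) for i in items})
    frequent = dict(current)
    k = 1
    while current:  # repeat until no new frequent itemsets are identified
        prev = set(current)
        # Generate length-(k+1) candidates from frequent k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k + 1}
        # Prune candidates that contain an infrequent k-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k))
        }
        current = count_and_filter(candidates)
        frequent.update(current)
        k += 1
    return frequent


# Illustrative data; the fourth transaction is assumed, not taken from the text.
sample = [
    {"Bread", "Cheese"},
    {"Bread", "Cheese", "Juice"},
    {"Bread", "Milk"},
    {"Cheese", "Juice", "Milk"},
]
for itemset, count in sorted(apriori(sample, 0.5).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

On this sample the only 3-item candidate, {Bread, Cheese, Juice}, is pruned without a database scan because {Bread, Juice} is not frequent.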
Candidate counting:
Transaction ID
Items
100
200
300
400
500
50% support
Item
Frequency
Bread
Cheese
Juice
Milk
Itemsets
Frequency
(Bread, Cheese)
(Bread, Juice)
(Bread, Milk)
(Cheese, Juice)
(Cheese, Milk)
(Juice, Milk)
Item
Frequency
Bread, Juice
Cheese, Juice
Bread
Juice
Cheese
Juice
3/4
Item Number | Item Name
1 | Biscuits
2 | Bread
3 | Cereal
4 | Cheese
5 | Chocolate
6 | Coffee
7 | Donuts
8 | Eggs
9 | Juice
10 | Milk
11 | Newspaper
12 | Pastry
13 | Rolls
14 | Sugar
15 | Tea
16 | Yogurt
TID | Items
1 to 25 | item lists not recoverable, apart from the fragments "Milk, Tea", "Chocolate, Coffee" and "Donuts"
Item Name
Frequency
Biscuits
Bread
13
Cereal
10
Cheese
11
Chocolate
Coffee
Donuts
10
Eggs
Juice
11
10
Milk
11
12
Newspaper
Pastry
2
1
13
Rolls
14
Sugar
15
Tea
16
Yogurt
Item | Frequency
Bread | 13
Cereal | 10
Cheese | 11
Chocolate |
Coffee |
Donuts | 10
Juice | 11
{Bread, Cereal}
{Bread, Cheese}
{Bread, Chocolate}
{Bread, Coffee}
{Bread, Donuts}
{Bread, Juice}
{Cereal, Cheese}
{Cereal, Chocolate}
{Cereal, Coffee}
{Cereal, Donuts}
{Cereal, Juice}
{Cheese, Chocolate}
{Cheese, Coffee}
{Cheese, Donuts}
{Cheese, Juice}
{Chocolate, Coffee}
{Chocolate, Donuts}
{Chocolate, Juice}
{Coffee, Donuts}
{Coffee, Juice}
{Donuts, Juice}
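The 21 pairs above are exactly the 2-item combinations of the seven frequent items. A short sketch that regenerates them (the variable names are illustrative):

```python
from itertools import combinations

# The seven frequent single items, in the order used in the listing.
frequent_items = ["Bread", "Cereal", "Cheese", "Chocolate", "Coffee", "Donuts", "Juice"]

# Every unordered pair of frequent items is a candidate 2-itemset.
candidate_pairs = list(combinations(frequent_items, 2))

print(len(candidate_pairs))  # 21
for pair in candidate_pairs:
    print("{" + ", ".join(pair) + "}")
```

Restricting the pairs to frequent items gives 7·6/2 = 21 candidates, instead of the 16·15/2 = 120 pairs that all 16 items would produce.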
{Bread, Cereal}
{Bread, Cheese}
{Bread, Coffee}
{Cheese, Coffee}
{Chocolate, Donuts}
{Chocolate, Juice}
{Donuts, Juice}
Candidate 3-itemsets
{Bread, Cereal, Cheese}
{Bread, Cereal, Coffee}
{Bread, Cheese, Coffee}
{Chocolate, Donuts, Juice}  7
Frequent 3-itemsets, or L3
{Bread, Cheese, Coffee}
{Chocolate, Donuts, Juice}
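The step from the frequent 2-itemsets to the 3-itemsets can be sketched as a join followed by a subset-based prune. This is a sketch assuming the usual prefix-based Apriori join; `join_step` and `prune_step` are hypothetical names. It does not count supports, because the transaction data is not available here.

```python
from itertools import combinations

# Frequent 2-itemsets as listed in the text, each stored as a sorted tuple.
L2 = [
    ("Bread", "Cereal"), ("Bread", "Cheese"), ("Bread", "Coffee"),
    ("Cheese", "Coffee"), ("Chocolate", "Donuts"), ("Chocolate", "Juice"),
    ("Donuts", "Juice"),
]


def join_step(frequent_k):
    """Join sorted k-itemsets that agree on their first k-1 items."""
    joined = []
    for a, b in combinations(sorted(frequent_k), 2):
        if a[:-1] == b[:-1]:
            joined.append(a + (b[-1],))
    return joined


def prune_step(candidates, frequent_k):
    """Drop candidates that have a k-subset which is not frequent."""
    known = set(frequent_k)
    k = len(frequent_k[0])
    return [c for c in candidates if all(s in known for s in combinations(c, k))]


joined = join_step(L2)
pruned = prune_step(joined, L2)
print(joined)
print(pruned)
```

The join yields the four 3-itemsets listed as candidates. The prune removes the two containing Cereal, since {Cereal, Cheese} and {Cereal, Coffee} are not frequent 2-itemsets, and the two survivors coincide with the itemsets given as L3.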
Support of BCD
Frequency of
LHS
Confidence
MP
0.78
NP
10
0.70
NM 7
11
0.64
MP
0.78
NP
1.0
NM
1.0
Rule | Support of BCD | Frequency of LHS | Confidence
B → CD | | 13 | 0.61
C → BD | | 11 | 0.72
D → BC | | | 0.89
CD → B | | | 0.89
BD → C | | | 1.0
BC → D | | | 1.0
Cheese → Bread
Cheese → Coffee
Coffee → Bread
Coffee → Cheese
Cheese, Coffee → Bread
Bread, Coffee → Cheese
Bread, Cheese → Coffee
Chocolate → Donuts
Chocolate → Juice
Donuts → Chocolate
Donuts → Juice
Donuts, Juice → Chocolate
Chocolate, Juice → Donuts
Chocolate, Donuts → Juice
Bread → Cereal
Cereal → Bread