VOL. 15, NO. 1, JANUARY/FEBRUARY 2003
INTRODUCTION
BACKGROUND
ANY-CONFIDENCE, ALL-CONFIDENCE, AND BOND AS INTEREST MEASURES
In this section, we present formal definitions of any-confidence, all-confidence, and bond, and prove a number
of properties about them. Regardless of the measure of
interestingness, it is important to be able to efficiently
determine the itemsets that have a value (for that measure)
greater than the minimum threshold. To accomplish this,
we would like to be able to prune the space of possible
TABLE 1
Set of Five Transactions (Items Per Transaction
Indicated by a 1)
TABLE 2
Support, Bond, All-Confidence, and Any-Confidence Values
Using Data from Table 1
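$$\frac{|\{d \mid d \in D \wedge L \subseteq d\}|}{\mathrm{MAX}\{i \mid \forall l\,(l \in P(L) \wedge l \neq \emptyset \wedge l \neq L \wedge i = |\{d \mid d \in D \wedge l \subseteq d\}|)\}} \geq \mathit{minall}.$$
Then, $\forall L' \subset L$,
$$\frac{|\{d \mid d \in D \wedge L' \subseteq d\}|}{\mathrm{MAX}\{i \mid \forall l\,(l \in P(L') \wedge l \neq \emptyset \wedge l \neq L' \wedge i = |\{d \mid d \in D \wedge l \subseteq d\}|)\}} \geq \mathit{minall}. \qquad \square$$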
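As an illustration of the closure argument above, the following Python sketch computes all-confidence on a small hypothetical database and checks that every subset of an itemset scores at least as high as the itemset itself. The transactions and the helper names (`support_count`, `proper_nonempty_subsets`, `all_confidence`, `closure_holds`) are assumptions made for this example; they are not the data of Table 1 and not code from the paper.

```python
from itertools import combinations

# Hypothetical toy database (an assumption, not the paper's Table 1).
# Each transaction is a set of item names.
D = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c", "d"},
    {"a", "b", "c", "d"},
]

def support_count(itemset, transactions):
    """Number of transactions d with itemset ⊆ d."""
    items = frozenset(itemset)
    return sum(1 for d in transactions if items <= d)

def proper_nonempty_subsets(itemset):
    """Yield every l in P(L) with l != ∅ and l != L."""
    items = sorted(itemset)
    for size in range(1, len(items)):
        for combo in combinations(items, size):
            yield frozenset(combo)

def all_confidence(itemset, transactions):
    """support_count(L) divided by the largest support count of any
    proper nonempty subset of L. Requires at least two items in L."""
    denominator = max(support_count(l, transactions)
                      for l in proper_nonempty_subsets(itemset))
    return support_count(itemset, transactions) / denominator

def closure_holds(itemset, transactions):
    """True if all_confidence(L') >= all_confidence(L) for every
    L' ⊂ L that has at least two items."""
    target = all_confidence(itemset, transactions)
    return all(all_confidence(sub, transactions) >= target
               for sub in proper_nonempty_subsets(itemset)
               if len(sub) >= 2)
```

On this toy data, `all_confidence({"a", "b", "c"}, D)` evaluates to 0.5, while each two-item subset evaluates to 0.75, so pruning an itemset whose subset already falls below the threshold is safe here.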
Lemma 5. The support for a set of items, $\mathit{support}(L)$, where
$\mathit{bond}(L)$ is greater than or equal to any minimum bond
threshold, can be as low as $1/|D|$.
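A small numeric case may make Lemma 5 concrete. The sketch below builds a hypothetical ten-transaction database in which two items occur together exactly once and nowhere else, so the bond is 1 while the support is $1/|D|$. The data and the function names `support` and `bond` are assumptions for illustration only.

```python
# Hypothetical database of ten transactions (an assumption for illustration).
# Items "x" and "y" occur together once and appear nowhere else.
D = [{"x", "y"}] + [{"p", "q"} for _ in range(9)]

def support(itemset, transactions):
    """Fraction of transactions d with itemset ⊆ d."""
    items = frozenset(itemset)
    return sum(1 for d in transactions if items <= d) / len(transactions)

def bond(itemset, transactions):
    """Transactions containing all of L, divided by transactions
    containing at least one item of L."""
    items = frozenset(itemset)
    containing_all = sum(1 for d in transactions if items <= d)
    containing_any = sum(1 for d in transactions if items & d)
    return containing_all / containing_any

L = {"x", "y"}
print(bond(L, D))     # 1.0
print(support(L, D))  # 0.1
```

Because the bond here is 1, it meets any minimum bond threshold, yet the itemset is supported by a single transaction out of ten.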
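$|\{d \mid d \in D \wedge \exists l\,(l \in P(L) \wedge l \neq \emptyset \wedge l \subseteq d)\}|$
$|\{d \mid d \in D \wedge \exists l\,(l \in P(L) \wedge l \neq \emptyset \wedge l \subseteq d)\}|.$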
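The confidence $(L' \rightarrow L - L')$ is
$$\frac{|\{d \mid d \in D \wedge L \subseteq d\}|}{|\{d \mid d \in D \wedge L' \subseteq d\}|}.$$
The $\mathit{bond}(L)$ is
$$\frac{|\{d \mid d \in D \wedge L \subseteq d\}|}{|\{d \mid d \in D \wedge \exists l\,(l \in P(L) \wedge l \neq \emptyset \wedge l \subseteq d)\}|}.$$
Since $L' \subset L$, we know the following:
$$|\{d \mid d \in D \wedge \exists l\,(l \in P(L') \wedge l \neq \emptyset \wedge l \subseteq d)\}| \leq |\{d \mid d \in D \wedge \exists l\,(l \in P(L) \wedge l \neq \emptyset \wedge l \subseteq d)\}|.$$
This is Property 3 in the Appendix. We also know that, since $L' \in P(L')$,
$$|\{d \mid d \in D \wedge L' \subseteq d\}| \leq |\{d \mid d \in D \wedge \exists l\,(l \in P(L') \wedge l \neq \emptyset \wedge l \subseteq d)\}|.$$
So,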
$$\frac{|\{d \mid d \in D \wedge L \subseteq d\}|}{|\{d \mid d \in D \wedge L' \subseteq d\}|} \geq \mathit{bond}(L) \geq \mathit{minbond}.$$
Hence, the confidence $(L' \rightarrow L - L') \geq \mathit{minbond}$. $\square$
TABLE 3
Associations with Bond 0.6 and Their Rules
TABLE 4
Sample Itemset and Support for a File of 30,000 Transactions
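The bound just derived can be checked numerically. The following sketch, written under the assumption of a small hypothetical database, computes $\mathit{bond}(L)$ and the confidence of every rule $L' \rightarrow L - L'$ that can be formed from $L$. The names `count_all`, `count_any`, `bond`, and `rule_confidences` are introduced only for this example.

```python
from itertools import combinations

# Hypothetical toy database (an assumption, not data from the paper).
D = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c", "d"},
    {"a", "b", "c", "d"},
]

def count_all(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    items = frozenset(itemset)
    return sum(1 for d in transactions if items <= d)

def count_any(itemset, transactions):
    """Number of transactions that contain at least one item of the itemset."""
    items = frozenset(itemset)
    return sum(1 for d in transactions if items & d)

def bond(itemset, transactions):
    return count_all(itemset, transactions) / count_any(itemset, transactions)

def rule_confidences(itemset, transactions):
    """Map each antecedent L' (proper nonempty subset of L) to the
    confidence of the rule L' -> L - L'. Assumes every antecedent
    occurs in at least one transaction."""
    items = sorted(itemset)
    whole = count_all(itemset, transactions)
    result = {}
    for size in range(1, len(items)):
        for combo in combinations(items, size):
            antecedent = frozenset(combo)
            result[antecedent] = whole / count_all(antecedent, transactions)
    return result

L = {"a", "b", "c"}
confidences = rule_confidences(L, D)
print(bond(L, D))                 # 0.4
print(min(confidences.values()))  # 0.5
```

Here the weakest rule has confidence 0.5, which is at least the bond of 0.4, consistent with the inequality above.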
TABLE 6
Two Files Containing 10,000 Transactions
(Only Three Items Are Shown)
PERFORMANCE RESULTS
TABLE 7
Minimum Support (in Percent) and Count for Large Itemsets with Minimum Bond
associations produced for all-confidence, using 0.7, considering only itemsets of length four, we have:
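CONCLUSION
APPENDIX
Here, we provide the basic properties that are used in the
proofs of the lemmas and theorem.
Property 1. If $L' \subset L$, then
$$|\{d \mid d \in D \wedge L' \subseteq d\}| \geq |\{d \mid d \in D \wedge L \subseteq d\}|.$$
Proof. Let $L = \{a_1, a_2, \ldots, a_k, a_{k+1}, \ldots, a_n\}$.
A transaction that contains the set of items $L$ must
obviously contain the items in $L'$. So, $|\{d \mid d \in D \wedge L \subseteq d\}|$
cannot be greater than $|\{d \mid d \in D \wedge L' \subseteq d\}|$. If all
transactions that contain the set of items $L'$ also contain
the set of items $\{a_k, a_{k+1}, \ldots, a_n\}$, then
$$|\{d \mid d \in D \wedge L' \subseteq d\}| = |\{d \mid d \in D \wedge L \subseteq d\}|.$$
If at least one transaction contains the set of items $L'$ but
not the set of items $\{a_k, a_{k+1}, \ldots, a_n\}$, then
$|\{d \mid d \in D \wedge L' \subseteq d\}| > |\{d \mid d \in D \wedge L \subseteq d\}|$. Hence,
$|\{d \mid d \in D \wedge L' \subseteq d\}| \geq |\{d \mid d \in D \wedge L \subseteq d\}|$. $\square$
Property 2. If $L' \subset L$, then
$$\mathrm{MAX}\{i \mid \forall l\,(l \in P(L') \wedge l \neq \emptyset \wedge l \neq L' \wedge i = |\{d \mid d \in D \wedge l \subseteq d\}|)\} \leq \mathrm{MAX}\{i \mid \forall l\,(l \in P(L) \wedge l \neq \emptyset \wedge l \neq L \wedge i = |\{d \mid d \in D \wedge l \subseteq d\}|)\}.$$
Proof. Since $P(L)$ contains all the members of $P(L')$, we
have two cases:
1.
2.
Hence,
$$\mathrm{MAX}\{i \mid \forall l\,(l \in P(L') \wedge l \neq \emptyset \wedge l \neq L' \wedge i = |\{d \mid d \in D \wedge l \subseteq d\}|)\} \leq \mathrm{MAX}\{i \mid \forall l\,(l \in P(L) \wedge l \neq \emptyset \wedge l \neq L \wedge i = |\{d \mid d \in D \wedge l \subseteq d\}|)\}. \qquad \square$$
Property 3. If $L' \subset L$, then
$$|\{d \mid d \in D \wedge \exists l\,(l \in P(L') \wedge l \neq \emptyset \wedge l \subseteq d)\}| \leq |\{d \mid d \in D \wedge \exists l\,(l \in P(L) \wedge l \neq \emptyset \wedge l \subseteq d)\}|.$$
Proof. Since $P(L)$ contains all members of $P(L')$, the
number of transactions that contain members of $P(L')$
cannot be greater than the number of transactions that
contain members of $P(L)$. The left-hand side of the
expression can be equal when transactions that contain
any of the items in $L'$ are the same transactions that
contain any of the items in $L$. The left-hand side can be
less when there are transactions that contain any of the
additional items in $L - L'$ and those transactions do not
contain any of the items in $L'$. Hence,
$$|\{d \mid d \in D \wedge \exists l\,(l \in P(L') \wedge l \neq \emptyset \wedge l \subseteq d)\}| \leq |\{d \mid d \in D \wedge \exists l\,(l \in P(L) \wedge l \neq \emptyset \wedge l \subseteq d)\}|. \qquad \square$$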
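The three properties of the Appendix can also be spot-checked by brute force. The sketch below, offered only as a sanity check and not as a proof, generates a hypothetical random database with a fixed seed and tests Properties 1, 2, and 3 for every pair $L' \subset L$ over a five-item universe. All names (`nonempty_subsets`, `count_containing`, `count_touching`, `max_subset_count`, `properties_hold`) and the generated data are assumptions for this example.

```python
import random
from itertools import combinations

def nonempty_subsets(items, proper=False):
    """Yield nonempty subsets of items; with proper=True, exclude items itself."""
    items = sorted(items)
    top = len(items) - 1 if proper else len(items)
    for size in range(1, top + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def count_containing(itemset, transactions):
    """|{d | d in D and L ⊆ d}|"""
    return sum(1 for d in transactions if itemset <= d)

def count_touching(itemset, transactions):
    """|{d | d in D and some nonempty l in P(L) has l ⊆ d}|,
    i.e. d shares at least one item with L."""
    return sum(1 for d in transactions if itemset & d)

def max_subset_count(itemset, transactions):
    """Largest count over proper nonempty subsets; needs at least two items."""
    return max(count_containing(l, transactions)
               for l in nonempty_subsets(itemset, proper=True))

def properties_hold(transactions, universe):
    for big in nonempty_subsets(universe):
        for small in nonempty_subsets(big, proper=True):
            # Property 1: the subset is contained in at least as many transactions.
            if count_containing(small, transactions) < count_containing(big, transactions):
                return False
            # Property 2: the MAX term cannot grow when moving to a subset.
            if len(small) >= 2 and (max_subset_count(small, transactions)
                                    > max_subset_count(big, transactions)):
                return False
            # Property 3: the subset touches no more transactions than the superset.
            if count_touching(small, transactions) > count_touching(big, transactions):
                return False
    return True

# Hypothetical random data; the seed and sizes are arbitrary choices.
rng = random.Random(0)
universe = ["a", "b", "c", "d", "e"]
transactions = [frozenset(i for i in universe if rng.random() < 0.5)
                for _ in range(50)]
print(properties_hold(transactions, universe))
```

Property 2 is only tested when $L'$ has at least two items, since the MAX term ranges over proper nonempty subsets and is undefined for a single item.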
ACKNOWLEDGMENTS
This work was supported in part by Grant LM 06726 from
the National Library of Medicine. The author would like to
thank Carlos Ordonez for his comments on an earlier draft
of this paper and would also like to thank the anonymous
referees for their invaluable comments.