[Figure: Histogram of Sun data set. Two panels, both with x-axis Similarity (%): one over the full range 0 to 100, one over the range 60 to 100.]
[Figure: MH on Sun data set. Four panels:
(a) Performance of MH, f = 80: Fraction of pairs found vs. Similarity (%), curves for k = 10, 20, 50, 100, 200.
(b) Running time of MH, f = 80: Total time (sec) vs. K value.
(c) Performance of MH, k = 500: Fraction of pairs found vs. Similarity (%), curves for f = 70, 75, 80, 85, 90.
(d) Running time of MH, k = 500: Total time (sec) vs. F value.]
[Figure: K-MH on Sun data set. Four panels:
(a) Performance of K-MH, f = 80: Fraction of pairs found vs. Similarity (%), curves for k = 10, 20, 50, 100, 200.
(b) Running time of K-MH, f = 80: running time vs. K value.
(c) Performance of K-MH, k = 500: Fraction of pairs found vs. Similarity (%), curves for f = 70, 75, 80, 85, 90.
(d) Running time of K-MH, k = 500: running time vs. F value.]
[Figure: H-LSH on Sun data set. Four panels:
(a) Performance of H-LSH, l = 8: Fraction of pairs found vs. Similarity (%), curves for r = 32, 40, 48, 56.
(b) Running time of H-LSH, l = 8: running time vs. value of parameter r.
(c) Performance of H-LSH, r = 40: Fraction of pairs found vs. Similarity (%), curves for l = 4, 8, 16, 32.
(d) Running time of H-LSH, r = 40: running time vs. value of parameter l.]
[Figure: M-LSH on Sun data set. Four panels:
(a) Performance of M-LSH, l = 10: Fraction of pairs found vs. Similarity (%), curves for r = 10, 20, 50, 100.
(b) Running time of M-LSH, l = 10: time (sec) vs. value of parameter r.
(c) Performance of M-LSH, r = 10: Fraction of pairs found vs. Similarity (%), curves for l = 10, 20, 50, 100.
(d) Running time of M-LSH, r = 10: time (sec) vs. value of parameter l.]
[Figure: Comparison of MH, K-MH, H-LSH, and M-LSH. Four panels, each with x-axis False Negative Threshold (%):
(a) Time vs. False Negatives, Similarity = 85%.
(b) False Positives vs. False Negatives, Similarity = 85%.
(c) Time vs. False Negatives, Similarity = 95%.
(d) False Positives vs. False Negatives, Similarity = 95%.]