www.seipub.org/ijc
Abstract
Peer-to-peer systems and applications have attracted much
attention because they are more scalable than traditional
client-server ones. To provide efficient communication among
nodes in the network, node clustering can be used to
avoid message flooding. This paper proposes a distributed node
clustering algorithm that adopts a new way of choosing
originators and evaluates it with the ns-2 simulator.
Experimental results show that the proposed algorithm achieves
better clustering accuracy than existing algorithms on
different types of network topologies. More importantly, it
requires fewer messages for clustering than the
compared algorithms.
Keywords
Clustering Algorithms; Scaled Coverage Measure; Peer-to-Peer
Networks
Introduction
In recent years, the peer-to-peer computing model has
drawn considerable attention from both researchers and
the general public because of its high scalability compared
to the traditional client-server model. Notable
applications include file sharing (Zhang et al. 2011),
video streaming (Magharei and Rejaie 2009; Ramzan et
al. 2011), and IP telephony (Bonfiglio et al. 2009). One
of the major issues in peer-to-peer computing is how
to provide high scalability at low communication
cost. One possible solution is to divide the
nodes in the network into clusters. With node
clustering, messages can be processed or merged within a
cluster and then sent to nodes in other clusters.
Since message flooding is avoided, the communication
cost can be reduced significantly.
Node clustering can be done in both centralized and
distributed ways. Since centralized node clustering is
better suited for small networks, this paper focuses on
distributed clustering algorithms.
In the literature, a number of clustering algorithms
$$\mathrm{THP}(V_l) = \sum_{V_i \in \mathrm{Nbr}(V_l)} \frac{1}{\mathrm{Degree}(V_l) \cdot \mathrm{Degree}(V_i)}$$
where Nbr(Vl) denotes the set of neighbors of node Vl
and Degree(Vl) denotes the degree of node Vl. If the
value of THP is larger than a pre-defined threshold,
node Vl declares itself an originator and sends
messages to inform other nodes. Each message
contains the following fields: originator ID,
message ID, weight, and time-to-live (TTL) value. The
value of weight is calculated as follows:
$$\mathit{weight} = \frac{1}{\mathrm{Degree}(V_l)}$$
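The originator selection and the initial weight described above can be sketched in Python. This is a minimal illustration rather than the authors' implementation: it assumes THP(Vl) is the sum over neighbors Vi of 1/(Degree(Vl) * Degree(Vi)), which agrees with the worked values given in the paper such as THP(V5)=1/(1*4). The six-node graph, the function names, and the threshold used in the usage note are hypothetical.

```python
from typing import Dict, List, Set

Graph = Dict[int, Set[int]]

# Hypothetical six-node topology (an assumption, not given in the text),
# chosen so that the THP values agree with the worked examples in the paper.
GRAPH: Graph = {
    0: {1, 2, 3, 4},
    1: {0, 2, 4},
    2: {0, 1, 3},
    3: {0, 2, 4},
    4: {0, 1, 3, 5},
    5: {4},
}


def thp(graph: Graph, node: int) -> float:
    """Sum of 1 / (Degree(node) * Degree(neighbor)) over all neighbors."""
    degree = len(graph[node])
    return sum(1.0 / (degree * len(graph[nbr])) for nbr in graph[node])


def select_originators(graph: Graph, threshold: float) -> List[int]:
    """Nodes whose THP value is larger than the pre-defined threshold."""
    return sorted(n for n in graph if thp(graph, n) > threshold)


def initial_weight(graph: Graph, originator: int) -> float:
    """Weight carried by the messages an originator sends: 1 / Degree."""
    return 1.0 / len(graph[originator])
```

On this graph, `thp(GRAPH, 0)` evaluates to 0.3125 and `thp(GRAPH, 5)` to 0.25 (up to floating-point rounding). With an illustrative threshold of 0.3, `select_originators(GRAPH, 0.3)` returns nodes 0, 2, and 4.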
Once a node Vi receives a message, it accumulates the
value of weight sent by each originator, reduces the
TTL field by 1, and updates the weight as follows:
$$\mathit{weight} = \frac{\mathit{weight}}{\mathrm{Degree}(V_i)}$$
If the TTL value is still greater than zero and the updated
weight is not smaller than a pre-defined threshold, node Vi
forwards the message with the updated fields to its neighbors.
Otherwise, it discards the message.
After waiting for a pre-defined period of time, a node
joins the cluster led by the originator with the largest
accumulated weight. The way the CDC scheme selects
originators does not guarantee that good
originators are found. We will discuss this in the
next section.
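The message handling and cluster-joining steps of the CDC scheme can be illustrated with a small synchronous simulation. This is a sketch under stated assumptions, not the protocol's reference implementation: a message is forwarded only while its TTL is positive and its updated weight has not dropped below the threshold, a node does not send a message back to the neighbor it came from, originators lead their own clusters, and ties are broken in favor of the smaller originator ID. The graph and all names are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical six-node topology (an assumption, not given in the text).
GRAPH = {
    0: {1, 2, 3, 4},
    1: {0, 2, 4},
    2: {0, 1, 3},
    3: {0, 2, 4},
    4: {0, 1, 3, 5},
    5: {4},
}


def cdc_cluster(graph, originators, ttl, weight_threshold):
    """Return {node: originator it joins, or None if no message reached it}."""
    originators = set(originators)
    # accumulated[node][originator] = total weight received from that originator
    accumulated = {node: defaultdict(float) for node in graph}
    queue = deque()

    # Each originator sends a message with weight 1 / Degree to its neighbors.
    for origin in originators:
        weight = 1.0 / len(graph[origin])
        for nbr in graph[origin]:
            queue.append((origin, origin, nbr, weight, ttl))

    while queue:
        origin, sender, node, weight, hops = queue.popleft()
        if node == origin:
            continue  # an originator ignores its own message
        accumulated[node][origin] += weight   # accumulate per originator
        hops -= 1                             # reduce the TTL field by 1
        weight /= len(graph[node])            # weight = weight / Degree(Vi)
        # Assumed forwarding rule: TTL still positive and weight large enough.
        if hops > 0 and weight >= weight_threshold:
            for nbr in graph[node]:
                if nbr != sender:
                    queue.append((origin, node, nbr, weight, hops))

    clusters = {}
    for node in graph:
        if node in originators:
            clusters[node] = node  # assumption: originators lead their own cluster
        elif accumulated[node]:
            received = accumulated[node]
            # Largest accumulated weight wins; smaller ID breaks ties (assumption).
            clusters[node] = max(received, key=lambda o: (received[o], -o))
        else:
            clusters[node] = None
    return clusters
```

Because the TTL decreases at every hop, the simulation always terminates. A call such as `cdc_cluster(GRAPH, [0, 4], ttl=1, weight_threshold=0.0001)` assigns every node to one of the two originators.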
SCM-based Distributed Clustering Protocol
The SDC protocol (Li, Lao, and Cui 2011) takes a
different approach to clustering the network. Each node Vi
initially forms a cluster that includes only itself.
Then, it sends messages to all its neighbors to request
Example THP values (from the figure):
THP(V0) = 1/(4*3) + 1/(4*3) + 1/(4*3) + 1/(4*4) = 0.3125
THP(V2) = 1/(3*3) + 1/(3*4) + 1/(3*3) = 0.3056
THP(V3) = 1/(3*3) + 1/(3*4) + 1/(3*4) = 0.2778
THP(V5) = 1/(1*4) = 0.25
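$$\mathrm{ETHP}(V_l) = \sum_{V_i \in \mathrm{Nbr}(V_l)} \frac{1}{\mathrm{Degree}(V_l) \cdot \left[ \mathrm{Degree}(V_i) - \left| \mathrm{Nbr}(V_i) \cap \mathrm{Nbr}(V_l) \right| \right]}$$

Example ETHP values (from the figure):
ETHP(V0) = 1/(4*1) + 1/(4*1) + 1/(4*1) + 1/(4*2) = 0.875
ETHP(V1) = 1/(3*3) + 1/(3*2) + 1/(3*2) = 0.444
ETHP(V2) = 1/(3*2) + 1/(3*2) + 1/(3*2) = 0.5
ETHP(V3) = 1/(3*2) + 1/(3*2) + 1/(3*3) = 0.444
ETHP(V5) = 1/(1*4) = 0.25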
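The ETHP values above can be reproduced with a short Python sketch. It assumes that the degree of each neighbor Vi is reduced by the number of neighbors it shares with Vl, that is, ETHP(Vl) sums 1/(Degree(Vl) * (Degree(Vi) - |Nbr(Vi) ∩ Nbr(Vl)|)) over the neighbors of Vl. The six-node graph is hypothetical; it is not given in the text and was chosen only because it yields the same values as the worked examples.

```python
from typing import Dict, Set

Graph = Dict[int, Set[int]]

# Hypothetical six-node topology (an assumption, not given in the text).
GRAPH: Graph = {
    0: {1, 2, 3, 4},
    1: {0, 2, 4},
    2: {0, 1, 3},
    3: {0, 2, 4},
    4: {0, 1, 3, 5},
    5: {4},
}


def ethp(graph: Graph, node: int) -> float:
    """ETHP under the assumed reading: shared neighbors are discounted."""
    degree = len(graph[node])
    total = 0.0
    for nbr in graph[node]:
        shared = len(graph[nbr] & graph[node])  # |Nbr(Vi) ∩ Nbr(Vl)|
        # Without self-loops the difference is at least 1, because `node`
        # itself is a neighbor of `nbr` but is never counted as shared.
        total += 1.0 / (degree * (len(graph[nbr]) - shared))
    return total


def select_originators(graph: Graph, threshold: float):
    """Nodes whose ETHP value is larger than the given threshold."""
    return sorted(n for n in graph if ethp(graph, n) > threshold)
```

On this graph, `ethp(GRAPH, 0)` evaluates to 0.875 and `ethp(GRAPH, 2)` to 0.5 (up to floating-point rounding). Compared with THP, the highly connected node V0 stands out much more clearly from its neighbors.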
Algorithm            Parameter          Value
SDC                  TTL                3
Proposed Algorithm   ETHP Threshold     0.0005
                     Weight Threshold   0.0001
                     TTL                1
Experimental Results
Figs. 3 and 4 show the clustering accuracy of the SDC
protocol and the proposed algorithm (indicated by
ETHP) on random and power-law topologies,
respectively. The proposed algorithm achieves better
clustering accuracy than the SDC protocol on both types
of topology. This is because the proposed ETHP measure
finds better originators, thereby improving the
clustering accuracy.
Approach to Node Clustering in
REFERENCES
in Future Internet Applications. IEEE
INS-R0012. Centrum voor Wiskunde en
the BitTorrent Ecosystem. IEEE
Survey
Conclusions
Peer-to-Peer Content Distribution