16 (IJCNS) International Journal of Computer and Network Security,

Vol. 2, No. 4, April 2010

Analysis of Searching Techniques and Design of Improved Search Algorithm for Unstructured Peer-to-Peer Networks
Dr. Yash Pal Singh (1), Rakesh Rathi (2), Jyoti Gajrani (3), Vinesh Jain (4)

(1) Bundelkhand Institute of Engg. and Tech., Jhansi, India, yash_biet@yahoo.co.in
(2) Govt. Engg. College, Ajmer, Badliya Circle, NH 08, Ajmer, rakeshrathi4@rediffmail.com
(3) Govt. Engg. College, Ajmer, Badliya Circle, NH 08, Ajmer, t_jyoti1@rediffmail.com
(4) Govt. Engg. College, Ajmer, Badliya Circle, NH 08, Ajmer, vineshjain1280@rediffmail.com

Abstract: We study the performance of several search algorithms on unstructured peer-to-peer networks, covering classic search algorithms such as flooding and random walk as well as a new hybrid algorithm proposed in this paper. This hybrid algorithm uses two-level random walks for the adaptive probability search (APS). We compare the performance of the search algorithms on several graphs corresponding to common topologies proposed for peer-to-peer networks. We find that the Local Indices algorithm gives average performance, while Intelligent Search and Routing Indices consume more bandwidth. Further work can be done on reducing the size of the query message, which would in turn reduce the bandwidth. APS is the most efficient technique among those compared. It can be improved further by the proposed search algorithm, which uses a two-level k-walker random walk with APS instead of a k-walker random walk. The two-level walk further reduces collisions of nodes and helps in searching distant nodes in the network, but it may slightly increase the response time.

Keywords: peer-to-peer networks, adaptive probability search.

1. Introduction
A P2P network is a distributed network composed of a large number of distributed, heterogeneous, autonomous, and highly dynamic peers in which participants share a part of their own resources such as processing power, storage capacity, software, and files. A participant in the P2P network can act as a server and a client at the same time. P2P systems constitute highly dynamic networks of peers with complex topology. This topology creates an overlay network, which may be totally unrelated to the physical network that connects the different nodes (computers). P2P systems can be differentiated by the degree to which these overlay networks contain some structure or are created ad hoc. Network structure here means the way in which the content of the network is located with respect to the network topology. Unstructured, loosely structured, and highly structured are the categories of P2P networks based on the control over data location and network topology. This paper is mainly concerned with a comparative study of the available search algorithms for unstructured P2P systems; it also presents the design of a new search algorithm for unstructured P2P systems.

2. Unstructured P2P systems
In unstructured networks, the placement of data (files) is completely unrelated to the overlay topology. Since there is no information about which nodes are likely to have the relevant files, searching essentially amounts to random search, in which various nodes are probed and asked if they have any files matching the query. These systems differ in the way in which they construct the overlay topology, and the way in which they distribute queries from node to node. The advantage of such systems is that they can easily accommodate a highly transient node population. The disadvantage is that it is hard to find the desired files without distributing queries widely. For this reason unstructured P2P systems are considered to be unscalable. However, work is being done towards increasing the scalability of unstructured systems. Napster, Gnutella, Kazaa, and Morpheus [1] are examples of unstructured P2P systems.

3. Searching in unstructured Systems [4]
Initially, flooding, which is essentially breadth-first search (BFS), was used to search for a specific data item. However, it generates a large number of duplicate messages and does not scale well, so a number of alternative schemes have been proposed to address these problems.
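
To illustrate why flooding produces so many duplicate messages, the following minimal Python sketch floods a query over a small overlay with a TTL and counts the messages. The five-node overlay, the function name `flood`, and the counting convention (a message is a duplicate when it arrives at a node that has already seen the query) are assumptions made for illustration and are not taken from the cited schemes.

```python
from collections import deque

def flood(graph, source, ttl):
    """Flood a query from `source` for at most `ttl` hops.

    Returns (nodes_reached, messages_sent, duplicate_messages).
    """
    seen = {source}
    messages = 0
    duplicates = 0
    frontier = deque([(source, None, ttl)])  # (node, sender, remaining hops)
    while frontier:
        node, sender, hops = frontier.popleft()
        if hops == 0:
            continue  # TTL exhausted, the query is not forwarded further
        for neighbour in graph[node]:
            if neighbour == sender:
                continue  # never send the query straight back
            messages += 1
            if neighbour in seen:
                duplicates += 1  # the receiver drops the repeated query
            else:
                seen.add(neighbour)
                frontier.append((neighbour, node, hops - 1))
    return len(seen), messages, duplicates

# Hypothetical 5-node overlay given as adjacency lists.
overlay = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
print(flood(overlay, "A", ttl=3))  # → (5, 8, 4)
```

Even on this tiny overlay, half of the eight messages reach a node that has already processed the query.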

The proposed alternatives include iterative deepening, k-walker random walk, modified random BFS, two-level k-walker random walk, directed BFS, intelligent search, local indices based search, routing indices based search, attenuated bloom filter based search, adaptive probabilistic search, and dominating set based search.

Searching strategies in unstructured P2P systems are either blind search or informed search. In a blind search such as iterative deepening, no node has information about the location of the desired data. In an informed search such as routing indices, each node keeps some metadata about the data location. To restrict the total bandwidth consumption, data queries in unstructured P2P systems may be terminated prematurely before the desired existing data is found; therefore, the query may not return the desired data even if the data actually exists in the system. An unstructured P2P network cannot offer bounded routing efficiency due to its lack of structure.

The searching schemes in unstructured P2P systems can also be classified as deterministic or probabilistic. In a deterministic approach, the query forwarding is deterministic. In a probabilistic approach, the query forwarding is probabilistic, random, or based on ranking. Another way to categorize searching schemes in unstructured P2P systems is as regular-grained or coarse-grained. In a regular-grained approach, all nodes participate in query forwarding. In a coarse-grained scheme, the query forwarding is performed by only a subset of nodes in the entire network.

4. Comparison of Existing Search Algorithms
The searching methods are compared below on the basis of search method, query forwarding, message overhead, and node duplication.

Algorithm                      Search method    Query forwarding      Message overhead   Node duplication
Flooding                       BFS, blind       Broadcast             High               High
Iterative deepening            BFS, blind       Broadcast             High               High
Local indices                  BFS, informed    Broadcast             Medium             Medium
Directed BFS                   BFS, informed    Partial broadcast     Medium             High
Intelligent search             BFS, informed    Subset of neighbors   Medium             Medium
Routing indices                BFS, informed    Subset of neighbors   Medium             Medium
Std. random walk               BFS, blind       One neighbor          Low                Low
k-walker random walk           BFS, blind       Subset of neighbors   High               High
2-level k-walker random walk   BFS, blind       Subset of neighbors   Low                Low
APS                            BFS, informed    Subset of neighbors   Medium             Medium

The searching methods are compared below on the basis of scalability, response time (RT), success rate (SR), and bandwidth.

Algorithm                      Scalability   Response time   Success rate   Bandwidth
Flooding                       No            High            Medium         Low
Iterative deepening            Yes           High            Medium         Medium
Local indices                  Yes           Medium          Medium         Medium
Directed BFS                   Yes           Medium          Medium         High
Intelligent search             Yes           Medium          Medium         High
Routing indices                Yes           Medium          Medium         High
Std. random walk               Yes           High            Medium         Low
k-walker random walk           Yes           Medium          Medium         Low
2-level k-walker random walk   Yes           Medium          Medium         Low
APS                            Yes           Low             High           Medium

Among these algorithms, Adaptive Probability Search (APS) is the most efficient. APS is based on k-walker random walk and probabilistic (not random) forwarding. Another interesting algorithm is Two-Level Random Walk, in which walkers search for an object in two levels, which reduces the redundancy of nodes.

5. Adaptive Probability Search (APS) [6]
In the Adaptive Probabilistic Search (APS) [6], it is assumed that the storage of objects and their copies in the network follows a replication distribution. The number of query requests for each object follows a query distribution. The search process does not affect object placement and the P2P overlay topology.

APS is based on k-walker random walk and probabilistic (not random) forwarding. The querying node simultaneously deploys k walkers. On receiving the query, each node looks up its local repository for the desired object. If the object is found, the walker stops successfully. Otherwise, the walker continues: the node forwards the query to the best neighbor, that is, the one with the highest probability value. The probability values are computed based on the results of past queries and are updated based on the result of the current query. Query processing continues until all k walkers terminate, either successfully or by failing (in which case the TTL limit is reached).

To select neighbors probabilistically, each node keeps a local index about its neighbors. There is one index entry for each object which the node has requested, or forwarded requests for, through each neighbor. The value of an index entry for an object and a neighbor represents the relative probability of that neighbor being selected for forwarding a query for that object. The higher the index entry value, the higher the probability. Initially, all index values are assigned the same value. Then, the index values are updated as follows. When the querying node forwards a query, it makes a guess about the success of all the walkers. The guess is made based on the ratio of successful walkers in the past. If it assumes that all walkers will succeed (optimistic approach), the querying node proactively increases the index values associated with the chosen neighbors and the queried object. Otherwise (pessimistic approach), the querying node proactively decreases the index values. Using the guess determined by the querying node, every node on the query path updates the index values similarly when forwarding the query.

Upon walker termination, if the walker is successful, there is nothing to be done in the optimistic approach. If the walker fails, index values relative to the requested object along the walker’s path must be corrected. Using information available inside the search message, the last node in the path sends an “update” message to the preceding node. This node, after receiving the update message, decreases its index value for the last node to reflect the failure. The update procedure continues along the reverse path towards the requester, with intermediate nodes decreasing their local index values relative to the next hops for that walker. Finally, the requester decreases its index value that relates to its neighbour for that walker. If we employ the pessimistic approach, this update procedure takes place after a walker succeeds, having nodes increase the index values along the walker’s path. There is nothing to be done when a walker fails.

Figure 1. Searching object using pessimistic approach of APS with walkers.

Figure 1 shows an example of how the search process works. Node A initiates a request for an object owned by node F using two walkers. Assume that all index values relative to this object are initially equal to 30 and the pessimistic approach is used. The paths of the two walkers are shown with thicker arrows. During the search, the index value for a chosen neighbour is reduced by 10. One walker with path (A,B,C,D) fails, while the second with path (A,E,F) finds the object. The update process is initiated for the successful walker on the reverse path (along the dotted arrows). First node E, then node A increase the value of their indices for their next hops (nodes F, E respectively) by 20 to indicate object discovery through that path. In a subsequent search for the same object, peer A will choose peer B with probability 2/9 (= 20/(20+40+30)), peer E with probability 4/9 and peer G with probability 3/9.

APS requires no message exchange on any dynamic operation such as node arrivals or departures and object insertions or deletions. The nature of the indices makes the handling of these operations simple: if a node detects the arrival of a new neighbour, it will associate some initial index value with that neighbour when a search takes place. If a neighbour disconnects from the network, the node removes the relative entries and stops considering it in future queries. No action is required after object updates, since indices are not related to file content. So, although the algorithm actively uses information, its maintenance cost on any of these events is zero, a major advantage over most current approaches.

5.1 Discussion on APS
Each node stores a relative probability (an unsigned integer) for each of its neighbours for each requested object. So for R such objects and N neighbours, O(R x N) space is needed. For a typical network node, this amount of space is not a burden. On nodes with limited storage capacities, index values for objects not requested for some time can be erased. This can be achieved by assigning a time-to-expire value on each newly-created or updated index. Each search or update message carries path information, storing a maximum of TTL peer addresses. Alternatively, each node can associate the search and requester node IDs with the preceding peer in the path of the walker. Updates then follow the reverse path back to the requester. This information expires after a certain amount of time.

The number of messages exchanged by the APS method in the worst case is (2 x k x TTL), where all k walkers travel TTL hops and then invoke the update procedure, so the method has the same complexity as its random-walk counterpart. The only extra messages that occur in APS are the update messages along the reverse path. This is where the two index update policies are used.

Along the paths of all k walkers, indices are updated so that better next-hop choices are made with higher probability. The learning feature includes both positive and negative feedback from the walkers in both update approaches. In the pessimistic approach, each node on the
walker’s path decreases the relative probability of its next
hop for the requested object concurrently with the search. If
the walker succeeds, the update procedure increases those
index values by more than the subtracted amount (positive
feedback). So, if the initial probability of a node for a certain
object was P, it becomes bigger than P if the object was
discovered through (or at) that node and smaller than P if
the walker failed. Conversely, if many of our walkers hit
their targets on average, the optimistic approach should be
considered. This is the only invariant we require from our
update process.

The learning process in the optimistic approach operates in the opposite fashion. Learning is important to achieve both high performance and discovery of newly inserted objects. Unlearning helps the search process adjust to object deletions and node departures, redirecting the walkers elsewhere. All the nodes participating in the search benefit from the process.

Besides standard resource sharing in P2P systems, APS achieves the distribution of search knowledge over a large number of peers.

6. Performance of APS [6]
The main metrics used to evaluate the performance of a search algorithm are the success rate, the number of discovered objects (hits per query), and the number of messages produced.

Figure 2. Success rate vs. number of deployed walkers for APS and random walk algorithms

Figure 3. Message production vs. number of deployed walkers for APS and random walk algorithms

Figure 4. Hits per query vs. number of deployed walkers for APS and random walk algorithms

7. Two-Level Random Walk [7]
It is an efficient search algorithm which increases the total number of nodes searched for a certain total number of search steps, and reduces the redundancy, i.e. the average number of times a particular node is searched. It works in the following manner. When a node wishes to send a query with a certain search key, it composes a search message and broadcasts it to k1 randomly selected neighbours. The message has an initial TTL1 = l1 hops. When an intermediate node receives this message, it checks the TTL1 timer. If the latter is still more than 0, it decrements the timer by one, selects one random neighbour, and forwards the message to it. This process continues until one of the nodes, say node E, receives the message with an expired TTL1 timer (i.e. TTL1 = 0). We call such a node an edge node. The message will then “explode” into k2 search messages forwarded from this node. Specifically, node E will compose a message with TTL1 = 0 and a second timer TTL2 = l2. It will then randomly select k2 of its neighbours, excluding the one it just received the message from, and broadcast the message to them. Figure 1 shows an example illustrating this process.

At level one, a source node sends k1 messages to a set of k1 randomly selected neighbours. This constitutes k1 threads (or random walks) which travel from the source node to the edge nodes (nodes where TTL1 expires). Each of the k1 threads will then explode into k2 threads (with TTL2 = l2) at each of the k1 edge nodes. This algorithm reduces redundancy by decreasing the average number of times a node is searched. In the one-level k-walk algorithm, k random threads are generated from the source and they are likely to have “thread collisions” (i.e. threads run into each other), especially near the source. This results in redundant hits on the same nodes (nodes being searched multiple times). On the other hand, the two-level algorithm sends fewer threads from the source node, which results in a smaller probability of thread collisions near the source. Each of the k1 threads will then explode into k2 threads once it is “sufficiently” far away from the source and the other threads. This way, the same number of search threads can be generated (k = k1*k2) but with a larger number of nodes searched and a smaller probability of redundant searches of the same nodes using the same number of total search steps.
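
The two-level procedure described in this section can be sketched in Python as follows. This is a minimal illustration, not the protocol of [7] itself: the function names, the complete-graph overlay, the seed, and the redundancy measure (probes divided by distinct nodes probed) are assumptions, and threads are simulated one after another rather than concurrently.

```python
import random

def random_walk(graph, start, came_from, ttl, rng, probed):
    """Forward one message for `ttl` hops, one random neighbour per hop.

    Appends every node the message reaches to `probed` and returns
    (last_node, node_it_came_from).
    """
    node = start
    for _ in range(ttl):
        came_from, node = node, rng.choice(graph[node])
        probed.append(node)
    return node, came_from

def two_level_walk(graph, source, k1, l1, k2, l2, rng):
    """Return the list of probed nodes (with repeats) for one query."""
    probed = []
    # Level 1: k1 randomly selected neighbours each start a thread with TTL1 = l1.
    for first in rng.sample(graph[source], min(k1, len(graph[source]))):
        probed.append(first)
        edge, came_from = random_walk(graph, first, source, l1, rng, probed)
        # Level 2: the edge node "explodes" into k2 threads with TTL2 = l2,
        # excluding the neighbour it received the message from.
        candidates = [n for n in graph[edge] if n != came_from]
        for second in rng.sample(candidates, min(k2, len(candidates))):
            probed.append(second)
            random_walk(graph, second, edge, l2, rng, probed)
    return probed

# Hypothetical overlay: a complete graph on 8 peers.
peers = list(range(8))
overlay = {p: [q for q in peers if q != p] for p in peers}

probed = two_level_walk(overlay, 0, k1=2, l1=2, k2=3, l2=2, rng=random.Random(7))
redundancy = len(probed) / len(set(probed))  # average probes per distinct node
print(len(probed), len(set(probed)), round(redundancy, 2))
```

With these parameters every query spends k1*(1 + l1) + k1*k2*(1 + l2) = 24 search steps; how many distinct nodes those steps cover depends on the overlay and the random choices.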

8. Enhancing Performance of APS
The proposed algorithm uses a two-level random walk for the existing APS algorithm [6] instead of a k-walker random walk. The advantage of the two-level walk [7] over the one-level walk is that it increases the total number of nodes searched for a certain total number of search steps, and reduces the redundancy, i.e. the average number of times a particular node is searched. So collisions of nodes can be further reduced and distant objects can be searched efficiently. The two-level walk will also help in further reducing message overhead. The only disadvantage is an increase in response time.

9. Algorithm of proposed Technique
Assumptions

k1 = k2 = k = k3 - number of walkers in each level
ttlcount - counter for ttl value
l1 = ttl2 = ttl - time to live for each level
level - variable for level number
kcount - counter for k3, i.e. number of walkers

Select a querying node
level = 1

while (level <= 2)
{
    kcount = 0
    while (kcount < k3)
    {
        ttlcount = 0
        while (ttlcount < ttl)
        {
            select a neighbouring node by applying APS and process the node;
            if object is not found
            then
                increment ttlcount by one;
                continue;
            else
                come out of all loops (exit);
        }
        increment kcount by one;
    }
    increment level by one;
}

10. Conclusion
In this work, various searching techniques for unstructured P2P networks are studied and compared. A new search technique is proposed which helps in further enhancing the performance of APS.

References

[1] Stephanos Androutsellis-Theotokis, "A Survey of Peer-to-Peer File Sharing Technologies", White Paper, ELTRUN, Athens University of Economics and Business, Greece, 2002.
[2] V. Vishnumurthy and P. Francis, "On heterogeneous overlay construction and random node selection in unstructured P2P networks", in Proc. IEEE Infocom, 2006.
[3] M. A. Jovanovic, "Modelling large-scale peer-to-peer networks and a case study of Gnutella", Master's thesis, Department of Electrical and Computer Engineering and Computer Science, University of Cincinnati, June 2000.
[4] Xiuqi Li and Jie Wu, "Searching Techniques in Peer-to-Peer Networks", Department of Computer Science and Engineering, Florida Atlantic University.
[5] V. Vishnumurthy and P. Francis, "A comparison of structured and unstructured P2P approaches to heterogeneous random peer selection", in Proc. Usenix Annual Technical Conference, 2007.
[6] D. Tsoumakos and N. Roussopoulos, "Adaptive Probabilistic Search (APS) for Peer-to-Peer Networks", Technical Report CS-TR-4451, University of Maryland, 2003.
[7] Imad Jawhar and Jie Wu, "A Two-Level Random Walk Search Protocol for Peer-to-Peer Networks", Department of Computer Science and Engineering, Florida Atlantic University.
[8] Beverly Yang and Hector Garcia-Molina, "Improving Search in Peer-to-Peer Networks", Computer Science Department, Stanford University.
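
As a supplementary sketch, the loop structure of Section 9 combined with the pessimistic index handling of Section 5 might be expressed in Python roughly as follows. It is only an illustration under several assumptions that the paper does not state: walkers run one after another rather than simultaneously, second-level walkers start at the peers where first-level walkers ran out of TTL (following Section 7), index values never drop below 1, and the overlay edges used to replay the Figure 1 example are hypothetical. The constants 30, 10, and 20 are the ones used in that example; the names `Peer`, `penalise`, `reward`, `aps_walker`, and `two_level_aps_search` are invented for this sketch.

```python
import random
from fractions import Fraction

INITIAL, PENALTY, REWARD = 30, 10, 20  # numbers from the Figure 1 example

class Peer:
    def __init__(self, name, objects=()):
        self.name, self.objects = name, set(objects)
        self.neighbours = []  # list of Peer
        self.index = {}       # (object, neighbour name) -> relative weight

    def weight(self, obj, nb):
        return self.index.setdefault((obj, nb.name), INITIAL)

    def probabilities(self, obj):
        total = sum(self.weight(obj, nb) for nb in self.neighbours)
        return {nb.name: Fraction(self.weight(obj, nb), total)
                for nb in self.neighbours}

def penalise(peer, nxt, obj):
    # Pessimistic guess: lower the index when the query is forwarded.
    # Clamping at 1 keeps every neighbour selectable (an assumption).
    peer.index[(obj, nxt.name)] = max(1, peer.weight(obj, nxt) - PENALTY)

def reward(path, obj):
    # A successful walker triggers updates along the reverse path.
    for peer, nxt in reversed(path):
        peer.index[(obj, nxt.name)] = peer.weight(obj, nxt) + REWARD

def aps_walker(start, obj, ttl, rng):
    """One walker; returns (found, peer where the walker stopped)."""
    path, peer = [], start
    for _ in range(ttl):
        weights = [peer.weight(obj, nb) for nb in peer.neighbours]
        nxt = rng.choices(peer.neighbours, weights=weights)[0]
        penalise(peer, nxt, obj)
        path.append((peer, nxt))
        peer = nxt
        if obj in peer.objects:
            reward(path, obj)
            return True, peer
    return False, peer

def two_level_aps_search(source, obj, k, ttl, rng):
    edge_peers = []
    for _ in range(k):                      # level 1
        found, last = aps_walker(source, obj, ttl, rng)
        if found:
            return True
        edge_peers.append(last)
    for edge in edge_peers:                 # level 2
        for _ in range(k):
            found, _last = aps_walker(edge, obj, ttl, rng)
            if found:
                return True
    return False

def link(a, b):
    a.neighbours.append(b)
    b.neighbours.append(a)

# Replay of the Figure 1 example on a hypothetical set of overlay edges.
peers = {n: Peer(n) for n in "ABCDEFG"}
peers["F"].objects.add("obj")
for a, b in ["AB", "BC", "CD", "AE", "EF", "AG"]:
    link(peers[a], peers[b])
A = peers["A"]
failed = [(peers["A"], peers["B"]), (peers["B"], peers["C"]), (peers["C"], peers["D"])]
succeeded = [(peers["A"], peers["E"]), (peers["E"], peers["F"])]
for path in (failed, succeeded):
    for peer, nxt in path:
        penalise(peer, nxt, "obj")
reward(succeeded, "obj")
print(A.probabilities("obj"))
```

The replay reproduces the selection probabilities 2/9, 4/9, and 3/9 (the last shown in lowest terms as 1/3) given for peer A in Section 5.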
