
Computer Networks 50 (2006) 3485–3521
www.elsevier.com/locate/comnet
Survey of research towards robust peer-to-peer networks: Search methods
John Risson *, Tim Moors
School of Electrical Engineering and Telecommunications, University of New South Wales, High Street, Sydney, NSW 2052, Australia
Received 7 July 2005; received in revised form 2 February 2006; accepted 6 February 2006
Available online 3 March 2006
Responsible Editor: I.F. Akyildiz
Abstract
The pace of research on peer-to-peer (P2P) networking in the last five years warrants a critical survey. P2P has the makings of a disruptive technology: it can aggregate enormous storage and processing resources while minimizing entry and scaling costs. Failures are common amongst massive numbers of distributed peers, though the impact of individual failures may be less than in conventional architectures. Thus the key to realizing P2P's potential in applications other than casual file sharing is robustness.

P2P search methods are first couched within an overall P2P taxonomy. P2P indexes for simple key lookup are assessed, including those based on Plaxton trees, rings, tori, butterflies, de Bruijn graphs and skip graphs. Similarly, P2P indexes for keyword lookup, information retrieval and data management are explored. Finally, early efforts to optimize range, multi-attribute, join and aggregation queries over P2P indexes are reviewed. Insofar as they are available in the primary literature, robustness mechanisms and metrics are highlighted throughout. However, the low-level mechanisms that most affect robustness are not well isolated in the literature. Furthermore, there has been little consensus on robustness metrics. Recommendations are given for future research.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Peer-to-peer network; Distributed hash table; Consistent hashing; Scalable distributed data structure; Vector model; Latent semantic indexing; Dependability; Torus; Butterfly network; Skip graph; Plaxton tree; de Bruijn graph

1. Introduction

Peer-to-peer (P2P) networks are those that exhibit three characteristics: self-organization, symmetric communication and distributed control [1]. A self-organizing P2P network automatically adapts to the arrival, departure and failure of nodes [2]. Communication is symmetric in that peers act as both clients and servers. It has no centralized directory or control point. USENET servers or BGP peers have these traits [3] but the emphasis here is on the flurry of research since 2000. Leading examples include Gnutella [4], Freenet [5], Pastry [2], Tapestry [6], Chord [7], the Content Addressable Network (CAN) [8], pSearch [9] and Edutella [10]. Some have suggested that peers are inherently unreliable [11]. Others have assumed well-connected, stable peers [12].

* Corresponding author. Tel.: +61 410 285 215; fax: +61 2 9385 5993.
E-mail addresses: jr@tut.com (J. Risson), t.moors@unsw.edu.au (T. Moors).
1389-1286/$ - see front matter © 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.comnet.2006.02.001

This critical survey of P2P academic literature is warranted, given the intensity of recent research. At the time of writing, one research database lists over 5800 P2P publications [13]. One vendor surveyed P2P products and deployments [14]. There is also a tutorial survey of leading P2P systems [15]. DePaoli and Mariani recently reviewed the dependability of some early P2P systems at a high level [16]. The need for a critical survey was flagged in the peer-to-peer research group of the Internet Research Task Force (IRTF) [17].

P2P is potentially a disruptive technology with numerous applications, but this potential will not be realized unless it is demonstrated to be robust. A massively distributed search technique may yield numerous practical benefits for applications [18]. A P2P system has potential to be more dependable than architectures relying on a small number of centralized servers. It has potential to evolve better from small configurations: the capital outlays for high performance servers can be reduced and spread over time if a P2P assembly of general purpose nodes is used. A similar argument motivated the deployment of distributed databases: one thousand, off-the-shelf PC processors are more powerful and much less expensive than a large mainframe computer [19]. Storage and processing can be aggregated to achieve massive scale. Wasteful partitioning between servers or clusters can be avoided. As Gedik and Liu put it, "if P2P is to find its way into applications other than casual file sharing, then reliability needs to be addressed" [20].

The taxonomy of Fig. 1 divides the entire body of P2P research literature along four lines: search, storage, security and applications. This survey concentrates on search aspects. A P2P search network consists of an underlying index (Sections 2–4) and queries that propagate over that index (Section 5).

Fig. 1. Classification of P2P research literature [1,2,4,6,7,18,20–228].

This survey is concerned with two questions. The first is "How do P2P search networks work?" This foundation is important given the pace and breadth of P2P research in the last five years. In Section 2, we classify indexes as local, centralized and distributed. Since distributed indexes are becoming dominant, they are given closer attention in Sections 3 and 4. Section 3 gives a two-tiered comparison of distributed P2P indexes for simple key lookup. The top tier explores their overall origins (Section 3.1) and robustness (Section 3.2). The second tier (Sections 3.3–3.8) divides them by index structure, in particular Plaxton trees, rings, tori, butterflies, de Bruijn graphs and skip graphs. Section 4 reviews distributed P2P indexes supporting keyword lookup (Section 4.1) and information retrieval (Section 4.2). Section 5 probes the embryonic research on P2P
queries, in particular, range queries (Section 5.1), multi-attribute queries (Section 5.2), join queries (Section 5.3) and aggregation queries (Section 5.4).

The second question is "How robust are P2P search networks?" Insofar as it is available in the research literature, we tease out the robustness mechanisms and metrics throughout Sections 2–5. Unfortunately, robustness is often more sensitive to low-level design choices than it is to the broad P2P index structure, yet these underlying design choices are seldom isolated in the primary literature [229]. Furthermore, there has been little consensus on P2P robustness metrics (Section 3.2). Section 6 gives recommendations to address these important gaps.

1.1. Related disciplines

Peer-to-peer research draws upon numerous distributed systems disciplines. Networking researchers will recognize familiar issues of naming, routing and congestion control. P2P designs need to address routing and security issues across network region boundaries [152]. Networking research has traditionally been host-centric. The web's Universal Resource Identifiers are naturally tied to specific hosts, making object mobility a challenge [216]. P2P work is data-centric [230]. P2P systems for dynamic object location and routing have borrowed heavily from the distributed systems corpus. Some have used replication, erasure codes and Byzantine agreement [111]. Others have used epidemics for durable peer group communication [39].

Similarly, P2P research is set to benefit from database research [231]. Database researchers will recognize the need to reapply Codd's principle of physical data independence, that is, to decouple data indexes from the applications that use the data [23]. It was the invention of appropriate indexing mechanisms and query optimizations that enabled data independence. Database indexes like B+ trees have an analog in P2P's distributed hash tables (DHTs). Wide-area, P2P query optimization is a ripe, but challenging, area for innovation.

More flexible distribution of objects comes with increased security risks. There are opportunities for security researchers to deliver new methods for availability, file authenticity, anonymity and access control [25]. Proactive and reactive mechanisms are needed to deal with large numbers of autonomous, distributed peers. To build robust systems from cooperating but self-interested peers, issues of identity, reputation, trust and incentives need to be tackled. Although it is beyond the scope of this paper, robustness against malicious attacks also ought to be addressed [195].

Possibly the largest portion of P2P research has majored on basic routing structures [18], where research on algorithms comes to the fore. Should the overlay be "structured" or "unstructured"? Are the two approaches competing or complementary? Comparisons of the structured approaches (hypercubes, rings, toroids, butterflies, de Bruijn and skip graphs) have weighed the amount of routing state per peer and the number of links per peer against overlay hop-counts. While unstructured overlays initially used blind flooding and random walks, overheads usually trigger some structure, for example super-peers and clusters.

P2P applications rely on cooperation between these disciplines. Applications have included file sharing, directories, content delivery networks, email, distributed computation, publish-subscribe middleware, multicasting, and distributed authentication. Which applications will be suited to which structures? Are there adaptable mechanisms which can decouple applications from the underlying data structures? What are the criteria for selection of applications amenable to a P2P design [1]?

Robustness is emphasized throughout the survey. We are particularly interested in two aspects. The first, dependability, was a leading design goal for the original Internet [232]. It deserves the same status in P2P. The measures of dependability are well established: reliability, a measure of the mean-time-to-failure (MTTF); availability, a measure of both the MTTF and the mean-time-to-repair (MTTR) (footnote 1); maintainability; and safety [233]. The second aspect is the ability to accommodate variation in outcome, which one could call adaptability. Its measures have yet to be defined. In the context of the Internet, it was only recently acknowledged as a first class requirement [234]. In P2P, it means planning for the tussles over resources and identity. It means handling different kinds of queries and accommodating changeable application requirements with minimal intervention. It means organic scaling [22], whereby the system grows gracefully, without a priori data center costs or architectural breakpoints.

Footnote 1: Traditionally, availability = MTTF / (MTTF + MTTR).
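
As a small worked example of the availability measure in footnote 1, the following sketch evaluates MTTF / (MTTF + MTTR) for a single peer. The function name and the sample figures are illustrative assumptions, not values taken from the survey or from [233].

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability = MTTF / (MTTF + MTTR), per footnote 1."""
    if mttf_hours < 0 or mttr_hours < 0 or (mttf_hours + mttr_hours) == 0:
        raise ValueError("MTTF and MTTR must be non-negative and not both zero")
    return mttf_hours / (mttf_hours + mttr_hours)


if __name__ == "__main__":
    # Hypothetical peer: fails every 99 hours on average, repaired in 1 hour.
    print(availability(99.0, 1.0))
```

For the hypothetical figures above, the peer is available 99% of the time. Shortening the repair time raises availability even when the failure rate is unchanged, which is why the measure depends on both MTTF and MTTR.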

In the following section, we discuss one notable omission from the taxonomy of P2P networking in Fig. 1: routing.

1.2. Structured and unstructured routing

P2P routing algorithms have been classified as "structured" or "unstructured". Early instantiations of Gnutella were unstructured: keyword queries were flooded widely [235]. Napster [236] had decentralized content and a centralized index, so only partially satisfies the distributed control criteria for P2P systems. Early structured algorithms included Plaxton, Rajaraman and Richa (PRR) [30], Pastry [2], Tapestry [31], Chord [7] and the Content Addressable Network [8]. Mishchke and Stiller recently classified P2P systems by the presence or absence of structure in routing tables and network topology [237].

Some have cast unstructured and structured algorithms as competing alternatives. Unstructured approaches have been called "first generation", implicitly inferior to the "second generation" structured algorithms [2,31]. When generic key lookups are required, these structured, key-based routing schemes can guarantee location of a target within a bounded number of hops [23]. The broadcasting unstructured approaches, however, may have large routing costs, or fail to find available content [22]. Despite the apparent advantages of structured P2P, several research groups are still pursuing unstructured P2P.

There have been two main criticisms of structured systems [61]. The first relates to peer transience, which in turn affects robustness. Chawathe et al. opined that highly transient peers are not well supported by DHTs [61]. P2P systems often exhibit "churn", with peers continually arriving and departing. One objection to concerns about highly transient peers is that many applications use peers in well-connected parts of the network. The Tapestry authors analysed the impact of churn in a network of 1000 nodes [31]. Others opined that it is possible to maintain a robust DHT at relatively low cost [238]. Very few papers have quantitatively compared the resilience of structured systems. Loguinov et al. claimed that there were only two such works [24,36].

The second criticism of structured systems is that they do not support keyword searches and complex queries as well as unstructured systems. Given the current file-sharing deployments, keyword searches seem more important than exact-match key searches in the short term. Paraphrased, "most queries are for hay, not needles" [61].

More recently, some have justifiably seen unstructured and structured proposals as complementary, not competing [239]. Their starting point was the observation that unstructured flooding or random walks are inefficient for data that is not highly replicated across the P2P network. Structured graphs can find keys efficiently, irrespective of replication. Castro et al. proposed Structella, a hybrid of Gnutella built on top of Pastry [239]. Another design used structured search for rare items and unstructured search for massively replicated items [54].

However, the "structured versus unstructured routing" taxonomy is becoming less useful, for two reasons. Firstly, most "unstructured" proposals have evolved and incorporated structure. Consider the classic "unstructured" system, Gnutella [4]. For scalability, its peers are either ultrapeers or leaf nodes. This hierarchy is augmented with a query routing protocol whereby ultrapeers receive a hashed summary of the resource names available at leaf-nodes. Between ultrapeers, simple query broadcast is still used, though methods to reduce the query load here have been considered [240]. Secondly, there are emerging schema-based P2P designs [59], with super-node hierarchies and structure within documents. These are quite distinct from the structured DHT proposals.

Given that most, if not all, P2P designs today assume some structure, a more instructive taxonomy would describe the structure. In this survey, we use a database taxonomy in lieu of the networking taxonomy, as suggested by Hellerstein, Cooper and Garcia-Molina [23,241]. The structure is determined by the type of index. Queries feature in lieu of routing. The DHT algorithms implement a "semantic-free" index [216]. They are oblivious of whether keys represent document titles, meta-data, or text. Gnutella-like and schema-based proposals have a "semantic" index.

1.3. Indexing and searching

Index engineering is at the heart of P2P search methods. It captures a broad range of P2P issues, as demonstrated by the Search/Index Links model [241]. As Manber put it, "the most important of the tools for information retrieval is the index, a collection of terms with pointers to places where information about documents can be found" [242]. Sen and Wang noted that a P2P network usually consists of connections between hosts for application-layer signaling, rather than for the data transfer itself [243]. Similarly, we concentrate on the signaled indexes and queries.

Our focus here is the dependability and adaptability of the search network. Static dependability is a measure of how well queries route around failures in a network that is normally fault-free. Dynamic dependability gives an indication of query success when nodes and data are continually joining and leaving the P2P system. An adaptable index accommodates change in the data and query distribution. It enables data independence, in that it facilitates changes to the data layout without requiring changes to the applications that use the data [23]. An adaptable P2P system can support rich queries for a wide range of applications. Some applications benefit from simple, semantic-free key lookups [244]. Others require more complex, Structured Query Language (SQL)-like queries to find documents with multiple keywords, or to aggregate or join query results from distributed relations [22].

2. Index types

A P2P index can be local, centralized or distributed. With a local index, a peer only keeps the references to its own data, and does not receive references for data at other nodes. The very early Gnutella design epitomized the local index (Section 2.1). In a centralized index, a single server keeps references to data on many peers. The classic example is Napster (Section 2.2). With distributed indexes, pointers towards the target reside at several nodes. One very early example is Freenet (Section 2.3). Distributed indexes are used in most P2P designs nowadays; they dominate this survey.

P2P indexes can also be classified as non-forwarding and forwarding. When queries are guided by a non-forwarding index, they jump to the node containing the target data in a single hop. There have been semantic and semantic-free one-hop schemes [138,245,246]. Where scalability to a massive number of peers is required, these schemes have been extended to two-hops [247,248]. More common are the forwarding P2Ps where the number of hops varies with the total number of peers, often logarithmically. The related tradeoffs between routing state, lookup latency, update bandwidth and peer churn are critical to total system dependability.

2.1. Local index

P2Ps with a purely local data index are becoming rare. In such designs, peers flood queries widely and only index their own content. They enable rich queries: the search is not limited to a simple key lookup. However, they also generate a large volume of query traffic with no guarantee that a match will be found, even if it does exist on the network. For example, to find potential peers on the early instantiations of Gnutella, "ping" messages were broadcast over the P2P network and the "pong" responses were used to build the node index. Then small "query" messages, each with a list of keywords, are broadcast to peers which respond with matching filenames [4].

There have been numerous attempts to improve the scalability of local-index P2P networks. Gnutella uses fixed time-to-live (TTL) rings, where the query's TTL is set less than 7–10 hops [4]. Small TTLs reduce the network traffic and the load on peers, but also reduce the chances of a successful query hit. One paper reported, perhaps a little too bluntly, that the fixed "TTL-based mechanism does not work" [67]. To address this TTL selection problem, they proposed an expanding ring, known elsewhere as iterative deepening [29]. It uses successively larger TTL counters until there is a match. The flooding, ring and expanding ring methods all increase network load with duplicated query messages. A random walk, whereby an unduplicated query wanders about the network, does indeed reduce the network load but massively increases the search latency. One solution is to replicate the query k times at each peer. Called random k-walkers, this technique can be coupled with TTL limits, or periodic checks with the query originator, to cap the query load [67]. Adamic et al. suggested that the random walk searches be directed to nodes with higher degree, that is, with larger numbers of inter-peer connections [249]. They assumed that higher-degree peers are also capable of higher query throughputs. However, without some balancing design rule, such peers would be swamped with the entire P2P signaling traffic. In addition to the above approaches, there is the "directed breadth-first" algorithm [29]. It forwards queries within a subset of peers selected according to heuristics on previous performance, like the number of successful query results. Another algorithm, called probabilistic flooding, has been modeled using percolation theory [250].
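
To make the trade-off between these local-index search methods concrete, the following is a minimal sketch, under stated assumptions, of TTL-limited flooding, an expanding ring (iterative deepening) and random k-walkers over an in-memory adjacency list. The function names, the TTL schedule `(1, 2, 4, 7)`, the choice to launch all k walkers from the originating peer, and the toy topology are assumptions made for illustration; they are not taken from [4], [29] or [67].

```python
import random
from collections import deque


def flood(graph, start, has_match, ttl):
    """TTL-limited breadth-first flood.

    Returns (first matching node or None, number of query messages sent).
    """
    seen = {start}
    frontier = deque([(start, 0)])
    messages = 0
    while frontier:
        node, hops = frontier.popleft()
        if has_match(node):
            return node, messages
        if hops == ttl:
            continue  # TTL exhausted: this peer does not forward the query
        for neighbour in graph[node]:
            messages += 1  # duplicate deliveries still cost a message
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, hops + 1))
    return None, messages


def expanding_ring(graph, start, has_match, ttls=(1, 2, 4, 7)):
    """Re-flood with successively larger TTLs until there is a match."""
    total = 0
    for ttl in ttls:
        hit, sent = flood(graph, start, has_match, ttl)
        total += sent
        if hit is not None:
            return hit, ttl, total
    return None, None, total


def k_walkers(graph, start, has_match, k=4, ttl=32, seed=0):
    """k independent random walks from the originator, each capped at ttl steps."""
    rng = random.Random(seed)
    walkers = [start] * k
    messages = 0
    for _ in range(ttl):
        for i in range(k):
            if has_match(walkers[i]):
                return walkers[i], messages
            walkers[i] = rng.choice(graph[walkers[i]])
            messages += 1
    for node in walkers:  # check where the walkers stopped
        if has_match(node):
            return node, messages
    return None, messages


if __name__ == "__main__":
    # Hypothetical six-peer line topology; peer 3 holds the only match.
    line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
    print(expanding_ring(line, 0, lambda peer: peer == 3))
    print(k_walkers(line, 0, lambda peer: peer == 3))
```

On this toy topology the expanding ring succeeds on its third attempt (TTL 4), and its message count includes the two earlier failed floods. That is the duplicated load the text describes. The walkers send at most k × TTL messages but may take many steps, or fail, before reaching the match.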

Several measurement studies have investigated locally indexed P2Ps. Jovanovic noted Gnutella's power law behaviour [70]. Sen and Wang compared the performance of Gnutella, FastTrack [251] and Direct Connect [243,252,253] (footnote 2). At the time, only Gnutella used local data indexes. All three schemes now use distributed data indexes, with hierarchy in the form of Ultrapeers (Gnutella), Super-Nodes (FastTrack) and Hubs (Direct Connect). It was found that a very small percentage of peers have a very high degree and that the total system dependability is at the mercy of such peers. While peer up-time and bandwidth were heavy-tailed, they did not fit well with the Zipf distribution. Fortunately for Internet Service Providers, measures aggregated by IP prefix and Autonomous System (AS) were more stable than for individual IP addresses. A study of University of Washington traffic found that Gnutella and KaZaa together contributed 43% of the university's total TCP traffic [254]. They also reported a heavy-tailed distribution, with 600 external peers (out of 281,026) delivering 26% of KaZaa bytes to internal peers. Furthermore, objects retrieved from the P2P network were typically three orders of magnitude larger than web objects: 300 objects contributed to almost half of the total outbound KaZaa bandwidth. Others reported Gnutella's topology mismatch, whereby only 2–5% of P2P connections link peers in the same AS, despite over 40% of peers being in the top 10 ASes [65]. Together these studies underscore the significance of multimedia sharing applications. They motivate interesting caching and locality solutions to the topology mismatch problem.

These same studies bear out one main dependability lesson: total system dependability may be sensitive to the dependability of high degree peers. The designers of Scamp translated this observation to the design heuristic, "have the degree of each node be of nearly equal size" [153]. They analyzed a system of $N$ peers, with mean degree $c \log N$, where link failures occur independently with probability $\varepsilon$. If $\delta > 0$ is fixed and $c > (1 + \delta)/(-\log \varepsilon)$ then the probability of graph disconnection goes to zero as $N \to \infty$. Otherwise, if $c < (1 - \delta)/(-\log \varepsilon)$ then the probability of disconnection goes to one as $N \to \infty$. They presented a localizer, which finds approximate minima to a global function of peer degree and arbitrary link costs using only local information. The Scamp overlay construction algorithms could support any of the flooding and walking routing schemes above, or other epidemic and multicasting schemes for that matter. Resilience to high churn rates was identified for future study.

2.2. Central index

Centralized schemes like Napster [236] are significant because they were the first to demonstrate the P2P scalability that comes from separating the data index from the data itself. Ultimately 36 million Napster users lost their service not because of technical failure, but because the single administration was vulnerable to the legal challenges of record companies [255].

There has since been little research on P2P systems with central data indexes. Such systems have also been called "hybrid" since the index is centralized but the data is distributed. Yang and Garcia-Molina devised a four-way classification of hybrid systems [256]: unchained servers, where users whose index is on one server do not see other servers' indexes; chained servers, where the server that receives a query forwards it to a list of servers if it does not own the index itself; full replication, where all centralized servers keep a complete index of all available metadata; and hashing, where keywords are hashed to the server where the associated inverted list is kept. The unchained architecture was used by Napster, but it has the disadvantage that users do not see all indexed data in the system. Strictly speaking, the other three options illustrate the distributed data index, not the central index. The chained architecture was recommended as the optimum for the music-swapping application at the time. The methods by which clients update the central index were classified as batch or incremental, with the optimum determined by the query-to-login ratio. Measurements were derived from a clone of Napster called OpenNap [257]. Another study of live Napster data reported wide variation in the availability of peers, a general unwillingness to share files (20–40% of peers share few or no files), and a common understatement of available bandwidth so as to discourage other peers from sharing one's link [202].
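
The "hashing" variant of the hybrid classification can be illustrated with a short sketch: each keyword is hashed to the one server that keeps its inverted list, so a multi-keyword query contacts only the servers its keywords map to and intersects their lists. The class name, method names, whitespace tokenization and the SHA-1-modulo-server-count placement rule are all assumptions chosen for illustration; [256] is not the source of these details.

```python
import hashlib
from collections import defaultdict


class HashedIndex:
    """Toy model of index servers that partition inverted lists by keyword hash."""

    def __init__(self, num_servers):
        # One keyword -> set of (peer, filename) postings map per index server.
        self.servers = [defaultdict(set) for _ in range(num_servers)]

    def server_for(self, keyword):
        """Map a keyword to the index server responsible for its inverted list."""
        digest = hashlib.sha1(keyword.lower().encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % len(self.servers)

    def publish(self, peer, filename):
        """Register a file held by `peer` under each keyword of its name."""
        for keyword in filename.lower().split():
            self.servers[self.server_for(keyword)][keyword].add((peer, filename))

    def query(self, keywords):
        """Return postings that match every keyword (conjunctive query)."""
        result = None
        for keyword in keywords:
            keyword = keyword.lower()
            postings = self.servers[self.server_for(keyword)].get(keyword, set())
            result = set(postings) if result is None else result & postings
        return result if result is not None else set()


if __name__ == "__main__":
    index = HashedIndex(num_servers=4)
    index.publish("peerA", "blue moon")
    index.publish("peerB", "blue sky")
    print(index.query(["blue", "moon"]))
```

In this sketch the data itself stays on the peers and only the postings are centralized. Note that a modulo placement rule moves most keywords when the server count changes, which is the weakness of standard hashing that Consistent Hashing addresses.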

Footnote 2: Bearshare and Limewire clients use Gnutella. KaZaa and Grokster clients use FastTrack. When Sen and Wang wrote their 2002 paper, Morpheus also used FastTrack.

Influenced by Napster's early demise, the P2P research community may have prematurely turned its back on centralized architectures. Chawathe et al. opined that Google and Yahoo demonstrate
the viability of a centralized index. They argued that the real barriers to Napster-like designs are not technical but legal and financial [61]. Even this view may be a little too harsh on the centralized architectures: it implies that they always have an upfront capital hurdle that is steeper than for distributed architectures. The closer one looks at scalable "centralized" architectures, the less the distinction with "distributed" architectures seems to matter. For example, it is clear that Google's designers consider Google a distributed, not centralized, file system [258]. Google demonstrates the scale and performance possible on commodity hardware, but still has a centralized master that is critical to the operation of each Google cluster. Time may prove that the value of emerging P2P networks, regardless of the centralized-versus-distributed classification, is that they smooth the capital outlays and remove the single points of failure across the spectra of scale and geographic distribution.

2.3. Distributed index

An important early P2P proposal for a distributed index was Freenet [5,71,259]. While its primary emphasis was the anonymity of peers, it did introduce a novel indexing scheme. Files are identified by low-level "content-hash" keys and by "secure signed-subspace" keys which ensure that only a file owner can write to a file while anyone can read from it. To find a file, the requesting peer first checks its local table for the node with keys closest to the target. When that node receives the query, it too checks for either a match or another node with keys close to the target. Eventually, the query either finds the target or exceeds time-to-live (TTL) limits. The query response traverses the successful query path in reverse, depositing a new routing table entry (the requested key and the data holder) at each peer. The insert message similarly steps towards the target node, updating routing table entries as it goes, and finally stores the file there. Whereas early versions of Gnutella used breadth-first flooding, Freenet uses a more economic depth-first search [260].

An initial assessment has been done of Freenet's robustness. It was shown that in a network of 1000 nodes, the median query path length stayed under 20 hops for a failure of 30% of nodes. While the Freenet designers considered this as evidence that the system is "surprisingly robust against quite large failures" [71], the same datapoint may well be outside meaningful operating bounds. How many applications are useful when the first quartile of queries have path lengths of several hundred hops in a network of only 1000 nodes, per Fig. 4 of [71]? To date, there has been no analysis of Freenet's dynamic robustness. For example, how does it perform when nodes are continually arriving and departing?

There have been both criticisms and extensions of the early Freenet work. Gnutella proponents acknowledged the merit in Freenet's avoidance of query broadcasting [261]. However, they are critical on two counts: the exact file name is needed to construct a query; and exactly one match is returned for each query. P2P designs using DHTs, per Section 3, share similar characteristics: a precise query yields a precise response. The similarity is not surprising since Freenet also uses a hash function to generate keys. However, the query routing used in the DHTs has firmer theoretical foundations. Another difference with DHTs is that Freenet will take time, when a new node joins the network, to build an index that facilitates efficient query routing. By the inventor's own admission, this is damaging for a user's first impressions [262]. It was proposed to download a copy of routing tables from seed nodes at startup, even though the new node might be far from the seed node. Freenet's slow startup motivated Mache et al. to amend the overlay after failed requests and to place additional index entries on successful requests; they claim almost an order of magnitude reduction in average query path length [260]. Clarke also highlighted the lack of locality or bandwidth information available for efficient query routing decisions [262]. He proposed that each node gather response times, connection times and proportion of successful requests for each entry in the query routing table. When searching for a key that is not in its own routing table, it was proposed to estimate response times from the routing metrics for the nearest known keys and consequently choose the node that can retrieve the data fastest. The response time heuristic assumed that nodes close in the key space have similar response times. This assumption stemmed from early deployment observations that Freenet peers seemed to specialize in parts of the keyspace; it has not been justified analytically. Kronfol drew attention to Freenet's inability to do keyword searches [263]. He suggested that peers cache lists of weighted keywords in order to route queries to documents, using Term Frequency Inverse Document Frequency (TFIDF) measures and inverted indexes (Section 4.2.1). With these
methods, a peer can route queries for simple keyword lists or more complicated conjunctions and disjunctions of keywords. Robustness analysis and simulation of Kronfol's proposal remains open.

The vast majority of P2P proposals in following sections rely on a distributed index.

3. Semantic-free index

Many of today's distributed network indexes are semantic. The semantic index is human-readable. For example, it might associate information with other keywords, a document, a database key or even an administrative domain. It makes it easy to associate objects with particular network providers, companies or organizations, as evidenced in the Domain Name System (DNS). However, it can also trigger legal tussles and frustrate content replication and migration [216].

Distributed Hash Tables (DHTs) have been proposed to provide semantic-free, data-centric references. DHTs enable one to find an object's persistent key in a very large, changing set of hosts. They are typically designed for [23]:

(a) Low degree. If each node keeps only a small number of transport connections to other nodes, the impact of high node arrival and departure rates is contained;
(b) Low diameter. The hops and delay introduced by the extra indirection are minimized;
(c) Greedy routing. Nodes independently calculate a short path to the target. At each hop, the query moves closer to the target; and
(d) Robustness. A path to the target can be found even when links or nodes fail.

3.1. Origins

To understand the origins of recent DHTs, one needs to look to three contributions from the 1990s. The first two, Plaxton, Rajaraman, and Richa (PRR) [30] and Consistent Hashing [49], were published within one month of each other. The third, the Scalable Distributed Data Structure (SDDS) [52], was curiously ignored in significant structured P2P designs despite having some similar goals [2,6,7]. It has been briefly referenced in other P2P papers [46,264–267].

PRR is the most recent of the three. It influenced the designs of Pastry [2], Tapestry [6] and Chord [7]. The value of PRR is that it can locate objects using fixed-length routing tables [6]. Objects and nodes are assigned a semantic-free address, for example a 160 bit key. Every node is effectively the root of a spanning tree. A message routes toward an object by matching longer address suffixes, until it encounters either the object's root node or another node with a nearby copy. It can route around link and node failure by matching nodes with a related suffix. The scheme has several disadvantages [6]: global knowledge is needed to construct the overlay; an object's root node is a single point of failure; nodes cannot be inserted and deleted; there is no mechanism for queries to avoid congestion hot spots.

Karger et al. introduced Consistent Hashing in the context of the web caching problem [49]. Web servers could conceivably use standard hashing to place objects across a network of caches. Clients could use the approach to find the objects. For normal hashing, most object references would be moved when caches are added or deleted. On the other hand, Consistent Hashing is smooth: when caches are added or deleted, the minimum number of object references move so as to maintain load balancing. Consistent Hashing also ensures that the total number of caches responsible for a particular object is limited. Whereas Litwin's Linear Hashing (LH*) scheme requires buckets to be added one at a time in sequence [50], Consistent Hashing allows them to be added in any order [49]. There is an open Consistent Hashing problem pertaining to the fraction of items moved when a node is inserted [165]. Extended Consistent Hashing was recently proposed to randomize queries over the spread of caches to significantly reduce the load variance [268]. Interestingly, Karger [49] referred to an older DHT algorithm by Devine that used a novel autonomous location discovery algorithm that learns the buckets' locations instead of using a centralized directory [51].

In turn, Devine's primary point of reference was Litwin's work³ on SDDSs and the associated LH* algorithm [52]. An SDDS satisfies three design requirements: files grow to new servers only when existing servers are well loaded; there is no centralized directory; the basic operations like insert, search and split never require atomic updates to multiple clients. Honicky and Miller suggested the first requirement could be considered a limitation

³ Both Litwin and Devine were at UC-Berkeley in 1993.
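The contrast drawn in Section 3.1 between standard hashing and Consistent Hashing can be illustrated with a minimal sketch. It assumes a single ring point per cache, SHA-1 as the hash, and hypothetical names (`ConsistentHashRing`, `moved_with_ring`, `moved_with_modulo`); none of these come from [49], and the original scheme places several points per cache to even out load.

```python
import bisect
import hashlib


def h(name: str) -> int:
    """Map a string to a point on the ring (SHA-1 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class ConsistentHashRing:
    """Each cache owns the arc of the ring that ends at its own hash point."""

    def __init__(self, caches):
        self._points = sorted((h(c), c) for c in caches)

    def add(self, cache):
        bisect.insort(self._points, (h(cache), cache))

    def lookup(self, key):
        # First cache point at or after the key's point, wrapping past the top.
        i = bisect.bisect_left(self._points, (h(key), ""))
        return self._points[i % len(self._points)][1]


def moved_with_modulo(keys, n_before, n_after):
    """Count keys whose bucket changes under plain `hash mod n` placement."""
    return sum(1 for k in keys if h(k) % n_before != h(k) % n_after)


def moved_with_ring(keys, caches, new_cache):
    """Return the keys whose owner changes when `new_cache` joins the ring."""
    ring = ConsistentHashRing(caches)
    before = {k: ring.lookup(k) for k in keys}
    ring.add(new_cache)
    return [k for k in keys if ring.lookup(k) != before[k]]


keys = [f"object-{i}" for i in range(1000)]
caches = [f"cache-{i}" for i in range(10)]
moved = moved_with_ring(keys, caches, "cache-10")
print(len(moved), "keys moved on the ring;",
      moved_with_modulo(keys, 10, 11), "moved under modulo hashing")
```

In this sketch every key that moves is handed to the newly added cache and no key moves between existing caches, whereas modulo placement reshuffles most keys when the cache count changes from 10 to 11.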
since expansion to new servers is not under administrative control [266]. Litwin recently noted numerous similarities and differences between LH* and Chord [269]. He found that both implement key search. Although LH* refers to clients and servers, nodes can operate as peers in both. Chord splits nodes when a new node is inserted, while LH* schedules splits to avoid overload. Chord requests travel O(log N) hops, while LH* client requests need at most two hops to find the target. Chord stores a small number of fingers at each node. LH* servers store N/2 to N addresses while LH* clients store 1 to N addresses. This tradeoff between hop count and the size of the index affects system robustness, and bears striking similarity to recent one- and two-hop P2P schemes in Section 2. The arrival and departure of LH* clients does not disrupt LH* server metadata at all. Given the size of the index, the arrival and departure of LH* servers is likely to cause more churn than that of Chord nodes. Unlike Chord, LH* has a single point of failure, the split coordinator. It can be replicated. Alternatively it can be removed in later LH* variants, though details have not been progressed for lack of practical need [269].

3.2. Dependability comparisons

Before launching into a critique of the various DHT geometries (Sections 3.3–3.8), we first make four overall observations about their dependability. Dependability metrics fall into two categories: static dependability, a measure of performance before recovery mechanisms take over; and dynamic dependability, for the most likely case in massive networks where there is continual failure and recovery (churn).

3.2.1. Observation A: Static dependability comparisons show that no O(log N) DHT geometry is significantly more dependable than the other O(log N) geometries

Gummadi et al. compared the tree, hypercube, butterfly, ring, XOR and hybrid geometries. In such geometries, nodes generally know about O(log N) neighbors and route to a destination in O(log N) hops, where N is the number of nodes in the overlay. Gummadi et al. asked "Why not the ring?". They concluded that only the ring and XOR geometries permit flexible choice of both neighbors and alternative routes [24]. Loguinov et al. added the de Bruijn graph to their comparison [36]. They concluded that the classical analyses, for example the probability that a particular node becomes disconnected, yield no major differences between the resilience of Chord, CAN and de Bruijn graphs. Using bisection width (the minimum edge count between two equal partitions) and path overlap (the likelihood that backup paths will encounter the same failed nodes or links as the primary path), they argued for the superior resilience of the de Bruijn graph. In short, ring, XOR and de Bruijn graphs all permit flexible choice of alternative paths, but only in de Bruijn are the alternate paths independent of each other [36].

3.2.2. Observation B: Dynamic dependability comparisons show that DHT dependability is sensitive to the underlying topology maintenance algorithms

Li et al. give the best comparison to date of several leading DHTs during churn [270]. They relate the disparate configuration parameters of Tapestry, Chord, Kademlia, Kelips and OneHop to fundamental design choices. For each of these DHTs, they plotted the optimal performance in terms of lookup latency (milliseconds) and fraction of failed lookups. The results led to several important insights about the underlying algorithms, for example: increasing routing table size is more cost-effective than increasing the rate of periodic stabilization; learning about new nodes during the lookup process sometimes eliminates the need for stabilization; parallel lookups reduce latency due to timeouts more effectively than faster stabilization. Similarly, Zhuang et al. compared keep-alive algorithms for DHT failure detection [271]. Such algorithmic comparisons can significantly improve the dependability of DHT designs.

In Fig. 2, we propose a taxonomy for the topology maintenance algorithms that influence dependability. The algorithms can be classified by how nodes join and leave, how they first detect failures, how they share information about topology updates, and how they react when they receive information about topology updates.

Fig. 2. Topology maintenance in distributed hash tables [39,247,270–276].

3.2.3. Observation C: Most DHTs use O(log N) geometries to suit ephemeral nodes. The O(1) hop DHTs suit stable nodes and deserve more research attention

Most of the DHTs in Sections 3.3–3.8 assume that nodes are ephemeral, with expected lifetimes of 1–2 h. They therefore mostly use an O(log N) geometry. The common assumption is that maintenance of full routing tables in the O(1) hop DHTs will consume excessive bandwidth when nodes are continually joining and leaving. The corollary is that, when they run on stable infrastructure servers [277], most of the DHTs in Sections 3.3–3.8 are less than optimal: lookups take many more hops than necessary, wasting latency and bandwidth budgets. The O(1) hop DHTs suit stable deployments and high lookup rates. For a churning 1024-node network, Li et al. concluded that OneHop is superior to Chord, Tapestry, Kademlia and Kelips in terms of latency and lookup success rate [270]. For a 3000-node network, they concluded that OneHop is only preferable to Chord when the deployment scenario allows a communication cost greater than 20 bytes per node per second [270]. This apparent limitation needs to be put in context. They assumed that each node issues only one lookup every 10 min and has a lifetime of only 60 min. It seems reasonable to expect that in some deployments, nodes will have a lifetime of weeks or more, a maintenance bandwidth of tens of kilobits per second, and a load of hundreds of lookups per second. O(1) hop DHTs are superior in such situations. OneHop can scale at least to many tens of thousands of nodes [247]. The recent O(1) hop designs [247,274] are vastly outnumbered by the O(log N) DHTs in Sections 3.3–3.8. Research on the algorithms of Fig. 2 will also yield improvements in the dependability of the O(1) hop DHTs.

3.2.4. Observation D: Although not yet a mature science, the study of DHT dependability is helped by recent simulation and formal development tools

While there are recent reference architectures [273,277], much of the DHT literature in Sections 3.3–3.8 does not lend itself to repeatable, comparative studies. The best comparative work to date [270] relies on the P2PSIM simulator [278]. At the time of writing, it supports more DHT geometries than any other simulator. As the study of DHTs matures, we can expect to see the simulation emphasis shift from geometric comparison to a comparison of the algorithms of Fig. 2.

P2P correctness proofs generally rely on less than complete formal specifications of system invariants and events [7,45,279]. Li and Plaxton expressed concern that when many joins and leaves happen concurrently, it is not clear whether the neighbor tables will remain in a good state [47]. While acknowledging that guaranteeing consistency in a failure prone network is impossible, Lynch et al. sketched amendments to the Chord algorithm to guarantee atomicity [280]. More recently, Gilbert et al. gave
a new algorithm for atomic read/write memory in a churning distributed network, suggesting it to be a good match for P2P [281]. Lynch and Stoica show in an enhancement to Chord that lookups are provably correct when there is a limited rate of joins and failures [282]. Fault Tolerant Active Rings is a protocol for active joins and leaves that was formally specified and proven using B-method tools [283]. A good starting point for a formal DHT development would be the numerous informal API specifications [22,284,285]. Such work could be informed by other efforts to formally specify routing invariants [286,287].

In Sections 3.3–3.8, we introduce the main geometries for simple key lookup and survey their robustness mechanisms.

3.3. Plaxton trees

Work began in March 2000 on a structured, fault-tolerant, wide-area Dynamic Object Location and Routing (DOLR) system called Tapestry [6,155]. While DHTs fix replica locations, a DOLR API enables applications to control object placement [31]. Tapestry's basic location and routing scheme follows Plaxton, Rajaraman and Richa (PRR) [30], but it remedies PRR's robustness shortcomings described in Section 3.1. Whereas each object has one root node in PRR, Tapestry uses several to avoid a single point of failure. Unlike PRR, it allows nodes to be inserted and deleted. Whereas PRR required a total ordering of nodes, Tapestry uses surrogate routing to incrementally choose root nodes. The PRR algorithm does not address congestion, but Tapestry can put object copies close to nodes generating high query loads. PRR nodes only know of the nearest replica, whereas Tapestry nodes enable selection from a set of replicas (for example to retrieve the most up to date). To detect routing faults, Tapestry uses TCP timeouts and UDP heartbeats for detection, sequential secondary neighbours for rerouting, and a second chance window so that recovery can occur without the overhead of a full node insertion. Tapestry's dependability has been measured on a testbed of about 100 machines and on simulations of about 1000 nodes. Successful routing rates and maintenance bandwidths were measured during instantaneous failures and ongoing churn [31].

Pastry, like Tapestry, uses Plaxton-like prefix routing [2]. As in Tapestry, Pastry nodes maintain O(log N) neighbours and route to a target in O(log N) hops. Pastry differs from Tapestry only in the method by which it handles network locality and replication [2]. Each Pastry node maintains a leaf set and a routing table. The leaf set contains l/2 node IDs on either side of the local node ID in the node ID space. The routing table, in row r column c, points to the node ID with the same r-digit prefix as the local node, but with an (r + 1)th digit of c. A Pastry node periodically probes leaf set and routing table nodes, with periodicity of T_ls and T_rt and a timeout T_out. Mahajan et al. analysed the reliability versus maintenance cost tradeoffs in terms of the parameters l, T_ls, T_rt, and T_out [288]. They concluded that earlier concerns about excessive maintenance cost in a churning P2P network were unfounded, but suggested followup work for a wider range of reliability targets, maintenance costs and probe periods. Rhea, Geels et al. concluded that existing DHTs fail at high churn rates [289]. Building on a Pastry implementation from Rice University, they found that most lookups fail to complete when there is excessive churn. They conjectured that short-lived nodes often leave the network with lookups that have not yet timed out, but no evidence was provided to confirm the theory. They identified three design issues that affect DHT performance under churn: reactive versus periodic recovery of peers; lookup timeouts; and choice of nearby neighbours. Since reactive recovery was found to add traffic to already congested links, the authors used periodic recovery in their design. For lookup timeouts, they advocated an exponentially weighted moving average of each neighbour's response time, over alternative fixed timeout or virtual coordinate schemes. For selection of nearby neighbours, they found that global sampling was more effective than simply sampling a neighbour's neighbours or inverse neighbours. Castro et al. have refuted the suggestion that DHTs cannot cope with high churn rates [290]. By implementing methods for continuous detection and repair, their MSPastry implementation achieved shorter routing paths and a maintenance overhead of less than half a message per second per node.

There have been more recent proposals based on these early Plaxton-like schemes. Kademlia uses a bit-wise exclusive or (XOR) metric⁴ for the distance between 160 bit node identifiers [45]. Each node keeps a list of contact nodes for each section

⁴ To be more precise, Maymounkov and Mazières make comparison with Pastry's first routing phase, saying that Pastry's second phase uses numeric difference.
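The Pastry leaf set and routing-table rule described in Section 3.3 can be sketched as follows. This is a minimal illustration under assumptions of ours, not Pastry's implementation: node IDs are equal-length hexadecimal strings, and `routing_slot` and `leaf_set` are hypothetical helper names.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def routing_slot(local_id: str, key: str, base: int = 16):
    """Return (row r, column c) of the routing-table entry consulted for `key`.

    Row r covers IDs sharing an r-digit prefix with the local node; column c is
    the key's (r + 1)th digit. Returns None when the key equals the local ID.
    """
    r = shared_prefix_len(local_id, key)
    if r == len(local_id):
        return None
    return r, int(key[r], base)


def leaf_set(local_id: str, node_ids, l: int = 4, base: int = 16):
    """Return (smaller, larger): the l/2 closest IDs on each side of the local ID.

    Assumes l >= 2 and more than l other nodes, so the two halves do not overlap.
    """
    ring = sorted(set(node_ids) | {local_id}, key=lambda s: int(s, base))
    i = ring.index(local_id)
    others = ring[i + 1:] + ring[:i]      # clockwise order, starting after local
    return others[-(l // 2):], others[:l // 2]


print(routing_slot("65a1fc", "65a2bb"))   # shares "65a", next digit of key is 2
print(leaf_set("8000", ["1000", "7000", "7fff", "8001", "9000", "f000"]))
```

A lookup first checks whether the key falls within the leaf set; otherwise it forwards to the routing-table entry in the slot computed above, which shares at least one more digit with the key than the local node does.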
of the node space that is between 2^i and 2^(i+1) from itself (0 ≤ i < 160). Longer-lived nodes are deliberately given preference on this list: it has been found in Gnutella that the longer a node has been active, the more likely it is to remain active. Like Kademlia, Willow uses the XOR metric [32]. It implements a Tree Maintenance Protocol to zipper together broken segments of a tree. Where other schemes use DHT routing to inefficiently add new peers, Willow can merge disjoint or broken trees in O(log N) parallel operations.

3.4. Rings

Chord is the prototypical DHT ring, so we first sketch its operation. Chord maps nodes and keys to an identifier ring [7,34]. Chord supports one main operation: find a node with the given key. It uses Consistent Hashing (Section 3.1) to minimize disruption of keys when nodes join and leave the network. However, Chord peers need only track O(log N) other peers, not all peers as in the original consistent hashing proposal [49]. It enables concurrent node insertions and deletions, improving on PRR. Compared to Pastry, it has a simpler join protocol. Each Chord peer tracks its predecessor, a list of successors and a finger table. Using the finger table, each hop is at least half the remaining distance around the ring to the target node, giving an average⁵ lookup hop count of (1/2) log_2 N. Each Chord node runs a periodic stabilization routine that updates predecessor and successor pointers to cater for newly added nodes. All successors of a given node need to fail for the ring to fail. Although a node departure could be treated the same as a failure, a departing Chord node first notifies the predecessor and successors, so as to improve performance.

In their definitive paper, Chord's inventors critiqued its dependability under churn [34]. They provided proofs on the behaviour of the Chord network when nodes in a stable network fail, stressing that such proofs are inadequate in the general case of a perpetually churning network. An earlier paper had posed the question, "For lookups to be successful during churn, how regularly do the Chord stabilization routines need to run?" [291]. Stoica et al. modeled a range of node join/departure rates and stabilization periods for a Chord network of 1000 nodes. They measured the number of timeouts (caused by a finger pointing to a departed node) and lookup failures (caused by nodes that temporarily point to the wrong successor during churn). They also modelled the lookup stretch, the ratio of the Chord lookup time to optimal lookup time on the underlying network. They demonstrated the latency advantage of recursive lookups over iterative lookups, but there remains room for delay reduction. For further work, the authors proposed to improve resilience to network partitions, using a small set of known nodes or remembered random nodes. To reduce the number of messages per lookup, they suggested an increase in the size of each step around the ring, accomplished via a larger number of fingers at each node. Much of the paper assumed independent, equally likely node failures. Analysis of correlated node failures, caused by massive site or backbone failures, will be more important in some deployments. The paper did not attempt to recommend a fixed optimal stabilization rate. Liben-Nowell et al. had suggested that optimum stabilization rate might evolve according to measurements of peers' behaviour [291]; such a mechanism has yet to be devised.

Alima et al. considered the communication costs of Chord's stabilization routines, referred to as active correction, to be excessive [292]. Two other robustness issues also motivated their Distributed K-ary Search design, which is similar to Chord. Firstly, the total system should evolve for an optimum balance between the number of peers, the lookup hopcount and the size of the routing table. Secondly, lookups should be reliable: P2P algorithms should be able to guarantee a successful lookup for key/value pairs that have been inserted into the system. A similar lookup correctness issue was raised elsewhere by one of Chord's authors, "Is it possible to augment the data structure⁶ to work even when nodes (and their associated finger lists) just disappear?" [293]. Alima et al. asserted that P2Ps using active correction, like Chord, Pastry and Tapestry, are unable to give such a guarantee. They propose an alternate correction-on-use scheme, whereby expired routing entries are corrected by information piggybacking lookups and

⁵ For r successors, the average hop count is more accurately expressed as (1/2) log_2 N - (1/2) log_2(r) + 1.

⁶ The question was posed in the context of a nearest neighbour search algorithm, a proposed Chord extension.
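The finger-table lookup outlined in Section 3.4 can be sketched on a small identifier ring. The sketch assumes a static, failure-free ring with global knowledge used only to build the tables; the 6-bit identifier space, the node set and the function names are illustrative choices of ours, and no stabilization, successor lists or churn handling is modelled.

```python
import bisect

M = 6  # identifier bits, so the ring has 2**M = 64 positions (illustrative)


def successor(nodes, ident):
    """First node at or clockwise after `ident`; `nodes` must be sorted."""
    i = bisect.bisect_left(nodes, ident % (1 << M))
    return nodes[i % len(nodes)]


def finger_table(nodes, n):
    """Finger i of node n is the successor of n + 2**i on the ring."""
    return [successor(nodes, n + (1 << i)) for i in range(M)]


def between(x, a, b):
    """True if x lies strictly inside the clockwise arc from a to b."""
    if a < b:
        return a < x < b
    return x > a or x < b


def lookup(nodes, start, key):
    """Greedy finger routing; returns (node responsible for key, hops forwarded)."""
    n, hops = start, 0
    while True:
        succ = successor(nodes, n + 1)
        if key == succ or between(key, n, succ):
            return succ, hops
        # Forward to the farthest finger that still precedes the key.
        n = next(f for f in reversed(finger_table(nodes, n)) if between(f, n, key))
        hops += 1


nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
print(lookup(nodes, 8, 54))   # routes 8 -> 42 -> 51, which knows that 56 owns key 54
```

Because each forwarding step at least halves the remaining clockwise distance to the key, the number of forwarding hops in this sketch never exceeds M.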
insertions. A prerequisite is that lookup and insertion rates are significantly higher than node arrival, departure and failure rates. Correct lookups are guaranteed in the presence of simultaneous node arrivals or up to f concurrent node departures, where f is configurable.

3.5. Tori

Ratnasamy et al. developed the Content-Addressable Network (CAN), another early DHT widely referenced alongside Tapestry, Pastry and Chord [8,294]. It is arranged as a virtual d-dimensional Cartesian coordinate space on a d-torus. Each node is responsible for a zone in this coordinate space. The designers used a heuristic thought to be important for large, churning P2P networks: keep the number of neighbours independent of system size. Consequently, its design differs significantly from Pastry, Tapestry and Chord. Whereas they have O(log N) neighbours per node and O(log N) hops per lookup, CAN has O(d) neighbours and O(d n^(1/d)) hop-count. When CAN's system-wide parameter d is set to log(N), CAN converges to their profile. If the number of nodes grows, a major rearrangement of the CAN network may be required [151]. The CAN designers considered building on PRR, but opted for the simple, low-state-per-node CAN algorithm instead. They had reasoned that a PRR-based design would not perform well under churn, given node departures and arrivals would affect a logarithmic number of nodes [8].

There have been preliminary assessments of CAN's resilience. When a node leaves the CAN in an orderly fashion, it passes its own Virtual ID (VID), its neighbours' VIDs and IP addresses, and its key/value pairs to a takeover node. If a node leaves abruptly, its neighbours send recovery messages towards the designated takeover node. CAN ensures the recovery messages reach the takeover node, even if nodes die simultaneously, by maintaining a VID chain with Chord's stabilization algorithm. Some initial proof of concept resilience simulations were run using the Network Simulator (ns) [295] for up to a few hundred nodes. Average hopcounts and lookup failure probabilities were plotted against the total number of nodes, for various node failure rates [8]. The CAN team documented several open research questions pertaining to state/hopcount tradeoffs, resilience, load, locality and heterogeneous peers [44,294].

3.6. Butterflies

Viceroy approximates a butterfly network [46]. It generally has constant degree⁷ like CAN. Like Chord, Tapestry and Pastry, it has logarithmic diameter. It improves on these systems, inasmuch as its diameter is better than CAN and its degree is better than Chord, Tapestry and Pastry. As with most DHTs, it utilizes Consistent Hashing. When a peer joins the Viceroy network, it takes a random but permanent identity and selects its level within the network. Each peer maintains general ring pointers (predecessor and successor), level ring pointers (nextonlevel and prevonlevel) and butterfly pointers (left, right and up). When a peer departs, it normally passes its key pairs to a successor, and notifies other peers to find a replacement peer.

The Viceroy paper scoped out the issue of robustness. It explicitly assumed that peers do not fail [46]. It assumed that join and leave operations do not overlap, so as to avoid the complication of concurrency mechanisms like locking. Kaashoek and Karger were somewhat critical of Viceroy's complexity [37]. They also pointed to its fault tolerance blindspot. Li and Plaxton suggested that such constant-degree algorithms deserve further consideration [47]. They offered several pros and cons. The limited degree may increase the risk of a network partition, or inhibit use of local neighbours (for the simple reason that there are fewer of them). On the other hand, it may be easier to reason about the correctness of fixed-degree networks. One of the Viceroy authors has since proposed constant-degree peers in a two-tier, locality-aware DHT [296]; the lower degree maintained by each lower-tier peer purportedly improves network adaptability. Another Viceroy author has since explored an alternative bounded-degree graph for P2P, namely the de Bruijn graph [297].

3.7. de Bruijn graphs

de Bruijn graphs have had numerous refinements since their inception [298,299]. Schlumberger was the first to use them for networking [300]. Two research teams independently devised the generalized de Bruijn graph that accommodates a flexible

⁷ Viceroy's expected degree is a constant. However, its high probability bound is O(log n). For a very small number of nodes, degree is Ω(log n).
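The O(d) neighbour and O(d n^(1/d)) hop-count figures quoted for CAN in Section 3.5 can be checked numerically on an idealized torus. The sketch below assumes n equal-sized zones laid out as a regular grid with the same number of zones along every axis, which a live CAN does not guarantee; `torus_hops` and `mean_hops` are hypothetical helper names.

```python
from itertools import product


def torus_hops(a, b, side):
    """Greedy hop count between zones a and b on a d-torus with `side` zones per axis."""
    return sum(min(abs(x - y), side - abs(x - y)) for x, y in zip(a, b))


def mean_hops(d, side):
    """Average hop count from zone (0, ..., 0) to every zone; the torus is symmetric."""
    origin = (0,) * d
    zones = list(product(range(side), repeat=d))
    return sum(torus_hops(origin, z, side) for z in zones) / len(zones)


for d, side in [(2, 8), (3, 4)]:          # 64 zones in both cases
    print(f"d={d}: {2 * d} neighbours per zone, mean hops {mean_hops(d, side):.2f}")
```

On this grid the mean works out to (d/4) n^(1/d) hops with 2d neighbours per zone, so raising d trades a larger neighbour set for shorter paths. Setting d = log_2 N gives N^(1/d) = 2 and hence a hop count proportional to log_2 N, which is the sense in which CAN converges to the profile of the O(log N) designs.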
number of nodes in the system [301,302]. Rowley and Bose studied fault-tolerant rings overlaid on the de Bruijn graph [303]. Lee et al. devised a two-level de Bruijn hierarchy, whereby clusters of local nodes are interconnected by a second-tier ring [304].

Many of the algorithms discussed previously are greedy in that each time a query is forwarded, it moves closer to the destination. Unfortunately, greedy algorithms are generally suboptimal: for a given degree, the routing distance is longer than necessary [305]. Unlike these earlier P2P designs, de Bruijn graphs of degree k achieve an asymptotically optimal diameter log_k n, where n is the number of nodes in the system and k can be varied to improve resilience. If there are O(log(n)) neighbours per node, the de Bruijn hop count is O(log n/log log n). To illustrate de Bruijn's practical advantage, consider a network with one million nodes of degree 20: Chord has a diameter of 20, while de Bruijn has a diameter of 5 [36]. In 2003, there was a quick succession of de Bruijn proposals: D2B [306], Koorde [37], Distance Halving [132,297] and the Optimal Diameter Routing Infrastructure (ODRI) [36].

Fraigniaud and Gauron began the D2B design by laying out an informal problem statement: keys should be evenly distributed; lookup latency should be small; traffic load should be evenly distributed; updates of routing tables and redistribution of keys should be fast when nodes join or leave the network. They defined a node's congestion to be the probability that a lookup will traverse it. Apart from its optimal de Bruijn diameter, they highlighted D2B's merits: a constant expected update time when nodes join or leave (O(log n) w.h.p.⁸); the expected node congestion is O((log n)/n) (O((log^2 n)/n) w.h.p.) [306]. D2B's resilience was discussed only in passing.

Koorde extends Chord to attain the optimal de Bruijn degree/diameter tradeoff above [37]. Unlike D2B, Koorde does not constrain the selection of node identifiers. Also unlike D2B, it caters for concurrent joins, by extension of Chord's functionality. Kaashoek and Karger investigated Koorde's resilience to a rather harsh failure scenario: in order for a network to stay connected when all nodes fail with probability of 1/2, some nodes must have degree Ω(log n) [37]. They sketched a mechanism to increase Koorde's degree for this more stringent fault tolerance, losing de Bruijn's constant degree advantage. Similarly, to achieve a constant-factor load balance, Koorde would have to sacrifice its degree optimality. They suggested that the ability to trade the degree, and hence the maintenance overhead, against the expected hop count may be important for churning systems. They also identified an open problem: find a load-balanced, degree optimal DHT. Datta et al. showed that for arbitrary key distributions, de Bruijn graphs fail to meet the dual goals of load balancing and search efficiency [307]. They posed the question, "(Is there) a constant routing table sized DHT which meets the conflicting goals of storage load balancing and search efficiency for an arbitrary and changing key distribution?"

Distance Halving was also inspired by de Bruijn [297] and shares its optimal diameter. Naor and Wieder argued for a two-step continuous-discrete approach for its design. The correctness of its algorithms is proven in a continuous setting. The algorithms are then mapped to a discrete space. The source x and target y are points on the continuous interval [0, 1). Data items are hashed to this same interval. r is a string which determines how messages leave any point on the ring: if bit t of the string is 0, the left leg is taken; if it is 1, the right leg is taken. r increases by one bit each hop, giving a sequence by which to step around the ring. A lookup has two phases. In the first, the lookup message containing the source, target and the random string hops toward the midpoint of the source and target. On each hop, the distance between r_t(x) and r_t(y) is halved, by virtue of the specific left and right functions. In the second phase, the message steps backward from the midpoint to the target, removing the last bit in r_t at each hop. Join and leave algorithms were outlined but there was no consideration of recovery times or message load on churn. Using the Distance Halving properties, the authors devised a caching scheme to relieve congestion in a large P2P network. They have also modified the algorithm to be more robust in the presence of random faults [132].

Solid comparisons of DHT resilience are scarce, but Loguinov et al. give just that in their ODRI paper [36]. They compare Chord, CAN and de Bruijn in terms of routing performance, graph expansion and clustering. At the outset, they give the optimal diameter (the maximum hopcount between any two nodes in the graph) and average hopcount for graphs of fixed degree. de Bruijn graphs converge to both optima, and outperform Chord and CAN on both counts. These optima impact both

⁸ W.h.p.: with high probability, 1 - n^(-ε).
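The diameter comparison in Section 3.7 (one million nodes of degree 20) and the shift-based routing that gives de Bruijn graphs their short paths can be sketched as follows. The sketch assumes a complete de Bruijn graph whose nodes are D-digit base-k tuples, so that n = k^D exactly; the generalized graphs cited above relax this, and the function names are ours.

```python
import math


def neighbours(node, k):
    """Out-neighbours of a node: drop the first digit, append any digit in 0..k-1."""
    return [node[1:] + (d,) for d in range(k)]


def route(src, dst):
    """Shift-register routing between two D-digit nodes.

    Reuses the longest suffix of src that is also a prefix of dst, then shifts in
    the remaining digits of dst one hop at a time, so a path never exceeds D hops.
    """
    D = len(src)
    overlap = next(j for j in range(D, -1, -1) if src[D - j:] == dst[:j])
    path = [src]
    for digit in dst[overlap:]:
        path.append(path[-1][1:] + (digit,))
    return path


def diameters(n, k):
    """(de Bruijn diameter for degree k, Chord diameter) for an n-node network."""
    return math.ceil(math.log(n, k)), math.ceil(math.log2(n))


print(route((0, 1, 1), (1, 1, 0)))     # one hop: the suffix (1, 1) is reused
print(route((0, 0, 0), (1, 1, 1)))     # three hops, the diameter for D = 3
print(diameters(10**6, 20))            # de Bruijn versus Chord, as quoted from [36]
```

The last line reproduces the figures in the text: ceil(log_20 10^6) = 5 for de Bruijn against ceil(log_2 10^6) = 20 for Chord.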
delay and aggregate lookup load. They present two clustering measures (edge expansion and node expansion) which are interesting for resilience. Unfortunately, after decades of de Bruijn research, they have no exact solution. de Bruijn was shown to be superior in terms of path overlap: de Bruijn automatically selects backup paths that do not overlap with the best shortest path or with each other [36].

3.8. Skip graphs

Skip graphs have been pursued by two research camps [38,41]. They augment the earlier Skip Lists [308,309]. Unlike earlier balanced trees, the Skip List is probabilistic: its insert and delete operations do not require tree rearrangements and so are faster by a constant factor. The Skip List consists of layers of ordered linked lists. All nodes participate in the bottom layer 0 list. Some of these nodes participate in the layer 1 list with some fixed probability. A subset of layer 1 nodes participate in the layer 2 list, and so on. A lookup can proceed quickly through the list by traversing the sparse upper layers until it is close to, or at, the target. Unfortunately, nodes in the upper layers of a Skip List are potential hot spots and single points of failure. Unlike Skip Lists, Skip Graphs provide multiple lists at each level for redundancy, and every node participates in one of the lists at each level.

Each node in a Skip Graph has Θ(log n) neighbours on average, like some of the preceding DHTs. The Skip Graph's primary edge over the DHTs is its support for prefix and proximity search. DHTs hash objects to a random point in the graph. Consequently, they give no guarantees over where the data is stored. Nor do they guarantee that the path to the data will stay within the one administration as far as possible [38]. Skip graphs, on the other hand, provide for location-sensitive name searches. For example, to find the document docname on the node user.company.com, the Skip Graph might step through its ordered lists for the prefix com.company.user [38]. Alternatively, to find an object with a numeric identifier, an algorithm might search the lowest layer of the Skip Graph for the first digit, the next layer for the next digit, in the same vein until all digits are resolved. Being ordered, Skip Graphs also facilitate range searches. In each of these examples, the Skip Graph can be arranged such that the path to the target, as far as possible, stays within an administrative boundary. If one administration is detached from the rest of the Skip Graph, routing can continue within each of the partitions. Mechanisms have been devised to merge disconnected segments [157], though at this stage, segments are remerged one at a time. A parallel merge algorithm has been flagged for future work.

The advantages of Skip Graphs come at a cost. To be able to provide range queries and data placement flexibility, Skip Graph nodes require many more pointers than their DHT counterparts. An increased number of pointers implies increased maintenance traffic. Another shortcoming of at least one of the early proposals was that no algorithm was given to assign keys to machines. Consequently, there are no guarantees on system-wide load balancing or on the distance between adjacent keys [100]. Aspnes et al. have recently devised a scheme to reduce the inter-machine pointer count from O(m log m), where m is the number of data elements, to O(n log n), where n is the number of nodes [100]. They proposed a two-layer scheme: one layer for the Skip Graph itself and the second bucket layer. Each machine is responsible for a number of buckets and each bucket elects a representative key. Nodes locally adjust their load. They accept additional keys if they are below their threshold or disperse keys to nearby nodes if they are above threshold. There appear to be numerous open issues: simulations have been done but analysis is outstanding; mechanisms are required to handle the arrival and departure of nodes; there were only brief hints as to how to handle nodes with different capacities.

4. Semantic index

Semantic indexes capture object relationships. While the semantic-free methods (DHTs) have firmer theoretic foundations and guarantee that a key can be found if it exists, they do not on their own capture the relationships between the document name and its content or metadata. Semantic P2P designs do. However, since their design is often driven by heuristics, they may not guarantee that scarce items will be found.

So what might the semantically indexed P2Ps add to an already crowded field of distributed information architectures? At one extreme there are the distributed relational database management systems (RDBMSs), with their strong consistency guarantees [264]. They provide strong data independence, the flexibility of SQL queries and strong transactional
semantics: Atomicity, Consistency, Isolation and Durability (ACID) [310]. They guarantee that the query response is complete: all matching results are returned. The price is performance. They scale to perhaps 1000 nodes, as evidenced in Mariposa [311,312], or require query caching front ends to constrain the load [264]. Database research has arguably been cornered into traditional, high-end, transactional applications [72]. Then there are distributed file systems, like the Network File System (NFS) or the Serverless Network File Systems (xFS), with little data independence, low-level file retrieval interfaces and varied consistency [264]. Today's eclectic mix of Content Distribution Networks (CDNs) generally deload primary servers by redirecting web requests to a nearby replica. Some intercept the HTTP requests at the DNS level and then use consistent hashing to find a replica [23]. Since this same consistent hashing was a forerunner to the DHT approaches above, CDNs are generally constrained to the same simple key lookups.

The opportunity for semantically indexed P2Ps, then, is to provide:

(a) graduated data independence, consistency and query flexibility, and
(b) probabilistically complete query responses, across
(c) very large numbers of low-cost, geographically distributed, dynamic nodes.

4.1. Keyword lookup

P2P keyword lookup is best understood by considering the structure of the underlying index and the algorithms by which queries are routed over that index. Fig. 3 summarizes the following paragraphs by classifying the keyword query algorithms, index structures and metrics. The research has largely focused on scalability, not dependability. There have been very few studies that quantify the impact of network churn. One exception is the work by Chawathe et al. on the Gia system [61]. Gia's combination of algorithms from Fig. 3 (receiver-based flow control, biased random walk and one-hop replication) gave 2–4 orders of magnitude improvement in query success rates in churning networks.

Perhaps the most widely referenced P2P system for simple keyword match is Gnutella [4]. Gnutella queries contain a string of keywords. Gnutella peers answer when they have files⁹ whose names contain all the keywords. As discussed in Section 2.1, early versions of Gnutella did not forward the document index. Queries were flooded and peers searched their own local indexes for filename matches. An early review highlighted numerous areas for improvement [65]. It was estimated that the query traffic alone from 50,000 early-generation Gnutella nodes would amount to 1.7% of the total US Internet backbone traffic at December 2000 levels. It was speculated that high degree Gnutella nodes would impede dependability. An unnecessarily high percentage of Gnutella traffic crossed Autonomous System (AS) boundaries; a locality mechanism may have found suitable nearby peers.

Fortunately, there have since been numerous enhancements within the Gnutella Developer Forum. At the time of writing, it has been reported that Gnutella has almost 350,000 unique hosts, of which nearly 90,000 accept incoming connections [317]. One of the main improvements is that an index of filename keywords, called the Query Routing Table (QRT), can now be forwarded from leaf peers to their ultrapeers [240]. Ultrapeers can then ensure that the leaves only receive queries for which they have a match, dramatically reducing the query traffic at the leaves. Ultrapeers can have connections to many leaf nodes (10–100) and a small number of other ultrapeers (<10) [240]. Originally, a leaf node's QRT was not forwarded by the parent ultrapeer to other ultrapeers. More recently, there has been a proposal to distribute aggregated QRTs amongst ultrapeers [318]. To further limit traffic, QRTs are compressed by hashing, according to the Query Routing Protocol (QRP) specification [261]. This same specification claims QRP may reduce Gnutella traffic by orders of magnitude, but cautions that simulation is required before mass deployment. A known shortcoming of QRP was that the extent of query propagation was independent of the popularity of the search terms. The Dynamic Query Protocol addressed this [319]. It required leaf nodes to send single queries to high-degree ultrapeers, which adjust the queries' time-to-live (TTL) bounds according to the number of received query results. An earlier proposal, called the Gnutella UDP Extension for Scalable Searches (GUESS) [314], similarly aimed to reduce the

⁹ The Gnutella 0.6 specification only provides semantics for finding plain files, but hints that Gnutella could store other resources, like cryptographic keys or meta-information.

Fig. 3. Keyword lookup in P2P systems [4,21,61,65–67,210,240,245,313–316].
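The Query Routing Table idea described above (leaves hash their filename keywords into a table, and an ultrapeer forwards a query only to leaves whose table covers every query keyword) can be sketched roughly as follows. This is an illustration only, not the QRP wire format: the table size, the use of SHA-1 as the hash, the tokenization and all function names are assumptions made for the example.

```python
import hashlib

TABLE_BITS = 1 << 16  # assumed table size; not taken from the QRP specification


def slot(keyword: str) -> int:
    """Map a keyword to a table slot (SHA-1 is a stand-in hash, an assumption)."""
    digest = hashlib.sha1(keyword.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % TABLE_BITS


def build_qrt(filenames) -> set:
    """Leaf side: hash every filename keyword into a set of occupied slots."""
    table = set()
    for name in filenames:
        for word in name.replace(".", " ").replace("_", " ").split():
            table.add(slot(word))
    return table


def leaves_to_query(query: str, leaf_tables: dict) -> list:
    """Ultrapeer side: forward only to leaves whose table covers all keywords."""
    slots = [slot(word) for word in query.split()]
    return sorted(leaf for leaf, table in leaf_tables.items()
                  if all(s in table for s in slots))


# Hypothetical leaves and shared files, for illustration only.
leaf_tables = {
    "leaf_a": build_qrt(["free_jazz_live.mp3", "network_survey.pdf"]),
    "leaf_b": build_qrt(["holiday_photos.zip"]),
}
print(leaves_to_query("jazz live", leaf_tables))
```

Because keywords are hashed into a fixed-size table, a leaf can occasionally be selected for a query it cannot answer (a false positive), but a leaf that holds a matching filename is never skipped.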
number of queries for widely distributed files. GUESS reuses the non-forwarding idea (Section 2). A GUESS peer repeatedly queries single ultrapeers with a TTL of 1, with a small timeout on each query to limit load. It chooses the number of iterations and selects ultrapeers so as to satisfy its search needs. For adaptability, a small number of experimental Gnutella nodes have implemented eXtensible Markup Language (XML) schemas for richer queries [320,321]. None of the above Gnutella proposals explicitly assess robustness.

The broader research community has recently been leveraging aspects of the Gnutella design. Lv et al. exposed one assumption implicit in some of the early DHT work: that designs such as Gnutella are inherently not scalable, and therefore should be abandoned [66]. They argued that by making better use of the more powerful peers, Gnutella's scalability issues could be alleviated. Instead of its flooding mechanism, they used random walks. Their preliminary design to bias random walks towards high capacity nodes did not go as far as the ultrapeer proposals in that the indexes did not move to the high capacity nodes. Chawathe et al. chose to extend the Gnutella design with their Gia system, in response to the perceived shortcomings of DHTs in Section 1.2 [61]. Compared to the early Gnutella designs, they incorporated several novel features. They devise a topology adaptation algorithm so that most peers are attached to high-degree peers. They use a random walk search algorithm, in lieu of flooding, and bias the query load towards higher-degree peers. For one-hop replication, they require all nodes keep pointers to content on adjacent peers. To implement a receiver-controlled token-based flow control, a peer must have a token from its neighbouring peer before it sends a query to it. Chawathe et al. show by simulations that the combination of these features provides a scalability improvement of three to five orders of magnitude
over Gnutella while retaining significant robustness. The main robustness metrics they used were the collapse point query rate (the per node query rate at which the successful query rate falls below 90%) and the average hop-count immediately prior to collapse. Their comparison with Gnutella did not take into account the Gnutella enhancements above; this was left as future work. Castro, Costa and Rowstron argued that if Gnutella were built on top of a structured overlay, then both the query and overlay maintenance traffic could be reduced [239]. Yang et al. explore various policies for peer selection in the GUESS protocol, since the issue is left open in the original proposal [245]. For example, the peer initiating the query could choose peers that have been most recently used or that have the most files shared. Various policy pitfalls are identified. For example, good peers could be overloaded, victims of their own success. Alternatively, malicious peers could encourage the querying peer to try inactive peers. They conclude that a "most results" policy gives the best balance of robustness and efficiency. Like Castro, Costa and Rowstron, they concentrated on the static network scenario. Cholvi et al. very briefly describe how similar "least recently used" and "most often used" heuristics can be used by a peer to select peer acquaintances [313]. They were motivated by the congestion associated with Gnutella's TTL-limited flooding. Recognizing that the busiest peers can quickly become overloaded central hubs for the entire network, they limit the number of acquaintances for any given peer to 25. They sketch a mechanism to decrement a query's TTL multiple times when it traverses interested peers. In summary, these Gnutella-related investigations are characterized by a bias for high degree peers and very short directed query paths, a disdain for flooding, and concern about excessive load on the better peers. Generally, the robustness analysis for dynamic networks (content updates and node arrivals/departures) remains open.

One aspect of P2P keyword search systems has received particular attention: should the index be partitioned by document or by keyword? The issue affects scalability. To be partitioned by document, each node has a local index of documents for which it is responsible. Gnutella is a prime example. Queries are generally flooded in systems partitioned by document. On the other hand, a peer may assume responsibility for a set of keywords. The peer uses an inverted list to find a matching document, either locally or at another peer. If the query contains several keywords, inverted lists may need to be retrieved from several different peers to find the intersection [21]. The initial assessment by Li et al. was that the partition-by-document approach was superior [210]. For one scenario of a full-text web search, they estimated the communications costs to be about six times higher than the feasible budget. However, wanting to exploit prior work on inverted list intersection, they studied the partition-by-keyword strategy. They proposed several optimizations which put the communication costs for a partition-by-keyword system within an order of magnitude of feasibility. There had been a couple of prior papers that suggested partitioned-by-keyword designs incorporate DHTs to map keywords to peers [316,322]. In Gnawali's Keyword-set Search System (KSS), the index is partitioned by sets of keywords [316]. Terpstra et al. point out that by keeping keyword pairs or triples, the number of lists per document in KSS is squared or tripled [323]. Shi et al. interpreted the approximations of Li et al. to mean that neither approach is feasible on its own [21]. Their Multi-Level Partitioning (MLP) scheme incorporates both partitioning approaches. They arrange nodes into a group hierarchy, with all nodes in the single level 0 group, and with the same nodes sub-divided into k logical subgroups on level 1. The subgroups are again divided, level by level, until level l. The inverted index is partitioned by document between groups and by keyword within groups. MLP avoids the query flooding normally associated with systems partitioned by document, since a small number of nodes in each group process the query. It reduces the bandwidth overheads associated with inverted list intersection in systems partitioned solely by keyword, since groups can calculate the intersection independently over the documents for which they are responsible. MLP was overlaid on SkipNet, per Section 3.8 [38]. Some initial analyses of communications costs and query latencies were provided.

Much of the research above addresses partial keyword search. Daswani et al. highlighted the open problem of efficient, comprehensive keyword search [25]. How can exhaustive searches be achieved without flooding queries to every peer in the network? Terpstra et al. couched the keyword search problem in rendezvous terms: dynamic keyword queries need to meet with static document lists [323]. Their Bitzipper scheme is partitioned by document. They improved on full flooding by putting document
metadata on 2√n nodes and forwarding queries through only 6√n nodes. They reported that Bitzipper nodes need only 1/166th of the bandwidth of full-flooding Gnutella nodes for an exhaustive search. An initial comparison of query load was given. There was little consideration of either static or dynamic resilience, that is, of nodes failing, of documents continually changing, or of nodes continually joining and leaving the network.

4.2. Peer information retrieval

The field of Information Retrieval (IR) has matured considerably since its inception in the 1950s [324]. A taxonomy for IR models has been formalized [242]. It consists of four elements: a representation of documents in a collection; a representation of user queries; a framework describing relationships between document representations and queries; and a ranking function that quantifies an ordering amongst documents for a particular query. Three main issues motivate current IR research: information relevance, query response time, and user interaction with IR systems. The dominant IR trends for searching large text collections are also threefold [242]. The size of collections is increasing dramatically. More complicated search mechanisms are being found to exploit document structure, to accommodate heterogeneous document collections, and to deal with document errors. Compression is in favour: it may be quicker to search compact text or retrieve it from external devices. In a distributed IR system, query processing has four parts. Firstly, particular collections are targeted for the search. Secondly, queries are sent to the targeted collections. Queries are then evaluated at the individual collections. Finally results from the collections are collated.

So how do P2P networks differ from distributed IR systems? Bawa et al. presented four differences [62]. They suggested that a P2P network is typically larger, with tens or hundreds of thousands of nodes. It is usually more dynamic, with node lifetimes measured in hours. They suggested that a P2P network is usually homogeneous, with a common resource description language. It lacks the centralized mediators found in many IR systems, that assume responsibility for selecting collections, for rewriting queries, and for merging ranked results. These distinctions are generally aligned with the peer characteristics in Section 1. One might add that P2P nodes display more symmetry: peers are often both information consumers and producers. Daswani et al. pointed out that, while there are IR techniques for ranked keyword search at moderate scale, research is required so that ranking mechanisms are efficient at the larger scale targeted by P2P designs [25]. Joseph and Hoshiai surveyed several P2P systems using metadata techniques from the IR toolkit [60]. They described an assortment of IR techniques and P2P systems, including various metadata formats, retrieval models, bloom filters, DHTs and trust issues.

In the ensuing paragraphs, we survey P2P work that has incorporated information retrieval models, particularly the Vector Model and the Latent Semantic Indexing Model. We omit the P2P work based on Bayesian models. Some have pointed to such work [60], but it made no explicit mention of the model [325]. One early paper on P2P content-based image retrieval also leveraged the Bayesian model [326]. For the former two models, we briefly describe the design, then try to highlight robustness aspects. On robustness, we are again stymied for lack of prior work. Indeed, a search across all proceedings of the Annual ACM Conference on Research and Development in Information Retrieval for the words "reliable", "available", "dependable" or "adaptable" did not return any results at the time of writing. In contrast, a standard text on distributed database management systems [327] contains a whole chapter on reliability. IR research concentrates on performance measures. Common performance measures include recall, the fraction of the relevant documents which has been retrieved, and precision, the fraction of the retrieved documents which is relevant [242]. Ideally, an IR system would have high recall and high precision. Unfortunately techniques favouring one often disadvantage the other [324].

4.2.1. Vector model

The vector model [328] represents both documents and queries as term vectors, where a term could be a word or a phrase. If a document or query has a term, the weight of the corresponding dimension of the vector is non-zero. The similarity of the document and query vectors gives an indication of how well a document matches a particular query.

The weighting calculation is critical across the retrieval models. Amongst the numerous proposals for the probabilistic and vector models, there are some commonly recurring weighting factors [324].
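As a minimal sketch of the vector model just described, the following represents documents and queries as sparse term vectors and ranks documents by their similarity to the query. Cosine similarity, raw term counts as weights, whitespace tokenization and the function names are all assumptions of this sketch; the surveyed systems use more elaborate weightings.

```python
import math
from collections import Counter


def term_vector(text: str) -> Counter:
    """Represent text as a sparse term vector; weights are raw term counts."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term vectors (0.0 if either is empty)."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query: str, documents: dict) -> list:
    """Return document names ordered from most to least similar to the query."""
    q = term_vector(query)
    scores = {name: cosine_similarity(q, term_vector(text))
              for name, text in documents.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical three-document collection, for illustration only.
documents = {
    "d1": "skip graph range query",
    "d2": "gnutella flooding query",
    "d3": "apples oranges",
}
print(rank("range query", documents))
```

A document sharing no terms with the query scores zero, which is why the choice of term weights matters so much for the ordering of the remaining documents.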
One is term frequency. The more a term is repeated in a document, the more important the term is. Another is inverse document frequency. Terms common to many documents give less information about the content of a document. Then there is document length. Larger documents can bias term frequencies, so weightings are sometimes normalized against document length. The expression "TF-IDF weighting" refers to the collection of weighting calculations that incorporate term frequency and inverse document frequency, not just to one. Two weighting calculations have been particularly dominant: Okapi [329] and pivoted normalization [330]. A distributed version of Google's Pagerank algorithm has also been devised for a P2P environment [331]. It allows incremental, ongoing Pagerank calculations while documents are inserted and deleted.

A couple of early P2P systems leveraged the vector model. Building on the vector model, PlanetP divided the ranking problem into two steps [215]. In the first, peers are ranked for the probability that they have matching documents. In the second, higher priority peers are contacted and the matching documents are ranked. An Inverse Peer Frequency, analogous to the Inverse Document Frequency, is used to rank relevant peers. To further constrain the query traffic, PlanetP contacts only the first group of m peers to retrieve a relevant set of documents. In this way, it repeatedly contacts groups of m peers until the top k document rankings are stable. While the PlanetP designers first quantified recall and precision, they also considered reliability. Each PlanetP peer has a global index with a list of all other peers, their IP addresses, and their Bloom filters. This large volume of shared information needs to be maintained. Klampanos and Jose saw this as PlanetP's primary shortcoming [332]. Each Bloom filter summarized the set of terms in the local index of each peer. The time to propagate changes, be they new documents or peer arrivals/departures, was studied by simulation for up to 1000 peers. The reported propagation times were in the hundreds of seconds. Design workarounds were required for PlanetP to be viable across slower dial-up modem connections. For future work, the authors were considering some sort of hierarchy to scale to larger numbers of peers.

A second early system using the vector model is the Fault-tolerant, Adaptive, Scalable Distributed search engine [263], which extended the Freenet design (Section 2.3) for richer queries. The original Freenet design could find a document based on a globally unique identifier. Kronfol's design added the ability to search, for example, for documents about "apples AND oranges NOT bananas". It uses a TF-IDF weighting scheme to build a document's term vector. Each peer calculates the similarity of the query vector and local documents and forwards the query to the best downstream peer. Once the best downstream peer returns a result, the second-best peer is tried, and so on. Simulations with 1000 nodes gave an indication of the query path lengths in various situations: when routing queries in a network with constant rates of node and document insertion, when bootstrapping the network in a worst-case ring topology, or when failing randomly and specifically selected peers. Kronfol claimed excellent average-case performance: less than 20 hops to retrieve the same top n results as a centralized search engine. There were, however, numerous cases where the worst-case path length was several hundred hops in a network of only 1000 nodes.

In parallel, there have been some P2P designs based on the vector model from the University of Rochester: pSearch¹⁰ [9,333] and eSearch [334]. The early pSearch paper suggested a couple of retrieval models, one of which was the Vector Space Model, to search only the nodes likely to have matching documents. To obtain approximate global statistics for the TF-IDF calculation, a spanning tree was constructed across a subset of the peers. For the m top terms, the term-to-document index was inserted into a Content-Addressable Network [294]. A variant which mapped terms to document clusters was also suggested. eSearch is a hybrid of the partition-by-document and partition-by-term approaches seen in the previous section. eSearch nodes are primarily partitioned by term. Each is responsible for the inverted lists for some top terms. For each document in the inverted list, the node stores the complete term list. To reduce the size of the index, the complete term lists for a document are only kept on nodes that are responsible for top terms in the document. eSearch uses the Okapi term weighting to select top terms. It relies on the Chord DHT [34] to associate terms with nodes storing the inverted lists. It also uses automatic query expansion. This takes the significant terms from

¹⁰ The pSearch design had earlier been proposed under the name PeerSearch.
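The weighting factors discussed in this section (term frequency, inverse document frequency and document-length normalization) can be combined in many ways. The sketch below shows one basic variant only; it is neither Okapi nor pivoted normalization, and the function name `tf_idf_vectors`, the whitespace tokenization and the log(N/df) form of the inverse document frequency are assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(documents: dict) -> dict:
    """Return {doc name: {term: weight}} using a basic length-normalised TF-IDF."""
    tokenised = {name: text.lower().split() for name, text in documents.items()}
    n_docs = len(tokenised)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for terms in tokenised.values():
        doc_freq.update(set(terms))

    vectors = {}
    for name, terms in tokenised.items():
        counts = Counter(terms)
        vectors[name] = {
            # (term count / document length) * log(N / document frequency)
            term: (count / len(terms)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


# Hypothetical two-document collection, for illustration only.
vectors = tf_idf_vectors({"a": "peer peer search", "b": "peer index"})
print(vectors)
```

In this toy collection the term "peer" occurs in every document, so its weight is zero under this variant, which reflects the point that terms common to many documents carry little information.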
the top document matches and automatically adds them to the user's query to find additional relevant documents. The eSearch performance was quantified in terms of search precision, the number of retrieved documents, and various load-balancing metrics. Compared to the more common proposals for partitioning by keywords, eSearch consumed 6.8 times the storage space to achieve faster search times.

4.2.2. Latent semantic indexing

Another retrieval model used in P2P proposals is Latent Semantic Indexing (LSI) [335]. Its key idea is to map both the document and query vectors to a concept space with lower dimensions. The starting point is a t × N weighting matrix, where t is the total number of indexed terms, N is the total number of documents, and the matrix elements could be TF-IDF rankings. Using singular value decomposition, this matrix is reduced to a smaller number of dimensions, while retaining the more significant term-to-document mappings. Baeza-Yates and Ribeiro-Neto suggested that LSI's value is a novel theoretic framework, but that its practical performance advantage for real document collections had yet to be proven [242]. pSearch incorporated LSI [9]. By placing the indices for semantically similar documents close in the network, Tang et al. touted significant bandwidth savings relative to the early full-flooding variant of Gnutella [333]. They plotted the number of nodes visited by a query. They also explored the tradeoff with accuracy, the percentage match between the documents returned by the distributed pSearch algorithm and those from a centralized LSI baseline. In a more recent update to the pSearch work, Tang et al. summarized LSI's shortcomings [336]. Firstly, for large document collections, its retrieval quality is inherently inferior to Okapi. Secondly, singular value decomposition consumes excessive memory and computation time. Consequently, the authors used Okapi for searching while retaining LSI for indexing. With Okapi, they selected the next node to be searched and selected documents on searched nodes. With LSI, they ensured that similar documents are clustered near each other, thereby optimizing the network search costs. When retrieving a small number of top documents, the precision of LSI + Okapi approached that of Okapi. However, if retrieving a large number of documents, the LSI + Okapi precision is inferior. The authors want to improve this in future work.

5. Queries

Database research suggests directions for P2P research. Hellerstein observed that, while work on fast P2P indexes is well underway, P2P query optimization remains a promising topic for future research [23]. Kossman reviewed the state of the art of distributed query processing, highlighting areas for future research: simulation and query optimization for networks of tens of thousands of servers and millions of clients; non-relational data types like XML, text and images; and partial query responses since on the Internet "failure is the rule rather than the exception" [19]. A primary motivation for the P2P system, PIER, was to scale from the largest database systems of a few hundred nodes to an Internet environment in which there are over 160 million nodes [22]. Litwin and Sahri have also considered ways to combine distributed hashing, more specifically the Scalable Distributed Data Structures, with SQL databases, claiming to be first to implement scalable distributed database partitioning [337]. Motivated by the lack of transparent distribution in current distributed databases, they measure query execution times for Microsoft SQL servers aggregated by means of an SDDS layer. One of their starting assumptions was that it is too challenging to change the SQL query optimizer.

Database research also suggests the approach to P2P research. Researchers of database query optimization were divided between those looking for optimal solutions in special cases and those using heuristics to answer all queries [338]. Gribble et al. cast query optimization in terms of the data placement problem, which is to distribute data and work so the full query workload is answered with lowest cost under the existing bandwidth and resource constraints [231]. They pointed out that even the static version of this problem is NP-complete in P2P networks. Consequently, research on massive, dynamic P2P networks will likely progress using both strategies of early database research: heuristics and special-case optimizations.

If P2P networks are going to be adaptable, if they are to support a wide range of applications, then they need to accommodate many query types [72]. Up to this point, we have reviewed queries for keys (Section 3) and keywords (Sections 4.1 and 4.2). Unfortunately, a major shortcoming of the DHTs in Sections 3.3–3.7 is that they primarily support exact-match, single-key queries. Skip Graphs support range and prefix queries, but not aggregation
queries. Here we probe below the language syntax to identify the open research issues associated with more expressive P2P queries [25]. Triantafillou and Pitoura observed the disparate P2P designs for different types of queries and so outlined a unifying framework [76]. To classify queries, they considered the number of relations (single or multiple), the number of attributes (single or multiple) and the type of query operator. They described numerous operators: equality, range, join and "special functions". The latter referred to aggregation (like sum, count, average, minimum and maximum), grouping and ordering. The following sections approximately fit their taxonomy: range queries, multi-attribute queries, join queries and aggregation queries. There has been some initial P2P work on other query types: continuous queries [20,22,73], recursive queries [22,74] and adaptive queries [23,75]. For these, we defer to the primary references.

Fig. 4. Solutions for range queries on P2P and SDDS indexes. Note 1. Although several of the authors based their work on one particular DHT, it may be possible to port their work to others [38,41,77–88,100].

5.1. Range queries
The support of efficient range predicates in P2P networks was identified as an important open research issue by Huebsch et al. [22]. Range partitioning has been important in parallel databases to improve performance, so that a transaction commonly needs data from only one disk or node [22]. One type of range search, longest prefix match, is important because of its prevalence in routing schemes for voice and data networks alike. In other applications, users may pose broad, inexact queries, even though they require only a small number of responses. Consequently techniques to locate similar ranges are also important [77]. Various proposals for range searches over P2P networks are summarized in Fig. 4. Since the Scalable Distributed Data Structure (SDDS) has been an important influence on contemporary Distributed Hash Tables (DHTs) [49–51], we also include ongoing work on SDDS range searches.

The papers on P2P range search can be divided into those that rely on an underlying DHT (the first five entries in Fig. 4) and those that do not (the subsequent three entries). Bharambe et al. argued that DHTs are inherently ill-suited to range queries [84]. The very feature that makes for their good load balancing properties, randomized hash functions, works against range queries. One possible solution would be to hash ranges, but this can require a priori partitioning. If the partitions are too large, partitions risk overload. If they are too small, there may be too many hops.

Despite these potential shortcomings, there have been several range query proposals based on DHTs. If hashing ranges to nodes, it is entirely possible that overlapping ranges map to different nodes. Gupta et al. rely on locality sensitive hashing to ensure that, with high probability, similar ranges are mapped to the same node [77]. They propose one particular family of locality sensitive hash functions, called min-wise independent permutations. The number of partitions per node and the path length were plotted against the total numbers of peers in the system. For a network with 1000 nodes, the hop-count distribution was very similar to that of the exact-matching Chord scheme. Was it load-balanced? For the same network with 50,000 partitions, there were over two orders of magnitude variation in the number of partitions at each node (first and ninety-ninth percentiles). The Prefix Hash Tree is a trie in which prefixes are hashed onto any DHT. The preliminary analysis suggests efficient doubly logarithmic lookup, balanced load and fault resilience [78,79]. Andrzejak and Xu were perhaps the first to propose a mapping from ranges to DHTs [80]. They use one particular Space Filling Curve, the Hilbert curve, over a Content Addressable Network (CAN) construction (Section 3.5). They maintain two properties: nearby ranges map
to nearby CAN zones; if a range is split into two sub-ranges, then the zones of the sub-ranges
partition the zone of the primary range. They plot path length and load proxy measures (the total
number of messages and nodes visited) for three algorithms to propagate range queries: brute force,
controlled flooding and directed controlled flooding. Schmidt and Parashar also advocated Space
Filling Curves to achieve range queries over a DHT [81]. However, they point out that, while
Andrzejak and Xu use an inverse Space Filling Curve to map a one-dimensional space to d-dimensional
zones, they map a d-dimensional space back to a one-dimensional index. Such a construction gives the
ability to search across multiple attributes (Section 5.2). Tanin et al. suggested quadtrees over
Chord [82], and gave preliminary simulation results for query response times.

Because DHTs are naturally constrained to exact-match, single-key queries, researchers have
considered other P2P indexes for range searches. Several were based on Skip Graphs [38,41] which,
unlike the DHTs, do not necessitate randomizing hash functions and are therefore capable of range
searches. Unfortunately, they are not load balanced [83]. For example, in SkipNet [48], hashing was
added to balance the load: the Skip Graph could support range searches or load balancing, but not
both. One solution for load balancing relies on an increased number of virtual servers [168] but, in
their search for a system that can both search for ranges and balance loads, Bharambe et al.
rejected the idea [84]. The virtual servers work assumed that load imbalance stems from hashing,
that is, from skewed data insertions and deletions. In some situations, the imbalance is triggered
by a skewed query load. In such circumstances, additional virtual servers can increase the number of
routing hops and increase the number of pointers that a Skip Graph needs to maintain. Ganesan et al.
devised an alternate method to balance load [83]. They proposed two Skip Graphs, one to index the
data itself and the other to track load at each node in the system. Each node is able to determine
the load on its neighbours and the most (least) loaded nodes in the system. They devise two
algorithms: NBRADJUST balances load on neighbouring nodes; using REORDER, empty nodes can take over
some of the tuples on heavily loaded nodes. Their simulations focus on skewed storage load, rather
than on skewed query loads, but they surmise that the same approach could be used for the latter.

Other proposals for range queries avoid both the DHT and the Skip Graph. Bharambe et al. distinguish
their Mercury design by its support for multi-attribute range queries and its explicit load
balancing [84]. In Mercury, nodes are grouped into routing hubs, each of which is responsible for
various query attributes. While it does not use hashing, Mercury is loosely similar to the DHT
approaches: nodes within hubs are arranged into rings, like Chord [34]; for efficient routing within
hubs, k long-distance links are used, like Symphony [339]. Range lookups require O(log^2 n / k)
hops. Random sampling is used to estimate the average load on nodes and to find the parts of the
overlay that are lightly loaded. Whereas Symphony assumed that nodes are responsible for ranges of
approximately equal size, Mercury's random sampling can determine the location of the start of the
range, even for non-uniform ranges [84]. P-Grid [42] does provide for range queries, by virtue of
the key ordering in its tree structures. Ganesan et al. critiqued its capabilities [83]: P-Grid
assumes fixed-capacity nodes; there was no formal characterization of imbalance ratios or balancing
costs; every P-Grid node periodically contacts other nodes for load information.

The work on Scalable Distributed Data Structures (SDDSs) has progressed in parallel with P2P work
and has addressed range queries. Like the DHTs above, the early SDDS Linear Hashing (LH*) schemes
were not order-preserving [52]. To facilitate range queries, Litwin et al. devised a Range
Partitioning variant, RP* [87]. There are options to dispense with the index, to add indexes to
clients and to add them to servers. In the variant without an index, every query is issued via
multicasting. The other variants also use some multicasting. The initial RP* paper suggested
scalability to thousands of sites, but a more recent RP* simulation was capped at 140 servers [88].
In that work, Tsangou et al. investigated TCP and UDP mechanisms by which servers could return range
query results to clients. The primary metrics were search and response times. Amongst the commercial
parallel database management systems, they reported that the largest seems only to scale to 32
servers (SQL Server 2000). For future work, they planned to explore aggregation of query results,
rather than establishing a connection between the client and every single server with a response.

All in all, it seems there are numerous open research questions on P2P range queries. How
realistic is the maintenance of global load statistics considering the scale and dynamism of P2P
networks? Simulations at larger scales are required. Proposals should take into account both the
storage load (insert and delete messages) and the query load (lookup messages). Simplifying
assumptions need to be attacked. For example, how well do the above solutions work in networks with
heterogeneous nodes, where the maximum message loads and index sizes are node-dependent?

5.2. Multi-attribute queries

There has been some work on multi-attribute P2P queries. As late as September 2003, it was suggested
that there has not been an efficient solution [76].

Again, an early significant work on multi-attribute queries over aggregated commodity nodes
germinated amongst SDDSs. k-RP* [89] uses the multi-dimensional binary search tree (or k-d tree,
where k indicates the number of dimensions of the search index) [340]. It builds on the RP* work
from the previous section and inherits its capabilities for range search and partial match. Like the
other SDDSs, k-RP* indexes can fit into RAM for very fast lookup. For future work, Litwin and Neimat
suggested (a) a formal analysis of the range search termination algorithm and the k-d paging
algorithm, (b) a comparison with other multi-attribute data structures (quad-trees and R-trees) and
(c) exploration of query processing, concurrency control and transaction management for k-RP* files
[89]. On the latter point, others have considered transactions to be inconsequential to the core
problem of supporting more complex queries in P2P networks [72].

In architecting their secure wide-area Service Discovery Service (SDS), Hodes et al. considered
three possible designs for multi-criteria search: Centralization, Mapping and Flooding [90]. These
correlate to the index classifications of Section 2: Central, Distributed and Local. They discounted
the centralized, Napster-like index for its risk of a single point of failure. They considered the
hash-based mappings of Section 3 but concluded that it would not be possible to adequately partition
data. A document satisfying many criteria would be wastefully stored in many partitions. They
rejected full flooding for its lack of scalability. Instead, they devised a query filtering
technique, reminiscent of Gnutella's query routing protocol (Section 4.1). Nodes push proactive
summaries of their data rather than waiting for a query. Summaries are aggregated and stored
throughout a server hierarchy, to guide subsequent queries. Some initial prototype measurements were
provided for total load on the system, but not for load distribution. They put several issues
forward for future work. The indexing needs to be flexible to change according to query and storage
workloads. A mesh topology might improve on their hierarchic topology since query misses would not
propagate to root servers. The choice is analogous to BGP meshes and DNS trees.

More recently, Cai et al. devised the Multi-Attribute Addressable Network (MAAN) [91]. They built on
Chord to provide both multi-attribute and range queries, claiming to be the first to service both
query types in a structured P2P system. Each MAAN node has O(log N) neighbours, where N is the
number of nodes. MAAN multi-attribute range queries require O(log N + N × s_min) hops, where s_min
is the minimum range selectivity across all attributes. Selectivity is the ratio of the query range
to the entire identifier range. The paper assumed that a locality preserving hash function would
ensure balanced load. Per Section 5.1, the arguments by Bharambe et al. have highlighted the
shortcomings of this assumption [84]. MAAN required that the schema be fixed and known in advance;
adaptable schemas were recommended for subsequent attention. The authors also acknowledged that
there is a selectivity breakpoint at which full flooding becomes more efficient than their scheme.
This begs for a query resolution algorithm that adapts to the profile of queries. Cai and Frank
followed up with RDFPeers [55]. They differentiate their work from other RDF proposals by (a)
guaranteeing to find query results if they exist and (b) removing the requirement of prior
definition of a fixed schema. They hashed <subject, predicate, object> triples onto the MAAN and
reported routing hop metrics for their implementation. Load imbalance across nodes was reduced to
less than one order of magnitude, but the specific measure was the number of triples stored per
node; skewed query loads were not considered. They plan to improve load balancing with the virtual
servers of Section 5.1 [168].

5.3. Join queries

Two research teams have done some initial work on P2P join operations. Harren et al. initially
described a three-layer architecture: storage, DHT and query processing. They implemented the join
operation by modifying an existing Content Addressable Network (CAN) simulator, reporting
significant hot-spots in all dimensions: storage, processing and routing [72]. They progressed their
design more recently in the context of PIER, a distributed query engine based on CAN [22,341]. They
implemented two equi-join algorithms. In their design, a key is constructed from the namespace and
the resource ID. There is a namespace for each relation and the resource ID is the primary key for
base tuples in that relation. Queries are multicast to all nodes in the two namespaces (relations)
to be joined. Their first algorithm is a DHT version of the symmetric hash join. Each node in the
two namespaces finds the relevant tuples and hashes them to a new query namespace. The resource ID
in the new namespace is the concatenation of join attributes. In the second algorithm, called fetch
matches, one of the relations is already hashed on the join attributes. Each node in the second
namespace finds tuples matching the query and retrieves the corresponding tuples from the first
relation. They leveraged two other techniques, namely the symmetric semi-join rewrite and the Bloom
filter rewrite, to reduce the high bandwidth overheads of the symmetric hash join. For an overlay of
10,000 nodes, they simulated the delay to retrieve tuples and the aggregate network bandwidth for
these four schemes. The initial prototype was on a cluster of 64 PCs, but it has more recently been
expanded to PlanetLab.

Triantafillou and Pitoura considered multicasting to large numbers of peers to be inefficient [76].
They therefore allocated a limited number of special peers, called range guards. The domain of the
join attributes was divided, one partition per range guard. Join queries were sent only to range
guards, where the query was executed. Efficient selection of range guards and a quantitative
evaluation of their proposal were left for future work.

5.4. Aggregation queries

Aggregation queries invariably rely on tree structures to combine results from a large number of
nodes. Examples of aggregation queries are Count, Sum, Maximum, Minimum, Average, Median and Top-K
[92,342,343]. Fig. 5 summarizes the tree and query characteristics that affect dependability. The
fundamental design choices for aggregation trees relate to how the overlay uses DHTs, how it repairs
itself when there are failures, how many aggregation trees there are, and whether the tree is static
or dynamic (Fig. 5). Astrolabe is one of the most influential P2P designs included in Fig. 5, yet it
makes no use of DHTs [92]. Other designs make use of the internal trees of Plaxton-like DHTs. Others
build independent tree structures on top of
Fig. 5. Aggregation trees and queries in P2P networks. Key: Astrolabe [92]; Cone [93]; Distributed Approximative System Information
Service (DASIS) [95]; Scalable Distributed Information Management System (SDIMS) [98]; Self-Organized Metadata Overlay (SOMO)
[56]; Wildfire [99]; Willow [32]; Newscast [97]. See also [342–344].
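As a rough illustration of how an aggregation tree can combine results from many nodes, the sketch below merges fixed-size partial aggregates up a tree to answer Count, Sum, Minimum, Maximum and Average. It is not taken from any of the systems in Fig. 5: the names Partial, Node and aggregate, and the alive flag used to mimic a failed interior node, are hypothetical and chosen only for this example. Median and Top-K are left out because they cannot be merged from such simple partials.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Partial:
    """Mergeable partial aggregate (hypothetical helper for this sketch)."""
    count: int
    total: float
    minimum: float
    maximum: float

    @staticmethod
    def of(value: float) -> "Partial":
        # A single local value is its own count, sum, minimum and maximum.
        return Partial(1, value, value, value)

    def merge(self, other: "Partial") -> "Partial":
        # Merging is associative and commutative, so the tree shape is irrelevant.
        return Partial(
            self.count + other.count,
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )

    @property
    def average(self) -> float:
        return self.total / self.count


@dataclass
class Node:
    """A peer holding one local value and links to its children in the tree."""
    value: float
    children: List["Node"] = field(default_factory=list)
    alive: bool = True  # assumption: a crude stand-in for node failure


def aggregate(node: Node) -> Partial:
    """Combine this node's value with the partials reported by its live children."""
    result = Partial.of(node.value)
    for child in node.children:
        if child.alive:  # best-effort: a failed child silently drops its whole subtree
            result = result.merge(aggregate(child))
    return result


if __name__ == "__main__":
    interior = Node(2.0, children=[Node(4.0), Node(8.0)])
    root = Node(6.0, children=[interior, Node(10.0)])
    print(aggregate(root))
    interior.alive = False  # the two leaves below it are no longer counted
    print(aggregate(root))
```

Marking an interior node as failed removes its entire subtree from the result, which is one way to see why the text treats tree repair and the choice between single and multiple trees as dependability issues.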
DHTs. Most of the designs repair the aggregation tree with periodic mechanisms similar to those used
in the DHTs themselves. Willow is an exception [32]. It uses a Tree Maintenance Protocol to zip
disjoint aggregation trees together when there are major failures. Yalagandula and Dahlin found
reconfigurations at the aggregation layer to be costly, suggesting more research on techniques to
reduce the cost and frequency of such reconfigurations [98]. Many of the designs use multiple
aggregation trees, each rooted at the DHT node responsible for the aggregation attribute. On the
other hand, the Self-Organized Metadata Overlay [56] uses a single tree and is vulnerable to a
single point of failure at its root.

At the time of writing, researchers have just begun exploring the performance of queries in the
presence of churn. Most designs are for best-effort queries. Bawa et al. devised a better
consistency model, called Single-Site Validity [99], to qualify the accuracy of results when there
is churn. Its price was a five-fold increase in the message load, when compared to an efficient but
best-effort Spanning Tree. Gossip mechanisms are resilient to churn, but they delay aggregation
results and incur high message cost for aggregation attributes with small read-to-write ratios.

6. Conclusions

Research on peer-to-peer networks can be divided into four categories: search, storage, security and
applications. This critical survey has focused on search methods. While P2P networks have been
classified by the existence of an index (structured or unstructured) or the location of the index
(local, centralized and distributed), this survey has shown that most have evolved to have some
structure, whether it is indexes at superpeers or indexes defined by DHT algorithms. As for
location, the distributed index is most common. The survey has characterized indexes as semantic and
semantic-free. It has also critiqued P2P work on major query types. While much of it addresses work
from 2000 or later, we have traced important building blocks from the 1990s.

The initial motivation in this survey was to answer the question, "How robust are P2P search
networks?" The question is key to the deployment of P2P technology. Balakrishnan et al. argued that
the P2P architecture is appealing: the startup and growth barriers are low; they can aggregate
enormous storage and processing resources; the decentralized and distributed nature of P2P systems
gives them the potential to be robust to faults or intentional attacks [18]. If P2P is to be a
disruptive technology in applications other than casual file sharing, then robustness needs to be
practically verified [20].

The best comparative research on P2P dependability has been done in the context of Distributed Hash
Tables (DHTs) [270]. The entire body of DHT research can be distilled to four main observations
about dependability (Section 3.2). Firstly, static dependability comparisons show that no O(log N)
DHT geometry is significantly more dependable than the other O(log N) geometries. Secondly, dynamic
dependability comparisons show that DHT dependability is sensitive to the underlying topology
maintenance algorithms (Fig. 2). Thirdly, most DHTs use O(log N) geometries to suit ephemeral nodes,
whereas the O(1) hop DHTs suit stable nodes; they deserve more research attention. Fourthly,
although not yet a mature science, the study of DHT dependability is helped by recent simulation
tools that support multiple DHTs [278].

We make the following four suggestions for future P2P research:

1. Complete the companion P2P surveys for storage, security and applications. A rough outline has
been suggested in Fig. 1, along with references. The need for such surveys was highlighted within
the peer-to-peer research group of the Internet Research Task Force (IRTF) [17].
2. P2P indexes are maturing. P2P queries are embryonic. Work on more expressive queries over P2P
indexes started to gain momentum in 2003, but remains fraught with efficiency and load issues.
3. Isolate the low-level mechanisms affecting robustness. There is limited value in comparing
robustness of DHT geometries (like rings versus de Bruijn graphs), when robustness is highly
sensitive to underlying topology maintenance algorithms (Fig. 2).
4. Build consensus on robustness metrics and their acceptable ranges. This paper has teased out
numerous measures that impinge on robustness, for example, the median query path length for a
failure of x% of nodes, bisection width, path overlap, the number of alternatives available for the
next hop, lookup latency, average live bandwidth (bytes/node/s), successful routing rates, the
number of timeouts (caused by a finger pointing to a departed node), lookup failure rates (caused by
nodes that temporarily point to the wrong successor during churn) and clustering
measures (edge expansion and node expansion). Application-level robustness metrics need to drive a
consistent assessment of the underlying search mechanics.

References

[1] M. Roussopoulos, M. Baker, D. Rosenthal, T. Guili, P. Maniatis, J. Mogul, 2 P2P or Not 2 P2P? in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[2] A. Rowstron, P. Druschel, Pastry: scalable, distributed object location and routing for large-scale peer-to-peer systems, IFIP/ACM Middleware 2001, November 2001.
[3] B. Yeager, B. Bhattacharjee, Peer-to-Peer Research Group Charter, 2003. Available from: <http://www.irtf.org/charters/p2prg.html>.
[4] T. Klingberg, R. Manfredi, Gnutella 0.6, 2002.
[5] I. Clarke, A distributed decentralised information storage and retrieval system, Undergraduate Thesis, 1999.
[6] B. Zhao, J. Kubiatowicz, A. Joseph, Tapestry: an infrastructure for fault-tolerant wide-area location and routing, Report No. UCB/CSD-01-1141, 2001.
[7] I. Stoica, R. Morris, D. Liben-Nowell, D. Karger, M. Kaashoek, F. Dabek, H. Balakrishnan, Chord: a scalable peer-to-peer lookup service for Internet applications, in: Proceedings of ACM SIGCOMM 2001, pp. 149–160.
[8] S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker, A scalable content-addressable network, in: Proceedings of the Conference on Applications, Technologies, Architectures and Protocols for Computer Communications, August 27–31, 2001, pp. 161–172.
[9] C. Tang, Z. Xu, M. Mahalingam, pSearch: information retrieval in structured overlays, First Workshop on Hot Topics in Networks. Also Computer Communication Review 33 (1) (2003), October 28–29, 2002.
[10] W. Nejdl, S. Decker, W. Siberski, Edutella Project, RDF-based Metadata Infrastructure for P2P Applications. 2003. Available from: <http://edutella.jxta.org/>.
[11] K. Aberer, M. Hauswirth, Peer-to-peer information systems: concepts and models, state-of-the-art, and future systems, in: ACM SIGSOFT Software Engineering Notes, Proceedings of the 8th European Software Engineering Conference held jointly with 9th ACM SIGSOFT International Symposium on Foundations of Software Engineering 26 (5), 2001.
[12] L. Zhou, R. van Renesse, P6P: a peer-to-peer approach to Internet infrastructure, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[13] Citeseer, Citeseer Scientific Literature Digital Library, 2004. Available from: <http://citeseer.ist.psu.edu/>.
[14] D. Milojicic, V. Kalogeraki, R. Lukose, K. Nagaraja, J. Pruyne, B. Richard, S. Rollins, Z. Xu, Peer-to-Peer Computing, HP Technical Report, HPL-2002-57, 2002.
[15] K. Aberer, M. Hauswirth, An overview on peer-to-peer information systems, in: Workshop on Distributed Data and Structures WDAS-2002, 2002.
[16] F. DePaoli, L. Mariani, Dependability in peer-to-peer systems, IEEE Internet Computing 8 (4) (2004) 54–61.
[17] B. Yeager, Proposed research tracks, Email to the Internet Research Task Force IRTF P2P Research Group, November 10, 2003.
[18] H. Balakrishnan, M.F. Kaashoek, D. Karger, R. Morris, I. Stoica, Looking up data in P2P systems, Communications of the ACM 46 (2) (2003) 43–48.
[19] D. Kossmann, The state of the art in distributed query processing, ACM Computing Surveys 32 (4) (2000) 422–469.
[20] B. Gedik, L. Liu, Reliable peer-to-peer information monitoring through replication, in: Proceedings of the 22nd Int'l Symposium on Reliable Distributed Systems, October 6–8, 2003, pp. 56–65.
[21] S.-M. Shi, Y. Guangwen, D. Wang, J. Yu, S. Qu, M. Chen, Making peer-to-peer keyword searching feasible using multi-level partitioning, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[22] R. Huebsch, J.M. Hellerstein, N. Lanham, B.T. Loo, S. Shenker, I. Stoica, Querying the Internet with PIER, in: Proceedings of the 29th Int'l Conference on Very Large Databases VLDB'03, September 2003.
[23] J.M. Hellerstein, Toward network data independence, ACM SIGMOD Record 32 (3) (2003) 34–40.
[24] K. Gummadi, R. Gummadi, S. Gribble, S. Ratnasamy, S. Shenker, I. Stoica, The impact of DHT routing geometry on resilience and proximity, in: Proceedings of 2003 Conference on Applications, Technologies, Architectures and Protocols for Computer Communications, 2003, pp. 381–394.
[25] N. Daswani, H. Garcia-Molina, B. Yang, Open problems in data-sharing peer-to-peer systems, in: The 9th Int'l Conference on Database Theory (ICDT 2003), Siena, Italy, January 8–10, 2003.
[26] B. Cooper, H. Garcia-Molina, Studying search networks with SIL, in: Second Int'l Workshop on Peer-to-Peer Systems IPTPS'03, February 20–21, 2003.
[27] M. Bawa, Q. Sun, P. Vinograd, B. Yang, B. Cooper, A. Crespo, N. Daswani, P. Ganesan, H. Garcia-Molina, S. Kamvar, S. Marti, M. Schlosser, Peer-to-peer research at Stanford, ACM SIGMOD Record 32 (3) (2003) 23–28.
[28] B. Yang, H. Garcia-Molina, Improving search in peer-to-peer networks, in: Proceedings of the 22nd IEEE Int'l Conference on Distributed Computing Systems, July 2002.
[29] B. Yang, H. Garcia-Molina, Efficient search in peer-to-peer networks, in: Proceedings of the 22nd Int'l Conference on Distributed Computing Systems, July 2–5, 2002.
[30] C. Plaxton, R. Rajaraman, A. Richa, Accessing nearby copies of replicated objects in a distributed environment, in: ACM Symposium on Parallel Algorithms and Architectures, 1997.
[31] B. Zhao, L. Huang, J. Stribling, S. Rhea, A. Joseph, J. Kubiatowicz, Tapestry: a resilient global-scale overlay for service deployment, IEEE Journal on Selected Areas in Communications 22 (1) (2004) 41–53.
[32] R. van Renesse, A. Bozdog, Willow: DHT, aggregation and publish/subscribe in one protocol, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[33] P. Ganesan, G. Krishna, H. Garcia-Molina, Canon in G major: designing DHTs with hierarchical structure, in: Proceedings Int'l Conference on Distributed Computing Systems ICDCS 2004.
[34] I. Stoica, R. Morris, D. Liben-Nowell, D. Karger, M. Kaashoek, F. Dabek, H. Balakrishnan, Chord: a scalable peer-to-peer lookup protocol for Internet applications, IEEE/ACM Transactions on Networking 11 (1) (2003) 17–32.
[35] S. Rhea, T. Roscoe, J. Kubiatowicz, Structured peer-to-peer overlays need application-driven benchmarks, in: Proceedings of the 2nd Int'l Workshop on Peer-to-Peer Systems IPTPS'03, February 20–21, 2003.
[36] D. Loguinov, A. Kumar, S. Ganesh, Graph-theoretic analysis of structured peer-to-peer systems: routing distances and fault resilience, in: Proceedings of 2003 Conference on Applications, Technologies, Architectures and Protocols for Computer Communications, August 25–29, 2003, pp. 395–406.
[37] F. Kaashoek, D. Karger, Koorde: a simple degree-optimal hash table, in: Second Int'l Workshop on Peer-to-Peer Systems IPTPS'03, February 20–21, 2003.
[38] N. Harvey, M.B. Jones, S. Saroiu, M. Theimer, A. Wolman, SkipNet: a scalable overlay network with practical locality properties, in: Proceedings of the Fourth USENIX Symposium on Internet Technologies and Systems USITS'03, March 2003.
[39] I. Gupta, K. Birman, P. Linga, A. Demers, R. Van Renesse, Kelips: Building an efficient and stable P2P DHT through increased memory and background overhead, in: Second Int'l Workshop on Peer-to-Peer Systems IPTPS'03, February 20–21, 2003.
[40] J. Cates, Robust and Efficient Data Management for a Distributed Hash Table, Master's Thesis, May 2003.
[41] J. Aspnes, G. Shah, Skip graphs, in: Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms, 2003, pp. 384–393.
[42] K. Aberer, P. Cudre-Mauroux, A. Datta, Z. Despotovic, M. Hauswirth, M. Punceva, R. Schmidt, P-Grid: a self-organizing structured P2P system, ACM SIGMOD Record 32 (3) (2003) 29–33.
[43] B. Zhao, Y. Duan, L. Huang, A. Joseph, J. Kubiatowicz, Brocade: landmark routing on overlay networks, in: First Int'l Workshop on Peer-to-Peer Systems IPTPS'02, March 2002.
[44] S. Ratnasamy, S. Shenker, I. Stoica, Routing algorithms for DHTs: some open questions, in: Proceedings of the First Int'l Workshop on Peer to Peer Systems, IPTPS 2002, March 2002.
[45] P. Maymounkov, D. Mazieres, Kademlia: a peer-to-peer information system based on the XOR metric, in: Proceedings of the First Int'l Workshop on Peer to Peer Systems, IPTPS 2002, March 7–8, 2002.
[46] D. Malkhi, M. Naor, D. Ratajczak, Viceroy: a scalable and dynamic emulation of the butterfly, in: Proceedings of the 21st Annual Symposium on Principles of Distributed Computing PODC, July 21–24, 2002, pp. 183–192.
[47] X. Li, C. Plaxton, On name resolution in peer to peer networks, in: Proceedings of the ACM SIGACT Annual Workshop on Principles of Mobile Computing POMC'02, 2002, pp. 82–89.
[48] N. Harvey, J. Dunagan, M.B. Jones, S. Saroiu, M. Theimer, A. Wolman, SkipNet: a scalable overlay network with practical locality properties, Microsoft Research Technical Report MSR-TR-2002-92, 2002.
[49] D. Karger, E. Lehman, T. Leighton, R. Panigrahy, M. Levin, D. Lewin, Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the World Wide Web, ACM Symposium on Theory of Computing, 1997.
[50] W. Litwin, M. Neimat, D. Schneider, LH*: a scalable, distributed data structure, ACM Transactions on Database Systems TODS 21 (4) (1996) 480–525.
[51] R. Devine, Design and Implementation of DDH: a distributed dynamic hashing algorithm, in: Proceedings of the 4th Int'l Conference on Foundations of Data Organizations and Algorithms, 1993.
[52] W. Litwin, M.-A. Neimat, D. Schneider, LH*: linear hashing for distributed files, in: Proceedings of the ACM Int'l Conference on Management of Data SIGMOD, May 1993.
[53] C. Tempich, S. Staab, A. Wranik, Remindin': semantic query routing in peer-to-peer networks, in: Proceedings of the 13th Conference on World Wide Web, New York, NY, USA, May 17–20, 2004, pp. 640–649.
[54] B.T. Loo, R. Huebsch, I. Stoica, J.M. Hellerstein, The case for a hybrid P2P search infrastructure, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[55] M. Cai, M. Frank, RDFPeers: a scalable distributed RDF repository based on a structured peer-to-peer network, in: Proceedings of the 13th Conference on World Wide Web, May 17–20, 2004, pp. 650–657.
[56] Z. Zhang, S.-M. Shi, J. Zhu, SOMO: self-organized metadata overlay for resource management in P2P DHTs, in: Second Int'l Workshop on Peer-to-Peer Systems IPTPS'03, February 20–21, 2003.
[57] B. Yang, H. Garcia-Molina, Designing a super-peer network, in: Proceedings of the 19th Int'l Conference on Data Engineering ICDE, March 2003.
[58] I. Tatarinov, P. Mork, Z. Ives, J. Madhavan, A. Halevy, D. Suciu, N. Dalvi, X. Dong, Y. Kadiyska, G. Miklau, The Piazza peer data management project, ACM SIGMOD Record 32 (3) (2003) 47–52.
[59] W. Nejdl, W. Siberski, M. Sintek, Design issues and challenges for RDF- and schema-based peer-to-peer systems, ACM SIGMOD Record 32 (3) (2003) 41–46.
[60] S. Joseph, T. Hoshiai, Decentralized meta-data strategies: effective peer-to-peer search, IEICE Transactions on Communication E86-B (6) (2003) 1740–1753.
[61] Y. Chawathe, S. Ratnasamy, L. Breslau, N. Lanham, S. Shenker, Making Gnutella-like P2P systems scalable, in: Proceedings of 2003 Conference on Applications, Technologies, Architectures and Protocols for Computer Communications, August 25–29, 2003, pp. 407–418.
[62] M. Bawa, G.S. Manku, P. Raghavan, SETS: search enhanced by topic segmentation, in: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2003, pp. 306–313.
[63] H. Sunaga, M. Takemoto, T. Iwata, Advanced peer to peer network platform for various services: SIONet Semantic Information Oriented Network, in: Proceedings of the Second Int'l Conference on Peer to Peer Computing, September 5–7, 2002, pp. 169–170.
[64] M. Schlosser, M. Sintek, S. Decker, W. Nejdl, HyperCuP: Hypercubes, Ontologies and P2P Networks, in: Springer Lecture Notes on Computer Science, Agents and Peer-to-Peer Systems, vol. 2530, 2002.
[65] M. Ripeanu, A. Iamnitchi, P. Foster, Mapping the Gnutella network, IEEE Internet Computing 6 (1) (2002) 50–57.
[66] Q. Lv, S. Ratnasamy, S. Shenker, Can heterogeneity make Gnutella scalable? in: Proceedings of the 1st Int'l Workshop on Peer-to-Peer Systems IPTPS2002, March 7–8, 2002.
[67] Q. Lv, P. Cao, E. Cohen, K. Li, S. Shenker, Search and replication in unstructured peer to peer networks, in: Proceedings of the 16th International Conference on Supercomputing, June 22–26, 2002, pp. 84–95.
[68] V. Kalogeraki, D. Gunopulos, D. Zeinalipour-Yazti, XML schemas: integration and translation: a local search mechanism for peer to peer networks, in: Proceedings of the 11th ACM International Conference on Information and Knowledge management, 2002, pp. 300–307.
[69] O. Babaoglu, H. Meling, A. Montresor, Anthill: a framework for the development of agent-based peer-to-peer systems, in: Proceedings of the IEEE Int'l Conference on Distributed Computer Systems, 2002, pp. 15–22.
[70] M. Jovanovic, Modeling large-scale peer-to-peer networks and a case study of Gnutella, Master's thesis, 2001.
[71] I. Clarke, O. Sandberg, B. Wiley, T. Hong, Freenet: A Distributed Anonymous Information Storage and Retrieval System, Springer, New York, USA, 2001.
[72] J. Harren, J. Hellerstein, R. Huebsch, B. Loo, S. Shenker, I. Stoica, Complex queries in DHT-based peer-to-peer networks, in: Proceedings of the First Int'l Workshop on Peer to Peer Systems IPTPS 2002, March 2002.
[73] B. Gedik, L. Liu, PeerCQ: a decentralized and self-configuring peer-to-peer information monitoring system, in: Proceedings of the 23rd Int'l Conference on Distributed Computing Systems ICDCS2003, May 19–22, 2003.
[74] B.T. Loo, R. Huebsch, J.M. Hellerstein, T. Roscoe, I. Stoica, Analyzing P2P Overlays with Recursive Queries, Technical Report, CSD-04-1301, January 14, 2004.
[75] R. Avnur, J. Hellerstein, Eddies: continuously adaptive query processing, in: Proceedings of 2000 ACM SIGMOD International Conference on Management of Data 2000, pp. 261–272.
[76] P. Triantafillou, T. Pitoura, Towards a unifying framework for complex query processing over structured peer-to-peer data networks, in: Proceedings of the First Int'l Workshop on Databases, Information Systems and Peer-to-Peer Computing DBISP2P, September 7–8, 2003, pp. 169–183.
[77] A. Gupta, D. Agrawal, A.E. Abbadi, Approximate range selection queries in peer-to-peer systems, in: Proceedings of the First Biennial Conference on Innovative Data Systems Research CIDR 2003, 2003.
[78] S. Ratnasamy, P. Francis, M. Handley, Range queries in DHTs, Technical Report IRB-TR-03-009, July 2003.
[79] S. Ramabhadran, S. Ratnasamy, J. Hellerstein, S. Shenker, Brief announcement: prefix hash tree, in: Proceedings of the 23rd Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, PODC 2004, July 25–28, 2004, p. 368.
[80] A. Andrzejak, Z. Xu, Scalable, efficient range queries for grid information services, in: Proceedings of the Second IEEE Int'l Conference on Peer to Peer Computing, September 2002.
[81] C. Schmidt, M. Parashar, Enabling flexible queries with guarantees in P2P systems, IEEE Internet Computing 8 (3) (2004) 19–26.
[82] E. Tanin, A. Harwood, H. Samet, Indexing distributed complex data for complex queries, in: Proceedings of the National Conference on Digital Government Research 2004, pp. 81–90.
[83] P. Ganesan, M. Bawa, H. Garcia-Molina, Online balancing of range-partitioned data with applications to peer-to-peer systems, in: Proceedings of the 30th Int'l Conference on Very Large Data Bases VLDB 2004, August 29–September 3, 2004.
[84] A. Bharambe, M. Agrawal, S. Seshan, Mercury: supporting scalable multi-attribute range queries, SIGCOMM'04, August 30–September 3, 2004.
[85] K. Aberer, Scalable data access in P2P systems using unbalanced search trees, in: Workshop on Distributed Data and Structures WDAS-2002, 2002.
[86] K. Aberer, A. Datta, M. Hauswirth, The Quest for Balancing Peer Load in Structured Peer-to-Peer Systems, Technical Report IC/2003/32, 2003.
[87] W. Litwin, M.-A. Neimat, D. Schneider, RP*: a family of order-preserving scalable distributed data structures, in: Proceedings of the 20th Int'l Conference on Very Large Data Bases VLDB'94, September 12–15, 1994.
[88] M. Tsangou, S. Ndiaye, M. Seck, W. Litwin, Range queries to scalable distributed data structure RP*, in: Proceedings of the Fifth Workshop on Distributed Data and Structures, WDAS 2003, June 2003.
[89] W. Litwin, M.-A. Neimat, k-RP*s: a scalable distributed data structure for high-performance multi-attributed access, in: Proceedings of the Fourth Int'l Conference on Parallel and Distributed Information Systems, 1996, pp. 120–131.
[90] T. Hodes, S. Czerwinski, B. Zhao, A. Joseph, R. Katz, An architecture for secure wide-area service discovery, Wireless Networks 8 (2/3) (2002) 213–230.
[91] M. Cai, M. Frank, J. Chen, P. Szekely, MAAN: a multi-attribute addressable network for grid information services, in: Proceedings of the Int'l Workshop on Grid Computing, November 2003.
[92] R. van Renesse, K.P. Birman, W. Vogels, Astrolabe: a robust and scalable technology for distributed system monitoring, management and data mining, ACM Transactions on Computer Systems 21 (2) (2003) 164–206.
[93] R. Bhagwan, G. Varghese, G. Voelker, Cone: Augmenting DHTs to support distributed resource discovery, Technical Report, CS2003-0755, July 2003.
[94] K. Albrecht, R. Arnold, R. Wattenhofer, Join and Leave in Peer-to-Peer Systems: The DASIS Approach, Technical Report 427, Department of Computer Science, November 2003.
[95] K. Albrecht, R. Arnold, R. Wattenhofer, Aggregating information in peer-to-peer systems for improved join and leave, in: Proceedings of the Fourth IEEE Int'l Conference on Peer-to-Peer Computing, August 25–27, 2004.
[96] A. Montresor, M. Jelasity, O. Babaoglu, Robust Aggregation Protocol for Large-Scale Overlay Networks, Technical Report UBLCS-2003-16, December 2003.
[97] M. Jelasity, W. Kowalczyk, M. van Steen, An approach to aggregation in large and fully distributed peer-to-peer overlay networks, in: Proceedings of the 12th Euromicro Conference on Parallel, Distributed and Network-based Processing PDP 2004, February 2004.
[98] P. Yalagandula, M. Dahlin, A scalable distributed information management system, in: SIGCOMM'04, August 30–September 3, 2004.
[99] M. Bawa, A. Gionis, H. Garcia-Molina, R. Motwani, The price of validity in dynamic networks, in: Proceedings of 2004 ACM SIGMOD Int'l Conference on the Management of Data 2004, pp. 515–526.
[100] J. Aspnes, J. Kirsch, A. Krishnamurthy, Load balancing and locality in range-queriable data structures, in: Proceedings of the 23rd Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing PODC 2004, July 25–28, 2004.
[101] G. On, J. Schmitt, R. Steinmetz, The effectiveness of realistic replication strategies on quality of availability for peer-to-peer systems, in: Proceedings of the Third Int'l IEEE Conference on Peer-to-Peer Computing, September 1–3, 2003, pp. 57–64.
[102] D. Geels, J. Kubiatowicz, Replica management should be a game, in: Proceedings of the SIGOPS European Workshop, September 2003.
[103] E. Cohen, S. Shenker, Replication strategies in unstructured peer to peer networks, in: Proceedings of 2002 Conference on Applications, Technologies, Architectures and Protocols for Computer Communications 2002, pp. 177–190.
[104] E. Cohen, S. Shenker, P2P and multicast: replication strategies in unstructured peer to peer networks, in: Proceedings of 2002 Conference on Applications, Technologies, Architectures and Protocols for Computer Communications 2002, pp. 177–190.
[105] H. Weatherspoon, J. Kubiatowicz, Erasure coding vs replication: a quantitative comparison, in: Proceedings of the First Int'l Workshop on Peer to Peer Systems IPTPS'02, March 2002.
[106] D. Lomet, Replicated indexes for distributed data, in: Proceedings of the Fourth Int'l Conference on Parallel and Distributed Information Systems, December 18–20, 1996, pp. 108–119.
[107] V. Gopalakrishnan, B. Silaghi, B. Bhattacharjee, P. Keleher, Adaptive replication in peer-to-peer systems, in: Proceedings of the 24th Int'l Conference on Distributed Computing Systems ICDCS 2004, March 23–26, 2004.
[108] S.-D. Lin, Q. Lian, M. Chen, Z. Zhang, A practical distributed mutual exclusion protocol in dynamic peer-to-peer systems, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[109] A. Adya, R. Wattenhofer, W. Bolosky, M. Castro, G. Cermak, R. Chaiken, J. Douceur, J. Howell, J. Lorch, M. Theimer, Farsite: federated, available and reliable storage for an incompletely trusted environment, in: ACM SIGOPS Operating Systems Review, Special Issue on Decentralized Storage Systems, 2002, pp. 1–14.
[110] A. Rowstron, P. Druschel, Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility, in: Proceedings ACM SOSP'01, October 2001, pp. 188–201.
[111] S. Rhea, C. Wells, P. Eaton, D. Geels, B. Zhao, H. Weatherspoon, J. Kubiatowicz, Maintenance-free global data storage, IEEE Internet Computing 5 (5) (2001) 40–49.
[112] J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, W. Weimer, C. Wells, B. Zhao, Oceanstore: an architecture for global-scale persistent storage, in: Proceedings of the Ninth Int'l Conference on Architecture Support for Programming Languages and Operating Systems ASPLOS 2000, November 2000, pp. 190–201.
[113] K. Birman, in: The Surprising Power of Epidemic Communication, Lecture Notes in Computer Science, vol. 2584, Springer-Verlag, Heidelberg, 2003, pp. 97–102.
[114] P. Costa, M. Migliavacca, G.P. Picco, G. Cugola, Introducing reliability in content-based publish–subscribe through epidemic algorithms, in: Proceedings of the 2nd International Workshop on Distributed Event-based Systems, 2003, pp. 1–8.
[115] P. Costa, M. Migliavacca, G.P. Picco, G. Cugola, Epidemic algorithms for reliable content-based publish–subscribe: an evaluation, in: The 24th Int'l Conference on Distributed Computing Systems (ICDCS-2004), March 23–26, Tokyo University of Technology, Hachioji, Tokyo, Japan, 2004.
[116] A. Demers, D. Greene, C. Hauser, W. Irish, J. Larson, S. Shenker, H. Sturgis, D. Swinehart, D. Terry, Epidemic algorithms for replicated data management, in: Proceedings of the Sixth ACM Symposium on Principles of Distributed Computing 1987, pp. 1–12.
[117] P. Eugster, R. Guerraoui, A. Kermarrec, L. Massoulie, Epidemic information dissemination in distributed systems, IEEE Computer 37 (5) (2004) 60–67.
[118] W. Vogels, R.v. Renesse, K. Birman, The power of epidemics: robust communication for large-scale distributed systems, ACM SIGCOMM Computer Communication Review 33 (1) (2003) 131–135.
[119] S. Voulgaris, M. van Steen, An epidemic protocol for managing routing tables in very large peer to peer networks, in: Proceedings of the 14th IFIP/IEEE Workshop on Distributed Systems: Operations and Management, October 2003.
[120] I. Gupta, On the design of distributed protocols from differential equations, in: Proceedings of the 23rd Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing PODC 2004, July 25–28, 2004, pp. 216–225.
[121] I. Gupta, K.P. Birman, V. Renesse, Fighting Fire with Fire: Using randomized Gossip to combat stochastic scalability limits, Cornell University Dept of Computer Science Technical Report, March 2001.
[122] I. Gupta, Building scalable solutions to distributed computing problems using probabilistic components, Doctoral Dissertation, Cornell University, August 2003.
[123] A. Ganesh, A.-M. Kermarrec, L. Massoulie, Peer-to-peer membership management for gossip-based protocols, IEEE Transactions on Computers 52 (2) (2003) 139–149.
[124] N. Bailey, Epidemic Theory of Infectious Diseases and its Applications, second ed., Hafner Press, 1975.
[125] P. Eugster, R. Guerraoui, S. Handurukande, P. Kouznetsov, A.-M. Kermarrec, Lightweight probabilistic broadcast, ACM Transactions on Computer Systems 21 (4) (2003) 341–374.
[126] H. Weatherspoon, J. Kubiatowicz, Efficient heartbeats and repair of softstate in decentralized object location and routing systems, in: Proceedings of SIGOPS European Workshop, September 2002.
[127] G. Koloniari, E. Pitoura, Content-based routing of path queries in peer-to-peer systems, in: Proceedings of the 9th Int'l Conference on Extending DataBase Technology EDBT, March 14–18, 2004.
[128] A. Mohan, V. Kalogeraki, Speculative routing and update propagation: a kundali centric approach, in: IEEE Int'l Conference on Communications ICC'03, May 2002.
[129] G. Koloniari, Y. Petrakis, E. Pitoura, Content-based overlay networks for XML peers based on multi-level bloom filters, in: Proceedings of the First Int'l Workshop on Databases, Information Systems and Peer-to-Peer Computing DBISP2P, September 7–8, 2003, pp. 232–247.
[130] G. Koloniari, E. Pitoura, Bloom-based filters for hierarchical data, in: Proceedings of the 5th Workshop on Distributed Data and Structures (WDAS), 2003.
[131] B. Bloom, Space/time trade-offs in hash coding with allowable errors, Communications of the ACM 13 (7) (1970) 422–426.
[132] M. Naor, U. Wieder, A simple fault tolerant distributed hash table, in: Second Int'l Workshop on Peer-to-Peer Systems (IPTPS '03), Berkeley, CA, USA, February 20–21, 2003.
[133] P. Maymounkov, D. Mazieres, Rateless codes and big downloads, in: Second Int'l Workshop on Peer-to-Peer Systems, IPTPS'03, February 20–21, 2003.
[134] M. Krohn, M. Freedman, D. Mazieres, On-the-fly verification of rateless erasure codes for efficient content distribution, in: Proceedings of the IEEE Symposium on Security and Privacy, May 2004.
[135] J. Byers, J. Considine, M. Mitzenmacher, S. Rost, Informed content delivery across adaptive overlay networks, in: Proceedings of 2002 Conference on Applications, Technologies, Architectures and Protocols for Computer Communications, 2002, pp. 47–60.
[136] J. Plank, S. Atchley, Y. Ding, M. Beck, Algorithms for high performance, wide-area distributed file downloads, Parallel Processing Letters 13 (2) (2003) 207–223.
[137] M. Castro, R. Rodrigues, B. Liskov, BASE: using abstraction to improve fault tolerance, ACM Transactions on Computer Systems 21 (3) (2003) 236–269.
[138] R. Rodrigues, B. Liskov, L. Shrira, The design of a robust peer-to-peer system, in: 10th ACM SIGOPS European Workshop, September 2002.
[139] H. Weatherspoon, T. Moscovitz, J. Kubiatowicz, Introspective failure analysis: avoiding correlated failures in peer-to-peer systems, in: Proceedings of the Int'l Workshop on Reliable Peer-to-Peer Distributed Systems, October 2002.
[140] F. Dabek, R. Cox, F. Kaashoek, R. Morris, Vivaldi: a decentralized network coordinate system, SIGCOMM'04, August 30–September 3, 2004.
[141] E.-K. Lua, J. Crowcroft, M. Pias, Highways: proximity clustering for massively scaleable peer-to-peer network routing, in: Proceedings of the Fourth IEEE Int'l Conference on Peer-to-Peer Computing, August 25–27, 2004.
[142] F. Fessant, S. Handurukande, A.-M. Kermarrec, L. Massoulie, Clustering in peer-to-peer file sharing workloads, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[143] T.S.E. Ng, H. Zhang, Predicting Internet network distance with coordinates-based approaches, in: IEEE Infocom 2002, The 21st Annual Joint Conference of the IEEE Computer and Communication Societies, June 23–27, 2002.
[144] K. Hildrum, R. Krauthgamer, J. Kubiatowicz, Object location in realistic networks, in: Proceedings of the Sixteenth ACM Symposium on Parallel Algorithms and Architectures (SPAA 2004), June 2004, pp. 25–35.
[145] P. Keleher, S. Bhattacharjee, B. Silaghi, Are virtualized overlay networks too much of a good thing? in: First Int'l Workshop on Peer-to-Peer Systems IPTPS, March 2002.
[146] A. Mislove, P. Druschel, Providing administrative control and autonomy in structured peer-to-peer overlays, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, June 9–12, 2004.
[147] D. Karger, M. Ruhl, Diminished Chord: a protocol for heterogeneous subgroup formation in peer-to-peer networks, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[148] B. Awerbuch, C. Scheideler, Consistent, order-preserving data management in distributed storage systems, in: Proceedings of the Sixteenth ACM Symposium on Parallel Algorithms and Architectures SPAA 2004, June 27–30, 2004, pp. 44–53.
[149] M. Freedman, D. Mazieres, Sloppy hashing and self-organizing clusters, in: Proceedings of the 2nd Int'l Workshop on Peer-to-Peer Systems IPTPS'03, February 2003.
[150] F. Dabek, J. Li, E. Sit, J. Robertson, F. Kaashoek, R. Morris, Designing a DHT for low latency and high throughput, in: Proceedings of the First Symposium on Networked Systems Design and Implementation (NSDI'04), San Francisco, California, March 29–31, 2004, pp. 85–98.
[151] M. Ruhl, Efficient algorithms for new computational models, Doctoral Dissertation, September 2003.
[152] K. Sollins, Designing for scale and differentiation, in: Proceedings of ACM SIGCOMM Workshop on Future Directions in Network Architecture, August 25–27, 2003.
[153] L. Massoulie, A. Kermarrec, A. Ganesh, Network awareness and failure resilience in self-organizing overlay networks, in: Proceedings of 22nd Int'l Symposium on Reliable Distributed Systems, SRDS'03, October 6–8, 2003, pp. 47–55.
[154] R. Cox, F. Dabek, F. Kaashoek, J. Li, R. Morris, Practical, distributed network coordinates, ACM SIGCOMM Computer Communication Review 34 (1) (2004) 113–118.
[155] K. Hildrum, J. Kubiatowicz, S. Rao, B. Zhao, Distributed object location in a dynamic network, in: Proceedings of the 14th Annual ACM Symposium on Parallel Algorithms and Architectures, 2002, pp. 41–52.
[156] X. Zhang, Q. Zhang, G. Song, W. Zhu, A construction of locality-aware overlay network: mOverlay and its performance, IEEE Journal on Selected Areas in Communications 22 (1) (2004) 18–28.
[157] N. Harvey, M.B. Jones, M. Theimer, A. Wolman, Efficient recovery from organization disconnects in Skipnet, in: Second Int'l Workshop on Peer-to-Peer Systems IPTPS'03, February 20–21, 2003.
[158] M. Pias, J. Crowcroft, S. Wilbur, T. Harris, S. Bhatti, Lighthouses for scalable distributed location, in: Second Int'l Workshop on Peer-to-Peer Systems IPTPS'03, February 20–21, 2003.
[159] K. Gummadi, S. Saroiu, S. Gribble, D. King, Estimating latency between arbitrary Internet end hosts, in: Proceedings of SIGCOMM IMW 2002, November 2002.
[160] Y. Liu, X. Liu, L. Xiao, L. Ni, X. Zhang, Location-aware topology matching in P2P systems, in: Proceedings of IEEE Infocom, March 7–11, 2004.
[161] G.S. Manku, Balanced binary trees for ID management and load balance in distributed hash tables, in: Proceedings
of 23rd Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, PODC 2004, July 25–28, 2004, pp. 197–205.
[162] J. Gao, P. Steenkiste, Design and evaluation of a distributed scalable content delivery system, IEEE Journal on Selected Areas in Communications 22 (1) (2004) 54–66.
[163] X. Wang, Y. Zhang, X. Li, D. Loguinov, On zone-balancing of peer-to-peer networks: analysis of random node join, in: Proceedings of Joint International Conference on Measurement and Modeling of Computer Systems, June 2004.
[164] D. Karger, M. Ruhl, Simple efficient load balancing algorithms for peer-to-peer systems, in: Proceedings of Sixteenth ACM Symposium on Parallel Algorithms and Architectures SPAA 2004, June 27–30, 2004.
[165] D. Karger, M. Ruhl, Simple efficient load balancing algorithms for peer-to-peer systems, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[166] M. Adler, E. Halperin, R. Karp, V. Vazirani, A stochastic process on the hypercube with applications to peer-to-peer networks, in: Proceedings of the 35th ACM symposium on Theory of Computing 2003, pp. 575–584.
[167] C. Baquero, N. Lopes, Towards peer to peer content indexing, ACM SIGOPS Operating Systems Review 37 (4) (2003) 90–96.
[168] A. Rao, K. Lakshminarayanan, S. Surana, R. Karp, I. Stoica, Load balancing in structured P2P systems, in: Proceedings of 2nd Int'l Workshop on Peer-to-Peer Systems, IPTPS'03, February 20–21, 2003.
[169] J. Byers, J. Considine, M. Mitzenmacher, Simple load balancing for distributed hash tables, in: Second Int'l Workshop on Peer-to-Peer Systems IPTPS '03, February 20–21, 2003.
[170] P. Castro, J. Lee, A. Misra, CLASH: a protocol for Internet-scale utility-oriented distributed computing, in: Proceedings of 24th Int'l Conference on Distributed Computing Systems ICDCS 2004, March 23–26, 2004.
[171] A. Stavrou, D. Rubenstein, S. Sahu, A lightweight, robust P2P system to handle flash crowds, IEEE Journal on Selected Areas in Communications 22 (1) (2004) 6–17.
[172] A. Selcuk, E. Uzun, M.R. Pariente, A reputation-based trust management system for P2P networks, in: Fourth Int'l Workshop on Global and Peer-to-Peer Computing, April 20–21, 2004.
[173] T. Papaioannou, G. Stamoulis, Effective use of reputation in peer-to-peer environments, in: Fourth Int'l Workshop on Global and Peer-to-Peer Computing, April 20–21, 2004.
[174] M. Blaze, J. Feigenbaum, J. Lacy, Trust and Reputation in P2P Networks, 2003. Available from: <http://www.neurogrid.net/twiki/bin/view/Main/ReputationAndTrust>.
[175] E. Damiani, D.C. di Vimercati, S. Paraboschi, P. Samarati, F. Violante, A reputation-based approach for choosing reliable resources in peer to peer networks, in: Proceedings of the 9th Conference on Computer and Communications Security, 2002, pp. 207–216.
[176] S. Marti, P. Ganesan, H. Garcia-Molina, DHT routing using social links, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[177] G. Caronni, M. Waldvogel, Establishing trust in distributed storage providers, in: Proceedings of Third Int'l IEEE Conference on Peer-to-Peer Computing, September 1–3, 2003, pp. 128–133.
[178] B. Sieka, A. Kshemkalyani, M. Singhal, On the security of polling protocols in peer-to-peer systems, in: Proceedings of Fourth IEEE Int'l Conference on Peer-to-Peer Computing, August 25–27, 2004.
[179] M. Feldman, K. Lai, I. Stoica, J. Chuang, Robust incentive techniques for peer-to-peer networks, in: ACM E-Commerce Conference EC'04, May 2004.
[180] K. Anagnostakis, M. Greenwald, Exchange-based incentive mechanism for peer-to-peer file sharing, in: Proceedings of 24th Int'l Conference on Distributed Computing Systems ICDCS 2004, March 23–26, 2004.
[181] J. Schneidman, D. Parkes, Rationality and self-Interest in peer to peer networks, in: Second Int'l Workshop on Peer-to-Peer Systems IPTPS'03, February 20–21, 2003.
[182] C. Buragohain, D. Agrawal, S. Subhash, A game theoretic framework for incentives in P2P systems, in: Proceedings of Third Int'l IEEE Conference on Peer-to-Peer Computing, September 1–3, 2003, pp. 48–56.
[183] W. Josephson, E. Sirer, F. Schneider, Peer-to-peer authentication with a distributed single sign-on service, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[184] A. Fiat, J. Saia, Censorship resistant peer to peer content addressable networks, in: Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms, 2002, pp. 94–103.
[185] N. Daswani, H. Garcia-Molina, Query-flood DoS attacks in Gnutella, in: Proceedings of the 9th ACM Conference on Computer and Communications Security, 2002, pp. 181–192.
[186] A. Singh, L. Liu, TrustMe: anonymous management of trust relationships in decentralized P2P systems, in: Proceedings of Third Int'l IEEE Conference on Peer-to-Peer Computing, September 1–3, 2003.
[187] A. Serjantov, Anonymizing censorship resistant systems, in: Proceedings of Second Int'l Conference on Peer to Peer Computing, March 2002.
[188] S. Hazel, B. Wiley, Achord: a variant of the chord lookup service for use in censorship resistant peer-to-peer publishing systems, in: Proceedings of Second Int'l Conference on Peer to Peer Computing, March 2002.
[189] M. Freedman, R. Morris, Tarzan: a peer-to-peer anonymizing network layer, in: Proceedings of 9th ACM Conference on Computer and Communications Security, 2002, pp. 193–206.
[190] M. Feldman, C. Papadimitriou, J. Chuang, I. Stoica, Free-riding and whitewashing in peer-to-peer systems, in: 3rd Annual Workshop on Economics and Information Security WEIS'04, May 2004.
[191] L. Ramaswamy, L. Liu, FreeRiding: a new challenge for peer-to-peer file sharing systems, in: Proceedings of 2003 Hawaii Int'l Conference on System Sciences, P2P Track, HICSS2003, January 6–9, 2003.
[192] T.-W. Ngan, D. Wallach, P. Druschel, Enforcing fair sharing of peer-to-peer resources, in: Second Int'l Workshop on Peer-to-Peer Systems, IPTPS'03, February 20–21, 2003.
[193] L. Cox, B.D. Noble, Samsara: honor among thieves in peer-to-peer storage, in: Proceedings of Nineteenth ACM Symposium on Operating System Principles, 2003, pp. 120–132.
[194] M. Surridge, C. Upstill, Grid security: lessons for peer-to-peer systems, in: Proceedings of Third Int'l IEEE Conference on Peer-to-Peer Computing, September 1–3, 2003, pp. 2–6.
[195] E. Sit, R. Morris, Security considerations for peer-to-peer distributed hash tables, in: First Int'l Workshop on Peer-to-Peer Systems, March 2002.
[196] C. O'Donnel, V. Vaikuntanathan, Information leak in the Chord lookup protocol, in: Proceedings of Fourth IEEE Int'l Conference on Peer-to-Peer Computing, August 25–27, 2004.
[197] K. Berket, A. Essiari, A. Muratas, PKI-based security for peer-to-peer information sharing, in: Proceedings of Fourth IEEE Int'l Conference on Peer-to-Peer Computing, August 25–27, 2004.
[198] B. Karp, S. Ratnasamy, S. Rhea, S. Shenker, Spurring adoption of DHTs with OpenHash, a public DHT service, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[199] J. Considine, M. Walsh, D.G. Andersen, A pragmatic approach to DHT adoption, Technical Report, December 2003.
[200] G. Li, Peer to peer networks in action, IEEE Internet Computing 6 (1) (2002) 37–39.
[201] A. Mislove, A. Post, C. Reis, P. Willmann, P. Druschel, D. Wallach, X. Bonnaire, P. Sens, J.-M. Busca, L. Arantes-Bezerra, POST: a secure, resilient, cooperative messaging system, in: 9th Workshop on Hot Topics in Operating Systems, HotOS, May 2003.
[202] S. Saroiu, P. Gummadi, S. Gribble, A measurement study of peer-to-peer file sharing systems, in: Proceedings of Multimedia Computing and Networking 2002 MMCN'02, January 2002.
[203] A. Muthitacharoen, R. Morris, T. Gil, B. Chen, Ivy: a read/write peer-to-peer file system, in: ACM SIGOPS Operating Systems Review, Special issue on Decentralized Storage Systems, December 2002, pp. 31–44.
[204] A. Muthitacharoen, R. Morris, T. Gil, B. Chen, A read/write peer-to-peer file system, in: Proceedings of 5th Symposium on Operating System Design and Implementation (OSDI 2002), Boston, MA, December 2002.
[205] F. Annexstein, K. Berman, M. Jovanovic, K. Ponnavaikko, Indexing techniques for file sharing in scalable peer to peer networks, in: 11th IEEE Int'l Conference on Computer Communications and Networks, 2002, pp. 10–15.
[206] G. Kan, Y. Faybishenko, Introduction to Gnougat, in: First Int'l Conference on Peer-to-Peer Computing 2001, pp. 4–12.
[207] R. Gold, D. Tidhar, Towards a content-based aggregation network, in: Proceedings of First Int'l Conference on Peer to Peer Computing, 2001, pp. 62–68.
[208] F. Dabek, M.F. Kaashoek, D. Karger, R. Morris, I. Stoica, Wide-area cooperative storage with CFS, in: Proceedings of 18th ACM symposium on Operating System Principles 2001, pp. 202–215.
[209] M. Freedman, E. Freudenthal, D. Mazieres, Democratizing content publication with coral, in: Proceedings of First Symposium on Networked Systems Design and Implementation NSDI'04, March 29–31, 2004, pp. 239–252.
[210] J. Li, B.T. Loo, J. Hellerstein, F. Kaashoek, D. Karger, R. Morris, On the feasibility of peer-to-peer web indexing and search, in: Second Int'l Workshop on Peer-to-Peer Systems IPTPS '03, February 20–21, 2003.
[211] S. Iyer, A. Rowstron, P. Druschel, Squirrel: a decentralized peer-to-peer web cache, in: Proceedings of the 21st Annual Symposium on Principles of Distributed Computing, 2002, pp. 213–222.
[212] M. Bawa, R. Bayardo, S. Rajagopalan, E. Shekita, Make it fresh, make it quick: searching a network of personal webservers, in: Proceedings of the 12th International Conference on World Wide Web 2003, pp. 577–586.
[213] B.T. Loo, S. Krishnamurthy, O. Cooper, Distributed web crawling over DHTs, Technical Report, CSD-04-1305, February 9, 2004.
[214] M. Junginger, Y. Lee, A self-organizing publish/subscribe middleware for dynamic peer-to-peer networks, IEEE Network 18 (1) (2004) 38–43.
[215] F. Cuenca-Acuna, C. Peery, R. Martin, T. Nguyen, PlanetP: using gossiping to build content addressable peer-to-peer information sharing communities, in: Proceedings of the 12th International Symposium on High Performance Distributed Computing (HPDC), June 2002.
[216] M. Walsh, H. Balakrishnan, S. Shenker, Untangling the web from DNS, in: Proceedings of First Symposium on Networked Systems Design and Implementation NSDI'04, March 29–31, 2004, pp. 225–238.
[217] B. Awerbuch, C. Scheideler, Robust distributed name service, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[218] A. Iamnitchi, Resource Discovery in Large Resource-Sharing Environments, Doctoral Dissertation, 2003.
[219] R. Cox, A. Muthitacharoen, R. Morris, Serving DNS using a peer-to-peer lookup service, in: First Int'l Workshop on Peer-to-Peer Systems (IPTPS), March 2002.
[220] A. Chander, S. Dawson, P. Lincoln, D. Stringer-Calvert, NEVRLATE: scalable resource discovery, in: Second IEEE/ACM Int'l Symposium on Cluster Computing and the Grid CCGRID2002 2002, pp. 56–65.
[221] M. Balazinska, H. Balakrishnan, D. Karger, INS/Twine: A scalable peer-to-peer architecture for intentional resource discovery, in: Proceedings of First Int'l Conference on Pervasive Computing (IEEE), 2002.
[222] J. Kangasharju, K. Ross, D. Turner, Secure and resilient peer-to-peer E-mail: design and implementation, in: Proceedings of Third Int'l IEEE Conference on Peer-to-Peer Computing, September 1–3, 2003.
[223] V. Lo, D. Zappala, D. Zhou, Y. Liu, S. Zhao, Cluster computing on the fly: P2P scheduling of idle cycles in the Internet, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[224] A. Iamnitchi, I. Foster, D. Nurmi, A peer-to-peer approach to resource discovery in grid environments, in: IEEE High Performance Distributed Computing, 2002.
[225] I. Foster, A. Iamnitchi, On Death, Taxes and the convergence of peer-to-peer and grid computing, in: Second Int'l Workshop on Peer-to-Peer Systems IPTPS '03, February 20–21, 2003.
[226] W. Hoschek, Peer-to-Peer Grid Databases for Web Service Discovery, Concurrency: Practice and Experience, 2002, pp. 1–7.
[227] K. Aberer, A. Datta, M. Hauswirth, A decentralized public key infrastructure for customer-to-customer e-commerce, Int'l Journal of Business Process Integration and Management (2004).
[228] S. Ajmani, D. Clarke, C.-H. Moh, S. Richman, ConChord: cooperative SDSI certificate storage and name resolution, in: First Int'l Workshop on Peer-to-Peer Systems IPTPS, March 2002.
[229] J. Li, J. Stribling, T. Gil, R. Morris, F. Kaashoek, Comparing the performance of distributed hash tables under churn, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[230] S. Shenker, The data-centric revolution in networking, in Keynote Speech, in: 29th Int'l Conference on Very Large Data Bases, September 9–12, 2003.
[231] S. Gribble, A. Halevy, Z. Ives, M. Rodrig, D. Suciu, What can databases do for P2P? in: Proceedings of Fourth Int'l Workshop on Databases and the Web, WebDB2001, May 24–25, 2001.
[232] D. Clark, The design philosophy of the DARPA Internet protocols, in: ACM SIGCOMM Computer Communication Review, Symposium Proceedings on Communications Architectures and Protocols 18 (4), 1988.
[233] J.-C. Laprie, Dependable computing and fault tolerance: concepts and terminology, in: Twenty-Fifth Int'l Symposium on Fault-Tolerant Computing, Highlights from Twenty-Five Years, 1995, pp. 2–13.
[234] D. Clark, J. Wroclawski, K. Sollins, R. Braden, Tussle in cyberspace: defining tomorrow's Internet, in: Conference on Applications, Technologies, Architectures and Protocols for Computer Communications 2002, pp. 347–356.
[235] Clip2, The Gnutella Protocol Specification, 2000. Available from: <http://www.clip2.com>.
[236] Napster, 1999. Available from: <http://www.napster.com>.
[237] J. Mischke, B. Stiller, A methodology for the design of distributed search in P2P middleware, IEEE Network 18 (1) (2004) 30–37.
[238] J. Li, K. Sollins, Implementing aggregation and broadcast over distributed hash tables. Full report, November 2003. Available from: <http://krs.lcs.mit.edu/regions/docs.html>.
[239] M. Castro, M. Costa, A. Rowstron, Should we build Gnutella on a structured overlay? ACM SIGCOMM Computer Communication Review 34 (1) (2004) 131–136.
[240] A. Singla, C. Rohrs, Ultrapeers: Another Step Towards Gnutella Scalability, Version 1.0, November 26, 2002. Available from: <http://groups.yahoo.com/group/the_gdf/files/Proposals/Working%20Proposals/Ultrapeer/>.
[241] B. Cooper, H. Garcia-Molina, Ad hoc, self-supervising peer-to-peer search networks, Technical Report, 2003. Available from: <http://www.cc.gatech.edu/~cooperb/odin/>.
[242] R. Baeza-Yates, B. Ribeiro-Neto, Modern Information Retrieval, Addison Wesley, Essex, England, 1999.
[243] S. Sen, J. Wang, Analyzing peer-to-peer traffic across large networks, IEEE/ACM Transactions on Networking 12 (2) (2004) 219–232.
[247] A. Gupta, B. Liskov, R. Rodrigues, Efficient routing for peer-to-peer overlays, in: First Symposium on Networked Systems Design and Implementation NSDI, March 2004.
[248] A. Mizrak, Y. Cheng, V. Kumar, S. Savage, Structured superpeers: leveraging heterogeneity to provide constant-time lookup, in: IEEE Workshop on Internet Applications, June 23–24, 2003.
[249] L. Adamic, R. Lukose, A. Puniyani, B. Huberman, Search in power-law networks, Physical Review E, The American Physical Society 64 (046135) (2001).
[250] F. Banaei-Kashani, C. Shahabi, Criticality-based analysis and design of unstructured peer-to-peer networks as complex systems, in: Proceedings of the 3rd IEEE/ACM Int'l Symposium on Cluster Computing and the Grid, 2003, pp. 351–358.
[251] KaZaa, KaZaa Media Desktop, 2001. Available from: <www.kazaa.com>.
[252] S. Sen, J. Wang, Analyzing peer-to-peer traffic across large networks, in: Proceedings of the Second ACM SIGCOMM workshop on Internet measurement, November 06–08, 2002, pp. 137–150.
[253] DirectConnect, 2001. Available from: <http://www.neo-modus.com>.
[254] S. Saroiu, K. Gummadi, R. Dunn, S. Gribble, H. Levy, An analysis of Internet content delivery systems, ACM SIGOPS Operating Systems Review 36 (2002) 315–327.
[255] A. Loo, The future of peer-to-peer computing, Communications of the ACM 46 (9) (2003) 56–61.
[256] B. Yang, H. Garcia-Molina, Comparing hybrid peer-to-peer systems (extended), in: 27th Int'l Conference on Very Large Data Bases, September 11–14, 2001.
[257] D. Scholl, OpenNap Home Page, 2001. Available from: <http://opennap.sourceforge.net/>.
[258] S. Ghemawat, H. Gobioff, S.-T. Leung, The Google file system, in: Proceedings of 19th ACM Symposium on Operating Systems Principles, 2003, pp. 29–43.
[259] I. Clarke, S. Miller, T. Hong, O. Sandberg, B. Wiley, Protecting free expression online with freenet, IEEE Internet Computing 6 (1) (2002).
[260] J. Mache, M. Gilbert, J. Guchereau, J. Lesh, F. Ramli, M. Wilkinson, Request algorithms in Freenet-style peer-to-peer systems, in: Proceedings of the Second IEEE Int'l Conference on Peer to Peer Computing P2P'02, September 5–7, 2002.
[261] C. Rohrs, Query Routing for the Gnutella Networks, Version 1.0, 2002. Available from: <http://www.limewire.com/developer/query_routing/keyword%20routing.htm>.
[262] I. Clarke, Freenet's Next Generation Routing Protocol, 20th July 2003. Available from: <http://freenetproject.org/index.php?page=ngrouting>.
[263] A.Z. Kronfol, FASD: A fault-tolerant, adaptive scalable distributed search engine, Master's Thesis, 2002. Available from: <http://www.cs.princeton.edu/~akronfol/fasd/>.
[244] H. Balakrishnan, S. Shenker, M. Walsh, Semantic-free referencing in linked distributed systems, in: Second Int'l
Workshop on Peer-to-Peer Systems IPTPS 03, February [264] S. Gribble, E. Brewer, J.M. Hellerstein, D. Culler, Scalable,
2021, 2003. distributed data structures for Internet service construction,
[245] B. Yang, P. Vinograd, H. Garcia-Molina, Evaluating in: Proceedings of the 4th Symposium on Operating Systems
GUESS and non-forwarding peer-to-peer search, in: The Design and Implementation OSDI 2000, October 2000.
24th Intl Conference on Distributed Computing Systems [265] K. Aberer, Ecient search in unbalanced, randomized
ICDCS04, March 2326, 2004. peer-to-peer search trees, EPFL Technical Report IC/2002/
[246] A. Gupta, B. Liskov, R. Rodrigues, One hop lookups for 79, 2002.
peer-to-peer overlays, in: 9th Workshop on Hot Topics in [266] R. Honicky, E. Miller, A fast algorithm for online
Operating Systems (HotOS), May 1821, 2003. placement and reorganization of replicated data, in: Pro-
ceedings of the 17th Intl Parallel and Distributed Process- [283] J. Risson, K. Robinson, T. Moors, Fault tolerant active
ing Symposium, April 2003. rings for structured peer-to-peer overlays, in: Proceedings
[267] G.S. Manku, Routing networks for distributed hash tables, of the 30th Annual IEEE Conference on Local Computer
in: Proceedings of the 22nd Annual ACM Symposium on Networks, November 1517, 2005, pp. 1825.
Principles of Distributed Computing, PODC 2003, July 13 [284] B. Awerbuch, C. Scheideler, Peer-to-peer systems for
16, 2003, pp. 133142. prex search, in: Proceedings of 22nd annual ACM
[268] S. Lei, A. Grama, Extended consistent hashing: a frame- Symposium on Principles of Distributed Computing 2003,
work for distributed servers, in: Proceedings of the 24th pp. 123132.
Intl Conference on Distributed Computing Systems [285] F. Dabek, B. Zhao, P. Druschel, J. Kubiatowicz, I. Stoica,
ICDCS 2004, March 2326, 2004. Towards a common API for structured P2P overlays, in:
[269] W. Litwin, Re: Chord & LH*, Email to Ion Stoica, March Proceedings of Second Intl Workshop on Peer to Peer
23, 2004. Systems IPTPS 2003, February 2003.
[270] J. Li, J. Stribling, R. Morris, F. Kaashoek, T. Gil, A [286] N. Feamster, H. Balakrishnan, Towards a logic for wide-
performance vs. cost framework for evaluating DHT design area Internet routing, in: Proceedings of ACM SIGCOMM
tradeos under churn, in: Proceedings of IEEE Infocom, Workshop on Future Directions in Network Architecture,
March 1317, 2005. August 2527, 2003, pp. 289300.
[271] S. Zhuang, D. Geels, I. Stoica, R. Katz, On failure [287] B. Ahlgren, M. Brunner, L. Eggert, R. Hancock, S. Schmid,
detection algorithms in overlay networks, in: Proceedings Invariants: a new design methodology for network archi-
of IEEE Infocomm, March 1317, 2005. tectures, in: Proceedings of ACM SIGCOMM Workshop
[272] X. Li, J. Misra, C.G. Plaxton, Active and concurrent on Future Direction in Network Architecture, August 30,
topology maintenance, in: The 18th Annual Conference on 2004, pp. 6570.
Distributed Computing (DISC 2004), Trippenhuis, Amster- [288] R. Mahajan, M. Castro, A. Rowstron, Controlling the cost
dam, The Netherlands, October 47, 2004. of reliability in peer-to-peer overlays, in: Second Intl
[273] K. Aberer, L.O. Alima, A. Ghodsi, S. Girdzijauskas, M. Workshop on Peer-to-Peer Systems IPTPS03, February
Hauswirth, S. Haridi, The essence of P2P: a reference 2021, 2003.
architecture for overlay networks, in: Proceedings of the 5th [289] S. Rhea, D. Geels, T. Roscoe, J. Kubiatowicz, Handling
International Conference on Peer-to-Peer Computing, churn in a DHT, Report No. UCB/CSD-03-1299, Univer-
August 31September 2, 2005. sity of California, in: Proceedings of USENIX Annual
[274] C. Tang, M. Buco, R. Chang, S. Dwarkadas, L. Luan, E. Technical Conference, June 2003.
So, C. Ward, Low trac overlay networks with large [290] M. Castro, M. Costa, A. Rowstron, Performance and
routing tables, in: Proceedings of the 2005 ACM Sigmet- dependability of structured peer-to-peer overlays, Micro-
rics International Conference on Measurement and soft Research Technical Report MSR-TR-2003-94, Decem-
Modeling of Computer Systems, June 610, 2005, pp. 14 ber. Also 2004 Intl Conference on Dependable Systems and
25. Networks, June 28July 1, 2003.
[275] S. Rhea, D. Geels, T. Roscoe, J. Kubiatowicz, Handling [291] D. Liben-Nowell, H. Balakrishnan, D. Karger, Analysis of
churn in a DHT, in: Proceedings of the USENIX Annual the evolution of peer-to-peer systems, in: Annual ACM
Technical Conference, June 2004. Symposium on Principles of Distributed Computing, 2002,
[276] C. Blake, R. Rodrigues, High Availability, Scalable storage, pp. 233242.
dynamic peer networks: pick two, in: 9th Workshop on Hot [292] L. Alima, S. El-Ansary, P. Brand, S. Haridi, DKS(N,k,f): a
Topics in Operating Systems (HotOS), Lihue, Hawaii, May family of low communication, scalable and fault-tolerant
1821, 2003. infrastructures for P2P applications, in: Proceedings of 3rd
[277] S. Rhea, B. Godfrey, B. Karp, J. Kubiatowicz, S. Ratnas- IEEE/ACM Intl Symposium on Cluster Computing and
amy, S. Shenker, I. Stoica, H. Yu, OpenDHT: a public the Grid, 2003, pp. 344350.
DHT service and its uses, in: Proceedings of the Conference [293] D. Karger, M. Ruhl, Finding nearest neighbours in growth-
on Applications, Technologies, Architectures and Protocols restricted metrics, in: Proceedings of the 34th Annual ACM
for Computer Communications, August 2226, 2005, pp. Symposium on Theory of Computing, 2002, pp. 741750.
7384. [294] S. Ratnasamy, A scalable content-addressable network,
[278] T. Gil, F. Kaashoek, J. Li, R. Morris, J. Stribling, p2psim, a Doctoral Dissertation 2002.
simulator for peer-to-peer protocols, 2003. Available from: [295] S. McCanne, S. Floyd, The LBNL/UCB Network
<http://www.pdos.lcs.mit.edu/p2psim/>. Simulator.
[279] K. Hildrum, J.D. Kubiatowicz, S. Rao, B.Y. Zhao, [296] I. Abraham, D. Malkhi, O. Dubzinski, LAND: Stretch
Distributed object location in a dynamic network, Theory (1 + epsilon) locality aware networks for DHTs, in:
of Computing Systems, 2004. Proceedings of ACMSIAM Symposium on Discrete
[280] N. Lynch, D. Malkhi, D. Ratajczak, Atomic data access in Algorithms SODA-04, 2004.
distributed hash tables, in: Proceedings of Intl Peer-to-Peer [297] M. Naor, U. Wieder, Novel architectures for P2P applica-
Symposium, March 78, 2002. tions: the continuous-discrete approach, in: Proceedings of
[281] S. Gilbert, N. Lynch, A. Shvartsman, RAMBO II: Rapidly Fifteenth Annual ACM Symposium on Parallel Algorithms
Recongurable Atomic Memory for Dynamic Networks, and Architectures, SPAA 2003, June 79, 2003, pp. 50
Technical Report, MIT-CSAIL-TR-890 2004. 59.
[282] N. Lynch, I. Stoica, MultiChord: a resilient namespace [298] N.D. de Bruijn, A combinatorial problem, Koninklijke
management algorithm, Technical Memo MIT-LCS-TR- Netherlands: Academe Van Wetenschappen 49 (1946) 758
936, 2004. 764.
[299] J.-W. Mao, The coloring and routing problems on de Bruijn interconnection networks, Doctoral Dissertation, July 18, 2003.
[300] M.L. Schlumberger, de Bruijn communication networks, Doctoral Dissertation 1974.
[301] M. Imase, M. Itoh, Design to minimize diameter on building-block network, IEEE Transactions on Computers C-30 (6) (1981) 439–442.
[302] S.M. Reddy, D.K. Pradhan, J.G. Kuhl, Direct graphs with minimal and maximal connectivity, Technical Report, School of Engineering, Oakland University, 1980.
[303] R.A. Rowley, B. Bose, Fault-tolerant ring embedding in de Bruijn networks, IEEE Transactions on Computers 42 (12) (1993) 1480–1486.
[304] K.Y. Lee, G. Liu, H.F. Jordan, Hierarchical networks for optical communications, Journal of Parallel and Distributed Computing 60 (2000) 1–16.
[305] M. Naor, U. Wieder, Know thy neighbor's neighbor: better routing for skip-graphs and small worlds, in: The 3rd Int'l Workshop on Peer-to-Peer Systems, February 26–27, 2004.
[306] P. Fraigniaud, P. Gauron, The content-addressable networks D2B, Technical Report 1349, Laboratoire de Recherche en Informatique, January 2003.
[307] A. Datta, S. Girdzijauskas, K. Aberer, On de Bruijn routing in distributed hash tables: there and back again, in: Proceedings of the Fourth IEEE Int'l Conference on Peer-to-Peer Computing, August 25–27, 2004.
[308] W. Pugh, Skip lists: a probabilistic alternative to balanced trees, in: Proceedings of Workshop on Algorithms and Data Structures, August 17–19, 1989, pp. 437–449.
[309] W. Pugh, Skip lists: a probabilistic alternative to balanced trees, Communications of the ACM 33 (6) (1990) 668–676.
[310] J. Gray, The transaction concept: virtues and limitations, in: Proceedings of VLDB, September 1981.
[311] B.T. Loo, J.M. Hellerstein, R. Huebsch, S. Shenker, I. Stoica, Enhancing P2P file-sharing with Internet-scale query processor, in: Proceedings of the 30th Int'l Conference on Very Large Data Bases VLDB 2004, 29 August–3 September, 2004.
[312] M. Stonebraker, P. Aoki, W. Litwin, A. Pfeffer, A. Sah, J. Sidell, C. Staelin, A. Yu, Mariposa: a wide-area distributed database system, The VLDB Journal: The Int'l Journal of Very Large Data Bases (5) (1996) 48–63.
[313] V. Cholvi, P. Felber, E. Biersack, Efficient search in unstructured peer-to-peer networks, in: Proceedings of Symposium on Parallel Algorithms and Architectures, July 2004.
[314] S. Daswani, A. Fisk, Gnutella UDP extension for scalable searches (GUESS) v0.1, 2002. Available from: <http://www.limewire.org/fisheye/viewrep/~raw,r=1.2/limecvs/core/guess_01.html>.
[315] A. Fisk, Gnutella Dynamic Query Protocol v0.1, Gnutella Developer Forum, 2003.
[316] O. Gnawali, A keyword set search system for peer-to-peer networks, Master's thesis, 2002.
[317] Limewire, Limewire Host Count, 2004. Available from: <http://www.limewire.com/english/content/netsize.shtml>.
[318] A. Fisk, Gnutella Ultrapeer Query Routing, v0.1, 2003. Available from: <http://groups.yahoo.com/group/the_gdf/files/Proposals/Working%20Proposals/search/Ultrapeer%20QRP/>.
[319] A. Fisk, Gnutella Dynamic Query Protocol, v0.1, 2003. Available from: <http://groups.yahoo.com/group/the_gdf/files/Proposals/Working%20Proposals/search/Dynamic%20Querying/>.
[320] S. Thadani, Meta Data searches on the Gnutella Network (addendum), 2001. Available from: <http://www.limewire.com/developer/MetaProposal2.htm>.
[321] S. Thadani, Meta Information Searches on the Gnutella Networks, 2001. Available from: <http://www.limewire.com/developer/metainfo_searches.html>.
[322] P. Reynolds, A. Vahdat, Efficient peer-to-peer keyword searching, in: ACM/IFIP/USENIX Int'l Middleware Conference, Middleware 2003, June 16–20, 2003.
[323] W. Terpstra, S. Behnel, L. Fiege, J. Kangasharju, A. Buchmann, Bit Zipper Rendezvous, optimal data placement for general P2P queries, in: Proceedings of the First Int'l Workshop on Peer-to-Peer Computing and Databases, March 14, 2004.
[324] A. Singhal, Modern information retrieval: a brief overview, IEEE Data Engineering Bulletin 24 (4) (2001) 35–43.
[325] E. Cohen, A. Fiat, H. Kaplan, Associative search in peer to peer networks: harnessing latent semantics, in: IEEE Infocom 2003, The 22nd Annual Joint Conference of the IEEE Computer and Communications Societies, March 30–April 3, 2003.
[326] W. Muller, A. Henrich, Fast retrieval of high-dimensional feature vectors in P2P networks using compact peer data summaries, in: Proceedings of 5th ACM SIGMM International Workshop on Multimedia Information Retrieval, November 7, 2003, pp. 79–86.
[327] M.T. Ozsu, P. Valduriez, Principles of Distributed Database Systems, second ed., Prentice-Hall, 1999.
[328] G. Salton, A. Wong, C.S. Yang, A vector space model for automatic indexing, Communications of the ACM 18 (11) (1975) 613–620.
[329] S.E. Robertson, S. Walker, M. Beaulieu, Okapi at TREC-7: automatic ad hoc, filtering, VLC and filtering tracks, in: Proceedings of Seventh Text REtrieval Conference, TREC-7, NIST Special Publication 500-242, July 1999, pp. 253–264.
[330] A. Singhal, J. Choi, D. Hindle, D. Lewis, F. Pereira, AT&T at TREC-7, in: Proceedings of Seventh Text REtrieval Conference TREC-7, July 1999, pp. 253–264.
[331] K. Sankaralingam, S. Sethumadhavan, J. Browne, Distributed Pagerank for P2P Systems, in: Proceedings of the 12th international symposium on High Performance Distributed Computing HPDC, June 22–24, 2003.
[332] I. Klampanos, J. Jose, An architecture for information retrieval over semi-collaborated peer-to-peer networks, in: Proceedings of 2004 ACM symposium on applied computing 2004, pp. 1078–1083.
[333] C. Tang, Z. Xu, S. Dwarkadas, Peer-to-peer information retrieval using self-organizing semantic overlay networks, in: Proceedings of 2003 Conference on Applications, Technologies, Architectures and Protocols for Computer Communications, August 25–29, 2003, pp. 175–186.
[334] C. Tang, S. Dwarkadas, Hybrid global–local indexing for efficient peer-to-peer information retrieval, in: Proceedings of the First Symposium on Networked Systems Design and Implementation NSDI'04, March 29–31, 2004, pp. 211–224.
[335] G.W. Furnas, S. Deerwester, S.T. Dumais, T.K. Landauer, R.A. Harshman, L.A. Streeter, K.E. Lochbaum, Information retrieval using a singular value decomposition model of latent semantic structure, in: Proceedings of 11th Annual Int'l ACM SIGIR Conference on Research and Development in Information Retrieval, 1988, pp. 465–480.
[336] C. Tang, S. Dwarkadas, Z. Xu, On scaling latent semantic indexing for large peer-to-peer systems, in: The 27th Annual Int'l ACM SIGIR Conference on SIGIR'04, ACM Special Interest Group on Information Retrieval, July 2004.
[337] W. Litwin, S. Sahri, Implementing SD-SQL Server: a scalable distributed database system, CERIA Research Report 2004-04-02, April 2004.
[338] M. Jarke, J. Koch, Query optimization in database systems, ACM Computing Surveys 16 (2) (1984) 111–152.
[339] G.S. Manku, M. Bawa, P. Raghavan, Symphony: distributed hashing in a small world, in: Proceedings of the 4th USENIX Symposium on Internet Technologies and Systems, March 26–28, 2003.
[340] J.L. Bentley, Multidimensional binary search trees used for associative searching, Communications of the ACM 18 (9) (1975) 509–517.
[341] B. Chun, I. Stoica, J. Hellerstein, R. Huebsch, S. Jeffery, B.T. Loo, S. Mardanbeigi, T. Roscoe, S. Rhea, S. Shenker, Querying at Internet Scale, in: Proceedings of 2004 ACM SIGMOD International Conference on Management of Data, Demonstration Session 2004, pp. 935–936.
[342] P. Cao, Z. Wang, Efficient top-K query calculation in distributed networks, in: Proceedings of the 23rd Annual ACM SIGACT–SIGOPS Symposium on Principles of Distributed Computing PODC 2004, July 25–28, 2004, pp. 206–215.
[343] D. Psaltoulis, I. Kostoulas, I. Gupta, K. Birman, A. Demers, Practical algorithms for size estimation in large and dynamic groups, in: Proceedings of the Twenty-Third Annual ACM SIGACT–SIGOPS Symposium on Principles of Distributed Computing, PODC 2004, July 25–28, 2004.
[344] R. van Renesse, The importance of aggregation, in: A. Schiper, A.A. Shvartsman, H. Weatherspoon, B.Y. Zhao (Eds.), Future Directions in Distributed Computing, Springer-Verlag Lecture Notes in Computer Science, vol. 2584, Springer-Verlag, Heidelberg, 2003.

John Risson has a decade of engineering experience with a leading telecommunications provider in Australia. He led the implementation team responsible for the introduction of national high-speed corporate data services. He also has two years of business development experience amongst Internet service providers in north-east Asia with responsibilities for IP, MPLS and optical Ethernet products. He received his BEng (1st Class Hons) from the University of Queensland, Australia and the MEng from Monash University, Australia. He is a Ph.D. candidate at the University of New South Wales, Australia. He is also a member of the ACM, the IEEE and the Australian Computer Society.

Tim Moors is a Senior Lecturer in the School of Electrical Engineering and Telecommunications at the University of New South Wales, in Sydney, Australia. He researches network reliability, transport protocols, and wireless LAN MAC protocols. Previously, he was with the Center for Advanced Technology in Telecommunications at Polytechnic University in New York, and prior to that, with the Communications Division of the Australian Defence Science and Technology Organisation. He received his Ph.D. and BEng (Hons) degrees from universities in Western Australia (Curtin and UWA).
