
Peer-to-Peer Networking:

An Overview

Raouf Boutaba
Department of Computer Science
University of Waterloo
rboutaba@bbcr.uwaterloo.ca

1
Evolution of Distributed Computing Models
‰ Client-Server

[Figure: two clients send invocation messages to servers and receive result messages; the key distinguishes processes from computers]

 Client: Process wishing to access data, use resources or
perform operations on a different computer
 Server: Process managing data and all other shared resources
amongst servers and clients; allows clients to access resources
and performs computations
 Interaction: invocation / result message pairs
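The invocation/result pairing above can be sketched as follows. This is an illustrative in-process stand-in: the class and method names are invented, and a real deployment would use a network transport such as sockets or RPC.

```python
# Minimal sketch of the client-server interaction pattern described above.
# KVServer and Client are invented names; the "connection" is a direct call.

class KVServer:
    """Server: manages shared data and performs operations for clients."""
    def __init__(self):
        self._store = {}

    def invoke(self, operation, key, value=None):
        # Each invocation message yields exactly one result message.
        if operation == "put":
            self._store[key] = value
            return "ok"
        if operation == "get":
            return self._store.get(key)
        return "error: unknown operation"

class Client:
    """Client: accesses data managed by a (possibly remote) server."""
    def __init__(self, server):
        self._server = server  # stands in for a network connection

    def request(self, operation, key, value=None):
        return self._server.invoke(operation, key, value)  # invocation -> result

server = KVServer()
client = Client(server)
client.request("put", "x", 42)
result = client.request("get", "x")
```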
2
Variants of Client-Server Model
‰ A service provided by multiple servers
[Figure: a single service provided by three servers, accessed by two clients]
 Examples: Many commercial web services are implemented
across multiple physical servers
 Motivation
Î Performance (e.g., cnn.com, download servers, etc.)
Î Reliability
3
Variants of Client-Server Model (2)
‰ Web Proxy Server

Client Server

Proxy
Server

Client Server

 Renders replication/distribution transparent
 Caching
Î The proxy server maintains a cache of recently
requested resources
4
Variants of Client-Server Model (3)
‰ Mobile Code
 Code that is sent to a client process to carry out a specific
task
 Examples
Î Applets
Î Active Packets (containing communications protocol code)
 a) Client request results in downloading of applet code
[Figure: the client downloads applet code from a web server]

 b) Client interacts with the applet
[Figure: the client then interacts with the downloaded applet, which may contact the web server]
5
Variants of Client-Server Model (4)
‰ Mobile Agents
 Executing program (code + data), migrating amongst processes,
carrying out an autonomous task, usually on behalf of some
other process
 Advantages: flexibility, savings in communication costs
 Examples: virtual markets, worm programs

‰ Network Computers
 Downloads its operating system and any application software
needed by the user from a remote file server
 Applications run locally, but the files are managed by the
remote file server
 Users can migrate from one network computer to another, since
all application code and data are stored by the file server

6
Variants of Client-Server Model (5)
‰ Thin Clients and Compute Servers
 Executes a windows-based user interface on the local computer
while the application executes on a compute server

 Example: X11 server (runs on the application client side)

 In practice: Palm Pilots, mobile phones

[Figure: a thin client on a network computer or PC talks over the network to an application running on a compute server]
7
Variants of Client-Server Model (6)
‰ Spontaneous Networking
[Figure: a hotel wireless network with a gateway to the Internet and a discovery service; attached devices include a music service, alarm service, camera, TV/PC, and the guest's laptop and PDA]
 Discovery services
Î Services available in the network
Î Their properties, and how to access them (including
device-specific driver information)
8
The Peer-to-Peer Model
‰ Applications based on peer processes
 Not Client-Server
 processes that have largely identical functionality

[Figure: three peer processes, each running application and coordination code, connected to one another]
9
The Peer-to-Peer Mania
€ Started in the middle of 2000
€ When the Internet had fallen into predictable patterns
€ Shocks to the computer field:
Î Napster, SETI@home, Freenet, Gnutella, Jabber, …

Î many early P2P projects have an overtly political mission

€ Emergence of sporadically connected Internet nodes (laptops,


handhelds, cell phones, appliances, …)

These developments return content, choice, and control to ordinary


users. Tiny endpoints on the Internet, sometimes without even
knowing each other, exchange information and form communities.
10
The Peer-to-Peer Mania (Cont)
€ A new energy erupting in the computing field
€ Yet, P2P is the oldest architecture in the world of
communications
Î Telephones are peer-to-peer
Î Usenet, implemented over UUCP
Î Routing in the Internet
Î Internet endpoints have historically been peers

P2P technologies return the Internet to its original


vision, in which everyone creates as well as consumes
11
The Peer-to-Peer Mania (Cont)
€ An Internet evolution: loosening the virtual from the
physical
Î DNS decoupled names from physical locations
Î URNs allow documents to be retrieved without knowing domain
names
Î Virtual hosting and replicated servers changed the one-to-one
relationship of names to systems

The next major conceptual leap: let go of the notion


of location
12
Outline
€ Definition
€ Overlay networks
€ Goals
€ P2P Applications
€ Classification of P2P systems
€ Design requirements
€ Case Studies
€ Putting it all together
13
Definitions
€ Everything except the
client/server model
€ Network of nodes with
equivalent
capabilities/responsibilities
(symmetrical)
€ Nodes act as both
servers and clients,
hence called “Servents”

14
Definitions (Cont)
€ A transient network that allows a group of computer
users to connect with each other and collaborate by
sharing resources (CPU, storage, content).

€ The connected peers construct a virtual overlay


network on top of the underlying network
infrastructure

€ Examples of overlays:
Î BGP routers and their peering relationships
Î Content distribution networks (CDNs)
Î Application-level multicast
Î And P2P apps !

15
Overlay Networks
€ An overlay network is a set of
logical connections between
end hosts

€ Overlay networks can be


unstructured or structured

€ Proximity not necessarily


taken into account

€ Overlay maintenance is an
issue
[Figure: overlay edges drawn between end hosts across the underlying network]
16
Overlays: All in the application layer
€ Design flexibility
Î Topology
Î Protocol
Î Messaging over TCP, UDP, ICMP

€ Underlying physical net is transparent to the developer

[Figure: the overlay lives entirely in the application layer, above the transport, network, data link, and physical layers of each host]
17
Goals
€ Cost reduction through cost sharing
Î Client/Server: Server bears most of the cost
Î P2P: Cost spread over all the peers (+Napster, ++Gnutella,…)
Î aggregation of otherwise unused resources (e.g., seti@home)

€ Improved scalability/reliability
Î resource discovery and search (e.g., Chord, CAN, …)

€ Interoperability
Î for the aggregation of diverse resources (storage, CPU, …)

€ Increased autonomy
Î independence from servers, hence from providers (e.g., a way
around censorship, licensing restrictions, etc.)

18
Goals (Cont)
€ Anonymity/privacy
Î Difficult to ensure with a central server
Î Required by users who do not want a server/provider to know
their involvement in the system
ÎFreenet is a prime example
€ Dynamism
Î Resources (e.g., compute nodes) enter and leave the system
continuously
Î Mechanisms are required to avoid polling (e.g., “buddy lists” in
Instant messaging)

€ Ad hoc communications
ÎP2P systems typically do not rely on an established
infrastructure
Îthey build their own, e.g. logical overlay in CAN
19
P2P Application areas
€ File Sharing
€ Communication
€ Collaboration
€ Computation
€ Databases
€ Others

20
P2P File Sharing
€ File exchange: Killer application!
€ (+) Potentially unlimited file exchange areas
€ (+) Highly available, safe storage: duplication and
redundancy
€ (+) Anonymity: preserves the anonymity of authors and
publishers
€ (+) Manageability
€ (-) Network bandwidth consumption
€ (-) Security
€ (-) Search capabilities
21
P2P File Sharing (cont)
‰ Examples of P2P file sharing applications:
€Napster
Î disruptive; proof of concept
€Gnutella
Î open source
€KaZaA
Î today more KaZaA traffic than Web traffic!
€eDonkey
Î becoming popular in Europe
Î appears to use a DHT
€and many others…

‰ How do you explain this success?


22
P2P Communication
€ Instant Messaging (IM)
ÎUser A runs an IM client on her PC
ÎIntermittently connects to the Internet; gets a new IP address for
each connection
ÎRegisters herself with the “system”
ÎLearns from the “system” that user B in her “buddy list” is active
ÎUser A initiates a direct TCP connection with User B: P2P
ÎUser A and User B chat.
ÎCan also be voice, video and text.

€ Audio-Video Conferencing
Î Example: Voice-over-IP (Skype)

23
P2P Collaboration
€ Application-level user collaboration
€ Shared Applications
Î Example: shared file editing (e.g., distributed PowerPoint)

€ Online games
Î Multi-players, distributed
Î Example: Descent (www.planetdescent.com)

€ Technical challenges
Î Locating peers
Î Fault tolerance
Î Real-time constraints

€ Example:
Î Groove
24
P2P Computation
€ Achieves processing scalability by aggregating the
resources of a large number of individual PCs.
€ Application areas
ÎFinancial applications
ÎBiotechnology
ÎAstronomy,…

€ Representative projects
Îseti@home
ÎAvaki
ÎEntropia
ÎGridella

25
P2P Databases
€ Fragments a large database over physically distributed
nodes
€ Overcomes limitations of distributed DBMS
ÎStatic topology
ÎHeavy administration work

€ Dissemination of data sources over the Internet


ÎEach peer is a node with a database
ÎSet of peers changes often (site availability, usage patterns)

€ No global schema
€ Examples:
ÎAmbientDB
ÎXpeer : self-organizing XML DB
26
Other Applications
‰ P2P Applications built over emerging overlays
€ PlanetLab: to conduct Internet-scale experiments

‰ DHTs and their applications


€ for data storage and retrieval
€ directed storage/searches:
Îassign particular nodes to hold particular content (or a
pointer to it)
ÎIntroduce a hash function to map the object being
searched for to a unique identifier
Îwhen a node wants that content, go to the node that is
supposed to have or know about it
€ Examples: Chord, CAN, Pastry, Tapestry
27
P2P Classification
‰ Degree of P2P decentralization
€ Hybrid decentralized P2P
€ Purely decentralized P2P
€ Partially centralized P2P

‰ Degree of P2P structure


€ Structured P2P
€ Loosely structured P2P
€ Unstructured P2P
28
Hybrid decentralized P2P
€ Central server facilitates the interaction between peers.
€ Central server performs the lookups and identifies the
nodes of the network.
€ example: Napster
€(-) Single point of failure, scalability?, …

[Figure: peers query the central index server; data is exchanged directly between peers]
29
Purely decentralized P2P
€ network nodes perform the same tasks (Servents)
€ no central coordination activity
€ examples: original Gnutella, Freenet
€ (-) Data consistency? Manageability? Security?
Communication overhead

30
Partially centralized P2P
€ some of the nodes assume a more important role
€ Supernodes act as local central indexes
€ examples: Kazaa, recent Gnutella

[Figure: ordinary nodes report their lists of files to supernodes, which index them]
31
Unstructured P2P
€data is distributed randomly over the peers and
broadcasting mechanisms are used for searching.
€placement of data is unrelated to the overlay topology.
€examples: Napster, Gnutella, KaZaa

[Figure: purely decentralized search ("Where is music A?" is flooded from peer to peer until a peer answers "I have it", then the file is downloaded directly) vs. hybrid decentralized search (peers report their file lists to the index server, which answers "music A is…"; the download is still direct)]


32
Structured P2P
€Network topology is tightly controlled and files are
placed at precisely specified locations.
€Provide a mapping between the file identifier and
location
€Examples: Chord, CAN, PAST, Tapestry, Pastry, etc.
[Figure: file "music A" has key value 2429; the file, or its position, is published to the node responsible for key 2429 in an overlay of nodes 2145, 2429, 3482, 4321, 6823, 7239, 7328, 9832]
33
Loosely structured P2P
€Between structured and unstructured
€File locations are affected by routing hints, but they
are not completely specified.
€example: Freenet
[Figure: a query for "B" follows routing hints ("A may be on the left side", "B may be on the right side"), is turned back where "there is no B", and ends at the node answering "Here I am!"]
34
P2P classification summary

35
Design requirements
€ Decentralization
€ Scalability
€ Anonymity
€ Self-Organization
€ Performance
€ Fault Resilience
€ Security
€ Others

36
Decentralization
€ Emphasizes users’ ownership and control of data and
resources

37
Scalability
€ The ease with which the system remains tractable with
an increasing number of nodes and data elements
€ Important feature because of the very large number of
participants and data elements
€ Immediate benefit of decentralization
€ Early p2p systems had limited scalability!
ÎNapster was able to scale up to about 6 million users at the
peak of its service
€ Achieving good scalability should not be at the expense
of other desirable features.
ÎHybrid P2P systems, such as Napster, intentionally keep some
amount of the operations and files centralized.
ÎMany structured lookup protocols have been proposed
38
Anonymity
€ The degree to which a system allows for anonymous transactions
€ Do not want someone to identify author, publisher, reader,
server, or document on the systems

39
Self-organization
€“A process where the organization of a system
is spontaneously established, i.e., without this
being controlled by the environment or an
encompassing or otherwise external system”
€Needed for scalability and fault resilience, and
because of intermittent connection of
resources, and cost of ownership.
€Self-maintenance and self-repair
€DHTs are a means to achieve self-organization
(e.g., in Chord, Pastry, …)

40
Performance
€ P2P systems aim to improve performance by
aggregating distributed storage capacity (e.g.,
Napster, Gnutella) and processing cycles (e.g.,
seti@home)
€ Performance is influenced by 3 types of resources:
processing, storage, and bandwidth
€ Performance: how long it takes to retrieve a file or
how much bandwidth will a query consume?
€ To optimize performance
ÎReplication
ÎCaching
ÎIntelligent routing & network organization
41
Fault resilience
€ P2P systems are faced with failures commonly
associated with distributed systems.
ÎDisconnections
ÎUnreachability
ÎPartitions
ÎNode Failure

€Resilience: the connectivity must be preserved
when failures are caused by the arbitrary
departure of peers
42
Security
‰ Protection against:
€ Routing attacks
Î Redirect queries in wrong direction or to non-existing nodes
Î Misleading updates
Î Partition
€ Storage/Retrieval attacks
Î Node responsible for holding data item does not store or deliver it
as required
€ Inconsistent behavior
ÎNode sometimes behaves and sometimes does not
€ Identity spoofing
ÎNode claims to have an identity that belongs to another node
ÎNode delivers bogus content
€ DoS attacks
Î Let legitimate (authenticated) users through
€ Rapid joins/leaves

‰ In P2P systems, security issues are raised by the presence of


malicious peers
€ Trust assessment and management
43
Others
€ Cost of ownership
Î Sharing the cost of owning and maintaining a system and its content
€ Ad-hoc connectivity
Î Tolerate sudden disconnection and ad-hoc addition of nodes
€ Transparency
Î Make the system easy to use
Î Location, address, naming, administration
€ Interoperability
Î Compatibility (protocols, systems, how to advertise and maintain the
same levels of security, ...)?
Î Standardization efforts: P2P Working Group, Grid Forum, JXTA
€ Fairness
Î The extent to which the workload is evenly distributed across
nodes

44
Case Studies
File sharing P2P systems

45
Napster
ˆ Hybrid decentralized,
unstructured

ˆ Combination of client/server
and P2P approaches
ˆ A network of registered
users running a client
software, and a central
directory server
ˆ The server maintains 3
tables:
 (File_Index , File_Metadata)
 (User_ID , User_Info)
 (User_ID , File_Index)
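A hedged sketch of how such a directory server could maintain these three tables. The field names and helper functions below are illustrative, not Napster's actual schema; the point is that lookup is centralized while the transfer itself is peer-to-peer.

```python
# Illustrative model of the three server-side tables listed above.

file_metadata = {}   # File_Index -> File_Metadata (e.g., name, size)
user_info = {}       # User_ID -> User_Info (e.g., IP address)
user_files = {}      # User_ID -> set of File_Index values shared by that user

def register(user_id, info, shared_files):
    """A client registers itself and the metadata of its shared files."""
    user_info[user_id] = info
    user_files[user_id] = set()
    for index, meta in shared_files:
        file_metadata[index] = meta
        user_files[user_id].add(index)

def lookup(name):
    """Central lookup: (user, address) pairs holding a matching file.
    The file transfer then happens peer-to-peer, not via the server."""
    hits = []
    for user_id, indices in user_files.items():
        for index in indices:
            if file_metadata[index]["name"] == name:
                hits.append((user_id, user_info[user_id]))
    return hits

register("alice", {"ip": "10.0.0.1"}, [(1, {"name": "song.mp3", "size": 3_000_000})])
register("bob", {"ip": "10.0.0.2"}, [(2, {"name": "song.mp3", "size": 3_000_000})])
```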
46
Gnutella
ˆ Purely decentralized, unstructured

ˆ Characteristic:
 Few nodes with high connectivity.
 Most nodes with sparse connectivity.

ˆ Goal: distributed and anonymous file sharing

ˆ Each application instance (node) :


 stores/serves files
 routes queries to its neighbors
 responds to request queries

47
Gnutella (cont.)
Join Search File Transfer

[Figure: three snapshots of peers A–E. Join: A's Ping is flooded and answered by B's, C's, D's, and E's Pongs. Search: A's Query for x.mp3 is flooded; B and D answer with QueryHits. File transfer: A sends a file request directly to B, which returns x.mp3.]
48
Gnutella (cont.)
ˆ Advantages:
 Robustness to random node failure
 Completeness (constrained by the TTL)

ˆ Disadvantages:
 Communication overhead
 Network partition (controlled flooding)
 Security

ˆ Free riding problem

ˆ Tradeoff:
 Low TTL => low communication overhead
 High TTL => high search horizon
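The TTL tradeoff can be illustrated with a small flooding sketch. The topology and file placement are invented; real Gnutella also deduplicates queries by message id and returns QueryHits along the reverse path, which this sketch only approximates with a visited set.

```python
# TTL-limited flooding over an assumed overlay graph.

def flood_search(adjacency, files, start, target_file, ttl):
    """Return the set of nodes holding target_file within ttl rebroadcasts.
    adjacency: node -> list of neighbors; files: node -> set of file names."""
    seen = {start}            # nodes that already handled this query
    frontier = [start]
    hits = set()
    while frontier and ttl >= 0:
        next_frontier = []
        for node in frontier:
            if target_file in files.get(node, set()):
                hits.add(node)            # a QueryHit would travel back here
            for neighbor in adjacency.get(node, []):
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
        ttl -= 1                          # each rebroadcast decrements the TTL
    return hits

adjacency = {"A": ["B", "E"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"], "E": ["A"]}
files = {"C": {"x.mp3"}, "D": {"x.mp3"}}
```

With the topology above, a TTL of 2 reaches the holders of x.mp3 while a TTL of 1 does not, which is exactly the horizon/overhead tradeoff on this slide.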
49
Freenet
ˆ Purely decentralized, loosely structured
ˆ Goal: provide anonymous method for storing and
retrieving data (files)
ˆ 2 operations:
 Insertion
 Search

ˆ Indexes (keys) are used for:


 file clustering
 Assisting in routing
 Search optimization

ˆ Each node maintains local data-store (files with similar


keys) and routing table
50
Freenet
ˆ Bootstrapping
 No explicit solution

ˆ File insertion
 User assigns a hash key to file
 sends an insert message to the user’s own node
 Node checks its data store for collision
 Node looks up the closest key and forwards the message to
another node
 This is done recursively until TTL expires or collision
detected.
 If no collision, user sends data down the path established by
the insert message.
 Each node along the path stores it and creates a routing table
entry
51
Freenet (cont.)
‰ Search
[Figure: a query for key AFCD3 starting at node a is forwarded along a chain of nodes (b, c, d, e, f); failed requests are reported back and the request is retried along another branch, until the data reply travels back to a. Legend: connection, data request, data reply, request failed.]
52
Freenet
ˆ Advantages:
 Anonymity
 Robustness to random node failure
 Low communication overhead
 More scalable
 Self-organized

ˆ Disadvantages:
 Poor routing decisions
 Spam
 Security

53
KaZaA
ˆ Partially centralized,
unstructured
ˆ Every peer is either a
supernode (SN) or an
ordinary node (ON)
assigned to a
supernode
ˆ Each supernode knows
where the other
supernodes are (Mesh
structure)

54
KaZaA (cont.)
ˆ Bootstrap:
 Connection to a known SN
 Upload METADATA for shared files
Î file name
Î file size
Î file descriptor
Î ContentHash

ˆ Search:
 ON -> SN (-> SN)*
 Keyword-based
 ContentHash in result

ˆ Download:
 HTTP
 ContentHash-based
 Failure -> automatic search on ContentHash
55
KaZaA (cont.)
ˆ Advantages:
 Scalability
 Efficiency
 Exploits heterogeneity of peers
 Fault-tolerance

ˆ Disadvantages
 Pollution
 DoS attacks on SN

56
Case Studies
Distributed Computing

57
seti@home
ˆ Goal: to discover alien
civilizations
ˆ Analyzes radio
emissions from space
collected by radio
telescopes, using the
processing power of
millions of otherwise idle
Internet PCs
ˆ Two major components : a database server and clients
ˆ Supported platforms : Windows, Linux, Solaris, and HP-UX

58
Case Studies
Distributed Hash Tables

59
What is a DHT?

ˆ Hash Table
 data structure that maps “keys” to “values”

ˆ Interface
 put(key, value)
 get(key)

ˆ Distributed Hash Table (DHT)


 similar, but spread across the Internet
 challenge: locate content

60
What is a DHT? (cont.)
ˆ Single-node hash table:
Key = hash (data)
put(key, value)
get(key)->value

ˆ Distributed Hash Table (DHT):


Key = hash (data)
Lookup (key) -> node-IP@
Route (node-IP@, PUT, key, value)
Route (node-IP@, GET, key) -> value

ˆ Idea:
 Assign particular nodes to hold particular content (or reference
to content)
 Every node supports a routing function (given a key, route
messages to node holding key)
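A minimal sketch of this idea, assuming a fixed set of made-up node addresses. Note that the modulo-N placement used here breaks down when nodes join or leave; the consistent hashing introduced on a later slide is precisely the fix for that.

```python
# Minimal DHT sketch: hash the key to pick the node that must hold the
# (key, value) pair. Node addresses are invented for illustration.

import hashlib

NODES = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
stores = {node: {} for node in NODES}          # per-node local hash tables

def lookup(key):
    """Map a key deterministically to the responsible node's address."""
    digest = hashlib.sha1(key.encode()).digest()
    return NODES[int.from_bytes(digest, "big") % len(NODES)]

def put(key, value):
    stores[lookup(key)][key] = value           # Route(node-IP@, PUT, key, value)

def get(key):
    return stores[lookup(key)].get(key)        # Route(node-IP@, GET, key)

put("song.mp3", "peer-42")
value = get("song.mp3")
```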

61
What is a DHT? (cont.)

[Figure: layering. A distributed application calls put(key, value) and get(key) on the distributed hash table; the DHT builds on a lookup service, lookup(key) -> node IP address, running across all nodes]
62
DHT in action
[Figure: (key, value) pairs spread across the overlay nodes]
63
DHT in action: put()
[Figure: a node issues put(K1,V1) into the overlay]

64
DHT in action: put()
[Figure: the put(K1,V1) message is routed across the overlay toward the responsible node]

65
DHT in action: put()
[Figure: (K1,V1) is now stored at the node responsible for K1]

66
DHT in action: get()
[Figure: get(K1) is routed to the node storing (K1,V1), which returns V1]

67
Iterative vs. Recursive Routing
[Figure: iterative routing (the querying node contacts each hop itself) vs. recursive routing (each hop forwards the query onward)]

68
Peers vs Infrastructure
ˆ Peer:
Application users provide nodes for DHT
Examples: file sharing, etc

ˆ Infrastructure:
Set of managed nodes provide DHT
service
Perhaps serve many applications

69
DHT Design Goals
ˆ An “overlay” network with:
 Decentralization and self-organization, i.e. no central
authority, local routing decisions
 Flexibility in mapping keys to physical nodes and routing
 Robustness to joining/leaving
 Scalability, i.e. low communication overhead
 Efficiency, i.e. low latency

ˆ A consistent “storage” mechanism with


 No guarantees on persistence
 Maintenance via soft state
DHT Approaches

€ Consistent Hashing

€ Chord

€ CAN (Content Addressable Network)

€ Pastry

€ Tapestry
Consistent Hashing
ˆ Consistent hashing [Karger 97]
ˆ Overlay network is a circle
ˆ Each node has a randomly
chosen id
 Example: hash of the IP
address
 Keys live in the same id space
ˆ A node's successor in the circle
is the node with the next largest id
 Each node knows the IP
address of its successor
ˆ A key is stored at its closest
successor
[Figure: 4-bit ID space ring]
Consistent Hashing
ˆ Principles:
 Node departures
Î Each node must track s ≥ 2 successors
Î If your successor leaves, take the next one
Î Ask your new successor for the list of its successors;
update your s successors
 Node joins
Î You are a new node, id k
Î Ask any node n to find node n', the successor for id k
Î Get the successor list from n'
Î Tell your predecessors to update their successor lists
 Thus, each node must also track its predecessor

ˆ # neighbors = s+1: O(1)
 Routing table: (neighbor_id , neighbor_ip@)

ˆ Average # of messages to find a key: O(N)
Can we do better?
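The ring behavior described above can be sketched as follows. This is a centralized model of the ring for illustration only; in a real system there is no global view and each node tracks just its s successors and its predecessor.

```python
# Consistent-hashing ring sketch: node ids and keys share one id space;
# a key is stored at its closest clockwise successor.

import bisect

class Ring:
    def __init__(self, id_bits=4):
        self.space = 2 ** id_bits
        self.node_ids = []                     # sorted node ids on the circle

    def join(self, node_id):
        bisect.insort(self.node_ids, node_id)

    def leave(self, node_id):
        self.node_ids.remove(node_id)          # its keys shift to the next successor

    def successor(self, key):
        """First node id clockwise from key (wrapping around the circle)."""
        i = bisect.bisect_left(self.node_ids, key % self.space)
        return self.node_ids[i % len(self.node_ids)]

ring = Ring(id_bits=4)
for node in (1, 5, 9, 13):
    ring.join(node)
```

Note how a departure moves only the leaving node's keys to its successor; nothing else in the ring changes, which is the property plain modulo hashing lacks.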
Chord
ˆ Consistent Hashing
 Circular overlay
 1-dimensional random ID in the hash space
 Covered range: ]previous_ID , own_ID] (mod ID space)
ˆ Finger Table
 Set of known neighbors
 The ith neighbor (clockwise) of the node with ID n has the
closest (larger) ID to n+2^i (mod ID space), i ≥ 0
ˆ Routing
 To reach the node handling ID n', send the message to
neighbor # log2(n'-n)
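The finger-table rule above can be sketched directly, using the example ring of nodes 1, 8, 32, 67, 72, 86, 87 that appears in the lookup figure, and an assumed 7-bit id space.

```python
# Chord finger-table sketch: the i-th finger of node n is the first live
# node at or after n + 2**i (mod 2**m).

def successor(node_ids, point, space):
    """First node id at or clockwise after `point` on the ring."""
    point %= space
    candidates = sorted(node_ids)
    for n in candidates:
        if n >= point:
            return n
    return candidates[0]                       # wrap around the ring

def finger_table(node_ids, n, m):
    space = 2 ** m
    return [successor(node_ids, n + 2 ** i, space) for i in range(m)]

nodes = [1, 8, 32, 67, 72, 86, 87]             # ring from the lookup figure
fingers_of_8 = finger_table(nodes, 8, m=7)     # 7-bit space holds ids < 128
```

Node 8's fingers cluster on node 32 for small i and jump further for large i, which is why each hop can halve the remaining distance.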
Chord
ˆ Routing: a node is reachable from any
other node in no more than log2(N)
overlay hops
[Figure: Lookup(K19) routed across a ring of nodes 1, 8, 32, 67, 72, 86, 87; key 19 resolves to node 32]
Chord
ˆ Insertion
 Bootstrap (82) gets (1)
 (1) finds (82)'s pred (72)
 (72) constructs (82)'s finger table
 Update other nodes' finger tables
in order to take (82) into account
 Update cost: log2(N)

ˆ Deletion
[Figure: node 82 joins a ring of nodes 1, 8, 32, 67, 72, 86, 87; 2^3-fingers that pointed to 86 are updated to 82, and back to 86 when 82 leaves]
76
CAN
ˆ Routing geometry: hypercube
ˆ A hash value = a point in a D-dimensional
Cartesian space
ˆ Each node is responsible for a D-dimensional cube
ˆ Neighbors are nodes whose zones "touch" in more
than a point
 Example: D=2
• 2, 3, 4, 5 are 1's neighbors
• 6 is a neighbor of 2
 # neighbors: O(D)
[Figure: 2-D CAN zones numbered 1–7]
77
CAN
ˆ Routing:
 Recursively, from (n1, ..., nD) to (m1, …, mD),
choose the neighbor closest to (m1, …, mD)
 expected # overlay hops: O((D/4)·N^(1/D))

ˆ Node insertion:
 find some node already in the CAN (via a
bootstrap process), node (9)
 choose a point X in the space uniformly at
random
 using CAN routing, inform the node that
currently covers that space, node (8)
 that node (8) splits its zone in half
• the 1st split is along the 1st dimension; if the last split
was along dimension i < D, the next split is along
dimension i+1
 it keeps half the zone and gives the other half
to the joining node
[Figure: 2-D CAN with zones 1–10; the new node X joins by splitting node 8's zone]
78
CAN
ˆ Removal:
 leaf (3) is removed
 Find a leaf node that is either
• a sibling, or
• a descendant of a sibling whose own sibling is also a leaf node: (5)
 (5) takes over (3)'s region (moves to (3)'s position in the tree)
 (5)'s sibling (2) takes over (5)'s previous region
 The cube structure remains intact
[Figure: zone tree and 2-D space before and after removing node 3; node 5 absorbs 3's zone and node 2 absorbs 5's previous zone]
79
Tapestry
ˆ Namespace (objects and nodes)
 160 bits
 f(ObjectID) = RootID: each ObjectID is mapped to a RootID
(node ID) via a dynamic mapping function

ˆ Routing Mesh
 Prefix routing
• from A to B: at the hth hop the message arrives at a node C that
shares an h-digit-long prefix with B
 Routing table
• at each node, used to route overlay messages toward the destination
• organized into levels Li (a neighbor is in Li if its
neighborID shares a prefix of length i-1 with the node's ID)

80
Tapestry NodeId 5230

ˆ Routing table of node 5230 (one column per level):
L1: 0482, 156A, 248D, 3342, 400F
L2: 50xx, 51xx, 52xx, 53xx
L3: 520x, 521x, 522x, 524x
L4: 5230, 5231, 5232, 5233, 5234

ˆ Routing:
 5230 routes to 42AD via
5230 -> 400F -> 4227 -> 42A2 -> 42AD

ˆ Publication
 f(42AE) = 42AD

ˆ Query
 f(42AE) = 42AD

[Figure: the route from 5230 to 42AD through the mesh; object 42AE is published at, and queried from, its root node 42AD]
81
Pastry
ˆ Namespace
 128 bits
 ID: sequence of digits with base 2b
 IDs roughly evenly distributed in the namespace

ˆ Each node contains


 Routing table
 Neighborhood set
 Leaf set

82
Pastry
ˆ Routing table:
 log2b(N) rows (log base 2^b of N), 2^b-1 entries/row
 Row n: entries share their first n digits with the nodeId
 Entry: IP@ of the closest such node
(by proximity)
ˆ Routing:
 ~ Tapestry: prefix-based routing, with
differences:
• If the key falls within the range of the leaf set, the message
is forwarded directly to the node in the leaf set closest
to the key.
• If the routing entry is empty, or if the associated node is
unreachable, then forward to a node in the leaf set (or
neighborhood set) that shares a prefix
with the key at least as long as the local node's,
and whose id is numerically closer to the key
than the local node's id.
 less than log2b(N) hops

ˆ The choice of b: a tradeoff between the size of the table (log2b(N) x
(2^b-1)) and the number of hops in routing (log2b(N))
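The tradeoff in the last bullet can be made concrete numerically; the network size N and the values of b below are arbitrary examples.

```python
# Pastry b tradeoff: routing table size grows with b while the expected
# hop count, roughly log base 2^b of N, shrinks.

import math

def pastry_costs(n_nodes, b):
    rows = math.ceil(math.log(n_nodes, 2 ** b))    # ~ log_{2^b}(N) rows
    entries = rows * (2 ** b - 1)                  # 2^b - 1 entries per row
    return rows, entries

hops_b2, size_b2 = pastry_costs(10 ** 6, b=2)      # 10 rows, 30 entries
hops_b4, size_b4 = pastry_costs(10 ** 6, b=4)      # 5 rows, 75 entries
```

Doubling b here halves the row count (hops) but more than doubles the number of table entries to maintain.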

83
DHTs: Issues
ˆ Consistency
 Soft-state publication
 Tradeoff between consistency and communication
overhead
ˆ Performance
 Load-balancing
 Latency
=> Replication?
ˆ Location-awareness
 Neighbors in the ring may be far away in the Internet
=> Reduce latency in routing
ˆ Geometry
 Ring topology adds flexibility and helps in resilience
ˆ Bootstrapping
 Relies on known node
84
DHTs: Applications

€ Global file systems [OceanStore, CFS, PAST,


Pastiche, UsenetDHT]
€ naming services [Chord-DNS, Twine, SFR]
€ DB query processing [PIER, Wisc]
€ Internet-scale data structures [PHT, Cone,
SkipGraphs]
€ Communication services [i3, MCAN, Bayeux]
€ File sharing [OverNet]
€ Event notification [Scribe, Herald]

85
JXTA Platform

86
What is JXTA?

ˆ JXTA: enabling technology for the design of


interoperable P2P applications

ˆ Abstracts the underlying network/topology


=> virtual network

ˆ Provides a development platform:


 Set of services
 Set of protocols
 Open for additional services/protocols

87
Why JXTA?
ˆ Uniform Peer Addressing
 Each peer has its own unique 128-bit ID.
ˆ Well defined virtual network
 It makes application design simple.
ˆ Fault tolerance
 The network keeps working despite peer failures.
ˆ Self-Organization
 Peers organize themselves without central control.
ˆ Independence from
 Hardware platform
 Programming language
 Network
88
JXTA Protocols

€ Peer Discovery
Protocol
€ Peer Information
Protocol
€ Pipe Binding
Protocol
€ Peer Resolver
Protocol
€ Rendezvous
Protocol
€ Peer Endpoint
Protocol

89
API Architecture JXTA

90
JXTA :
API Architecture
ˆ ID:
 A JXTA ID is a standard URN in the JXTA namespace
 Based on 128-bit UUIDs
urn:jxta:uuid-DEADBEEFDEAFBABAFEEDBABE000000010206
 Implements a UUID self-generator
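A hedged sketch of forming such a URN from a 128-bit UUID. The real JXTA codec appends extra type digits (the trailing 0206 in the example above), which this illustration omits and treats as opaque.

```python
# Render a fixed 128-bit UUID as a JXTA-style URN (type digits omitted).

import uuid

def jxta_urn(u):
    """urn:jxta:uuid-<32 uppercase hex digits> for a uuid.UUID."""
    return "urn:jxta:uuid-" + u.hex.upper()

# UUID chosen to match the hex digits of the slide's example id.
fixed = uuid.UUID("deadbeef-deaf-baba-feed-babe00000001")
urn = jxta_urn(fixed)
```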
ˆ Cache Manager:
 Caches, indexes and stores soft-state advertisements
 Example: Apache open-source Xindice XML database (JXTA 2.0)
ˆ XML parser:
 enables the serialization and deserialization of Java objects into
output and input XML character streams.

ˆ Advertisements:
 allows a Java object to be serialized and deserialized into an XML
document representing an advertisement (peer, peerGroup, endpoint,
rendezvous, pipe, route, etc.)
91
JXTA :
API Architecture

ˆ HTTP, TCP/IP, TLS Transport:


 Implements transport bindings as specified in the JXTA
binding protocol
 HTTP Transport:
• Embedded HTTP server
• HTTP GET/POST
• Support firewall traversal
 TCP/IP Transport:
• Can be configured to accept broadcast through IP multicast
 TLS Transport:
• Fragments messages into TLS records
• Reliable, secure, end-to-end transport over the HTTP and
TCP/IP transport
• Key exchange

92
JXTA :
API Architecture

ˆ Message:
 All messages are represented as XML payload
 Sequence of elements manipulated by specific
services
ˆ Virtual Messenger:
 abstracts all JXTA transports into a common
interface for the endpoint service
 Abstracts synchronized/asynchronized
transports

93
JXTA :
API Architecture
ˆ Relay service:
 Implemented by super-peers
 Message store & forward to peers not directly
reachable (behind firewalls, NATs…)

94
JXTA :
API Architecture

ˆ Router service:
 For dynamically discovering and maintaining
route information to other peers
 Implements the Endpoint Routing Protocol
(ERP), allowing a peer to query and discover
route information
 Each message sent and received contains a
routing element used for updating route
information

95
JXTA Core Services:
Endpoint Service

ˆ Endpoint service:
 Provides network abstraction and allows peers
to communicate independently of the underlying
network topology (firewalls or NAT,…. ), and
physical transports
 Provides uniform de-multiplexing of incoming
messages
 Delegates network propagation, and
connectivity establishment to the appropriate
messenger transports
96
JXTA Core Services:
Rendezvous Service
ˆ Rendezvous service:
 Used for message propagation within the scope
of the peerGroup
 Implements the Rendezvous Protocol (RVP)
 Uses the endpoint service for propagating
messages
ˆ Rendezvous peers are super-peers:
 Maintain indices over resources shared by sub-
peers (through SRDI service)
 Maintain lists of other Rendezvous peers
(through RPV service)
97
JXTA Core Services:
Resolver Service

ˆ Resolver Service:
 Used for sending generic query/response
 Uses the endpoint and rendezvous services for
unicasting and propagating requests within the
scope of a peergroup
 The resolver service implements the Peer
Resolver Protocol (PRP).

98
JXTA Core Services:
Discovery Service

ˆ Discovery service:
 Used for discovering and publishing any type of
advertisements (peer, peergroup, pipe, etc.) in a
peergroup
 Implements the Peer Discovery Protocol (PDP)
 Uses the resolver service for sending and
receiving discovery requests

99
JXTA Core Services:
Pipe Service

ˆ Pipe service:
 Used for creating and binding pipe ends (input
and output pipes) to peer endpoints within the
scope of a peergroup
 Three types of pipes: unicast (one-to-one),
secure, and propagate (one-to-N)
 The pipe service uses a pipe resolver service
for dynamically binding a pipe end to a peer
endpoint
 The pipe resolver service implements the Pipe
Binding Protocol (PBP).
100
JXTA Core Services:
Peer Info service

ˆ Peer Info service:


 Pluggable framework for metering and
monitoring peers
 Metering monitors can be associated with any
peerGroup service to collect information about
that service
 The Peer Info service implements the Peer
Information Protocol.

101
JXTA Core Services:
Membership service

ˆ Membership Service:
 Manage PeerGroup membership, and issue
membership credentials
 Provides a pluggable authentication framework
to support different authentication mechanisms

102
JXTA Core Services:
PeerGroup Service

ˆ PeerGroup Service:
 manages a group of peers and enables a peer to
create, advertise, and join new PeerGroups
 NetPeerGroup: default group
• basic/core services (discovery, resolver, pipe,
rendezvous, etc.)
• exposed by the PeerGroup service

103
References
ˆ K.W. Ross and Dan Rubenstein, "Tutorial on P2P Systems " Infocom 2003,
http://cis.poly.edu/~ross/p2pTheory/P2Preading.htm.
ˆ Dejan Milojicic, Vana Kalogeraki, Rajan Lukose, Kiran Nagaraja, Jim Pruyne, Bruno
Richard, Sami Rollins and Zhichen Xu "Peer-to-Peer Computing”, HP Labs
Technical Report, HPL-2002-57 http://www.hpl.hp.com/techreports/2002
ˆ Karl Aberer, Philippe Cudré-Mauroux, Anwitaman Datta, Zoran Despotovic,
Manfred Hauswirth, Magdalena Punceva, Roman Schmidt “P-Grid: A Self-
organizing Structured P2P System” ACM SIGMOD Record, 32(3), September
2003
ˆ Athens Univ. of Economics and Business, White Paper: A Survey of Peer-to-Peer
File Sharing Technologies. http://www.eltrun.aueb.gr/whitepapers/p2p_2002.pdf
ˆ Matei Ripeanu and Ian Foster, “Mapping the Gnutella Network: Macroscopic
Properties of Large-Scale Peer-to-Peer Systems,” Proceedings of the 1st
International Workshop on Peer-to-Peer Systems, March 2002.
ˆ J. Liang, R. Kumar, K.W. Ross, “Understanding KaZaA”,
http://cis.poly.edu/~ross/papers/.
ˆ A. Crespo and H. Garcia-Molina, “Semantic overlay networks for p2p systems”,
Technical report, Computer Science Department, Stanford University, October
2002.
104
References (cont.)
ˆ Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans
Kaashoek, Frank Dabek, Hari Balakrishnan, “Chord: A Scalable Peer-to-peer
Lookup Protocol for Internet Applications”, IEEE/ACM Transactions on
Networking, February 2003.
ˆ S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, "A scalable
content-addressable network.” In SIGCOMM, August 2001
ˆ Antony Rowstron and Peter Druschel, “Pastry: Scalable, decentralized object
location and routing for large-scale peer-to-peer systems”, Proc. of the 18th
IFIP/ACM International Conference on Distributed Systems Platforms,
Heidelberg, Germany, November 2001.
ˆ B. Y. Zhao, J. D. Kubiatowicz, and A. D. Joseph., "Tapestry: An infrastructure for
fault-tolerant wide-area location and routing," Technical Report UCB/CSD-01-
1141, UC Berkeley, April 2001.
ˆ Bernard Traversat, Ahkil Arora, Mohamed Abdelaziz, “Project JXTA 2.0 Super-
Peer Virtual Network”,
http://www.jxta.org/project/www/docs/JXTA2.0protocols1.pdf
ˆ Ekaterina Chtcherbina, Thomas Wieland, “Project JXTA-Guide to a peer-to-peer
framework”, http://www.drwieland.de/jxta-tutorial-part1a.pdf,
http://www.drwieland.de/jxta-tutorial-part2a.pdf
105
