Peer-to-Peer Systems: An Overview
Raouf Boutaba
Department of Computer Science
University of Waterloo
rboutaba@bbcr.uwaterloo.ca
1
Evolution of Distributed Computing Models
Client-Server
[Figure: client-server variants — a client invoking multiple servers; servers acting as clients of other servers; a proxy server between clients and servers; a web server delivering applet code to a client. Key: process, computer]
Variants of Client-Server Model (4)
Mobile Agents
An executing program (code + data) that migrates among processes,
carrying out an autonomous task, usually on behalf of some
other process
Advantages: flexibility, savings in communication cost
Examples: virtual markets, worm programs
Network Computers
Downloads its operating system and any application software
needed by the user from a remote file server
Applications run locally, but the files are managed by a
remote file server
Users can migrate from one network computer to another, since
all application code and data are stored by a file server
6
Variants of Client-Server Model (5)
Thin Clients and Compute Servers
Executes a windows-based user interface on the local computer
while the application runs on a compute server
7
Variants of Client-Server Model (6)
Spontaneous Networking
[Figure: spontaneous networking in a hotel — a gateway links the Internet to the hotel wireless network, which hosts a discovery service, music and alarm services, a camera, a TV/PC, and a guest's laptop and PDA]
Discovery services provide:
- Services available in the network
- Their properties, and how to access them (including device-specific driver information)
8
The Peer-to-Peer Model
Applications based on peer processes — not client-server:
processes that have largely identical functionality
[Figure: peers, each running application code plus coordination code, interacting directly with one another]
9
The Peer-to-Peer Mania
Started in the middle of 2000,
when the Internet had fallen into predictable patterns
Computer-field shocks:
- Napster, SETI@home, Freenet, Gnutella, Jabber, …
14
Definitions (Cont)
A transient network that allows a group of computer
users to connect with each other and collaborate by
sharing resources (CPU, storage, content).
Examples of overlays:
- BGP routers and their peering relationships
- Content distribution networks (CDNs)
- Application-level multicast
- And P2P apps!
15
Overlay Networks
An overlay network is a set of logical connections between end hosts
Overlay maintenance is an issue
[Figure: overlay edges drawn over the underlying physical network]
16
Overlays: All in the application layer
- Topology
- Protocol
- Messaging over TCP, UDP, ICMP
The underlying physical network is transparent to the developer
[Figure: three hosts' protocol stacks (application, transport, network, data link, physical); the overlay lives entirely in the application layer]
17
Goals
Cost reduction through cost sharing
- Client/server: the server bears most of the cost
- P2P: cost spread over all the peers (+Napster, ++Gnutella, …)
- Aggregation of otherwise unused resources (e.g., seti@home)
Improved scalability/reliability
- Resource discovery and search (e.g., Chord, CAN, …)
Interoperability
- For the aggregation of diverse resources (storage, CPU, …)
Increased autonomy
- Independence from servers, hence providers (e.g., a way around censorship, licensing restrictions, etc.)
18
Goals (Cont)
Anonymity/privacy
- Difficult to ensure with a central server
- Required by users who do not want a server/provider to know their involvement in the system
- Freenet is a prime example
Dynamism
- Resources (e.g., compute nodes) enter and leave the system continuously
- Mechanisms are required to avoid polling (e.g., "buddy lists" in instant messaging)
Ad hoc communications
- P2P systems typically do not rely on an established infrastructure
- They build their own, e.g., the logical overlay in CAN
19
P2P Application areas
File Sharing
Communication
Collaboration
Computation
Databases
Others
20
P2P File Sharing
File exchange: Killer application!
(+) Potentially unlimited file exchange areas
(+) Highly available, safe storage: duplication and redundancy
(+) Anonymity: preserves the anonymity of authors and publishers
(+) Manageability
(-) Network bandwidth consumption
(-) Security
(-) Search capabilities
21
P2P File Sharing (cont)
Examples of P2P file sharing applications:
Napster
- disruptive; proof of concept
Gnutella
- open source
KaZaA
- today more KaZaA traffic than Web traffic!
eDonkey
- becoming popular in Europe
- appears to use a DHT
and many others…
P2P Communication
Audio/video conferencing
- Example: Voice-over-IP (Skype)
23
P2P Collaboration
Application-level user collaboration
Shared applications
- Example: shared file editing (e.g., distributed PowerPoint)
Online games
- Multi-player, distributed
- Example: Descent (www.planetdescent.com)
Technical challenges
- Locating peers
- Fault tolerance
- Real-time constraints
Example:
- Groove
24
P2P Computation
Achieves processing scalability by aggregating the resources of a large number of individual PCs
Application areas
- Financial applications
- Biotechnology
- Astronomy, …
Related projects
- seti@home
- Avaki
- Entropia
- Gridella
25
P2P Databases
Fragment large databases over physically distributed nodes
Overcome limitations of distributed DBMSs:
- Static topology
- Heavy administration work
No global schema
Examples:
- AmbientDB
- Xpeer: a self-organizing XML DB
26
Other Applications
P2P applications built over emerging overlays
PlanetLab: to conduct Internet-scale experiments
Purely decentralized P2P
Network nodes perform the same tasks (servents)
No central coordination activity
Examples: original Gnutella, Freenet
(-) Data consistency? Manageability? Security? Communication overhead
30
Partially centralized P2P
Some of the nodes assume a more important role:
supernodes act as local central indexes
Examples: KaZaA, recent Gnutella
[Figure: ordinary nodes upload lists of their files to a supernode]
31
Unstructured P2P
Data is distributed randomly over the peers, and broadcast mechanisms are used for searching
Placement of data is unrelated to the overlay topology
Examples: Napster, Gnutella, KaZaA
[Figure: a peer asks "Where is music A?"; peers report their file lists ("music A is…"), and the file is then downloaded directly]
33
Loosely structured P2P
Between structured and unstructured:
file locations are affected by routing hints, but they are not completely specified
Example: Freenet
[Figure: hint-guided search — "A may be on the left side…", "B may be on the right side"; one node answers "Here I am!!", another reports "There is no B"]
34
P2P classification summary
35
Design requirements
Decentralization
Scalability
Anonymity
Self-Organization
Performance
Fault Resilience
Security
Others
36
Decentralization
Emphasizes users’ ownership and control of data and
resources
37
Scalability
The ease with which the system remains tractable with an increasing number of nodes and data elements
An important feature, because of the very large number of participants and data elements
An immediate benefit of decentralization
Early P2P systems had limited scalability!
- Napster was able to scale up to about 6 million users at the peak of its service
Achieving good scalability should not come at the expense of other desirable features
- Hybrid P2P systems, such as Napster, intentionally keep some amount of the operations and files centralized
- Many structured lookup protocols have been proposed
38
Anonymity
The degree to which a system allows for anonymous transactions
Users may not want anyone to identify the author, publisher, reader, server, or document on the system
39
Self-organization
"A process where the organization of a system is spontaneously established, i.e., without this being controlled by the environment or an encompassing or otherwise external system"
Needed for scalability and fault resilience, and because of intermittent connection of resources and cost of ownership
Self-maintenance and self-repair
DHTs are a means to achieve self-organization (e.g., in Chord, Pastry, …)
40
Performance
P2P systems aim to improve performance by aggregating distributed storage capacity (e.g., Napster, Gnutella) and processing cycles (e.g., seti@home)
Performance is influenced by 3 types of resources: processing, storage, and bandwidth
Performance: how long does it take to retrieve a file, or how much bandwidth will a query consume?
To optimize performance:
- Replication
- Caching
- Intelligent routing & network organization
41
Fault resilience
P2P systems face the failures commonly associated with distributed systems:
- Disconnections
- Unreachability
- Partitions
- Node failures
44
Case Studies
File sharing P2P systems
45
Napster
Hybrid decentralized, unstructured
A combination of client/server and P2P approaches
A network of registered users running client software, plus a central directory server
The server maintains 3 tables:
- (File_Index, File_Metadata)
- (User_ID, User_Info)
- (User_ID, File_Index)
46
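The three directory tables can be sketched as Python dicts in a toy central server (the class and method names are hypothetical; real Napster spoke a binary protocol):

```python
# Minimal sketch (hypothetical names) of Napster's central directory:
# three tables mapping file index -> metadata, user -> info, user -> files.

class DirectoryServer:
    def __init__(self):
        self.file_metadata = {}   # File_Index -> File_Metadata
        self.user_info = {}       # User_ID -> User_Info (e.g., IP and port)
        self.user_files = {}      # User_ID -> set of File_Index

    def register(self, user_id, info, files):
        """A client registers itself and the files it shares."""
        self.user_info[user_id] = info
        self.user_files[user_id] = set()
        for idx, meta in files:
            self.file_metadata[idx] = meta
            self.user_files[user_id].add(idx)

    def search(self, keyword):
        """Return (user, file index) pairs whose metadata matches the keyword.
        The transfer itself then happens peer-to-peer, not via the server."""
        hits = []
        for user, indices in self.user_files.items():
            for idx in sorted(indices):
                if keyword.lower() in self.file_metadata[idx].lower():
                    hits.append((user, idx))
        return hits

server = DirectoryServer()
server.register("alice", ("1.2.3.4", 6699), [(1, "music A.mp3"), (2, "talk.mp3")])
server.register("bob", ("5.6.7.8", 6699), [(3, "music A.mp3")])
print(server.search("music a"))   # -> [('alice', 1), ('bob', 3)]
```

The server only answers lookups; this is why Napster is "hybrid": indexing is centralized, file transfer is peer-to-peer.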
Gnutella
Purely decentralized, unstructured
Characteristic (power-law-like connectivity):
- Few nodes with high connectivity
- Most nodes with sparse connectivity
47
Gnutella (cont.)
[Figure: three phases on a five-node overlay (A–E) sharing x.mp3 —
Join: A's Ping is flooded; B's, C's, D's, and E's Pongs come back.
Search: A's Query is flooded; B's and D's QueryHits report x.mp3.
File transfer: A's file request goes directly to B; B's file response delivers x.mp3]
48
Gnutella (cont.)
Advantages:
- Robustness to random node failure
- Completeness (constrained by the TTL)
Disadvantages:
- Communication overhead
- Network partition (with controlled flooding)
- Security
Tradeoff:
- Low TTL => low communication overhead, but a small search horizon
- High TTL => a wide search horizon, but high communication overhead
49
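The TTL tradeoff can be seen in a toy flooding search (the topology and bookkeeping are invented; real Gnutella uses Ping/Pong/Query messages with GUIDs for duplicate suppression):

```python
# Hedged sketch of Gnutella-style flooding: each peer forwards the query
# to its neighbors with a decremented TTL, and reports a hit if it holds
# the wanted file. Graph, file placement, and names are made up.

def flood_search(graph, files, start, wanted, ttl):
    """graph: peer -> list of neighbors; files: peer -> set of file names.
    Returns (set of peers reporting a hit, number of messages sent)."""
    hits, messages = set(), 0
    seen = {start}                      # duplicate suppression
    frontier = [(start, ttl)]
    while frontier:
        node, t = frontier.pop()
        if wanted in files.get(node, set()):
            hits.add(node)
        if t == 0:
            continue                    # TTL exhausted: horizon reached
        for nbr in graph.get(node, []):
            messages += 1               # forwarding still costs a message
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, t - 1))
    return hits, messages

graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B", "E"], "E": ["D"]}
files = {"B": {"x.mp3"}, "E": {"x.mp3"}}
print(flood_search(graph, files, "A", "x.mp3", ttl=2))
# B is found; E lies beyond the 2-hop horizon
```

Raising the TTL to 3 reaches E as well, at the price of more messages — exactly the tradeoff stated above.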
Freenet
Purely decentralized, loosely structured
Goal: provide an anonymous method for storing and retrieving data (files)
2 operations:
- Insertion
- Search
File insertion:
- The user assigns a hash key to the file
- and sends an insert message to the user's own node
- The node checks its data store for a collision
- The node looks up the closest key and forwards the message to another node
- This is done recursively until the TTL expires or a collision is detected
- If there is no collision, the user sends the data down the path established by the insert message
- Each node along the path stores it and creates a routing table
entry
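The insertion steps above can be sketched as greedy routing by key closeness (integer keys, a toy topology, and a simplified collision rule; real Freenet also backtracks on dead ends):

```python
# Hedged sketch of Freenet-style insertion: forward toward the neighbor
# whose id is closest to the file key until the TTL expires, then store
# the data at every node along the established path.

def closest(keys, target):
    return min(keys, key=lambda k: abs(k - target))

def insert(nodes, links, start, key, data, ttl):
    """nodes: id -> local data store (dict key -> data); links: id -> neighbors.
    Returns the path the insert followed, or None on collision."""
    path, current = [], start
    for _ in range(ttl):
        if key in nodes[current]:
            return None                   # collision: key already stored
        path.append(current)
        nbrs = [n for n in links[current] if n not in path]
        if not nbrs:
            break
        current = closest(nbrs, key)      # route greedily by key closeness
    for node in path:                     # data flows back down the path;
        nodes[node][key] = data           # each node stores a copy
    return path

nodes = {1: {}, 4: {}, 7: {}, 9: {}}
links = {1: [4, 9], 4: [1, 7], 7: [4, 9], 9: [1, 7]}
print(insert(nodes, links, 1, key=8, data="file", ttl=3))   # -> [1, 9, 7]
```

After the insert, a later search for key 8 finds replicas along the same path — the replication that gives Freenet its availability.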
Freenet (cont.)
[Figure: a search for key AFCD3 — data requests are forwarded along a chain of nodes (a → b → c → d …); requests that fail are backtracked, and the data reply retraces the successful chain to the requester]
52
Freenet
Advantages:
Anonymity
Robustness to random node failure
Low communication overhead
More scalable
Self-organized
Disadvantages:
Poor routing decisions
Spam
Security
53
KaZaA
Partially centralized, unstructured
Every peer is either a supernode (SN) or an ordinary node (ON) assigned to a supernode
Each supernode knows where the other supernodes are (mesh structure)
54
KaZaA (cont.)
Bootstrap:
- Connect to a known SN
- Upload METADATA for shared files:
  - file name
  - file size
  - file descriptor
  - ContentHash
Search:
- ON -> SN (-> SN)*
- Keyword-based
- ContentHash in the result
Download:
- HTTP
- ContentHash-based
- On failure -> automatic search on the ContentHash
55
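A sketch of the bootstrap/search flow above (the SuperNode class and the SHA-1 stand-in are assumptions; KaZaA's actual ContentHash and wire protocol are proprietary):

```python
# Hypothetical supernode index: ordinary nodes upload metadata for their
# shared files; searches are keyword-based, and each result carries a
# ContentHash used for downloading and for failover to another source.

import hashlib

def content_hash(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()    # stand-in for KaZaA's hash

class SuperNode:
    def __init__(self):
        self.index = []                      # (node, name, size, chash)

    def upload_metadata(self, node, name, data):
        self.index.append((node, name, len(data), content_hash(data)))

    def search(self, keyword):
        return [e for e in self.index if keyword.lower() in e[1].lower()]

sn = SuperNode()
sn.upload_metadata("on1", "song.mp3", b"abc")
sn.upload_metadata("on2", "song.mp3", b"abc")   # same content, second source
hits = sn.search("song")
# identical ContentHash lets a failed download fall back to another source
print(len(hits), hits[0][3] == hits[1][3])
```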
KaZaA (cont.)
Advantages:
- Scalability
- Efficiency
- Exploits the heterogeneity of peers
- Fault tolerance
Disadvantages:
- Pollution
- DoS attacks on SNs
56
Case Studies
Distributed Computing
57
seti@home
Goal: to discover alien civilizations
Analyzes the radio emissions from space collected by radio telescopes, using the processing power of millions of unused Internet PCs
Two major components: a database server and clients
Supported platforms: Windows, Linux, Solaris, and HP-UX
58
Case Studies
Distributed Hash Tables
59
What is a DHT?
Hash Table
data structure that maps “keys” to “values”
Interface
put(key, value)
get(key)
60
What is a DHT? (cont.)
Single-node hash table:
- key = hash(data)
- put(key, value)
- get(key) -> value
Idea:
- Assign particular nodes to hold particular content (or references to content)
- Every node supports a routing function (given a key, route messages to the node holding the key)
61
What is a DHT? (cont.)
[Figure: layering — the distributed application calls put(key, value) / get(key) on the distributed hash table, which calls lookup(key) on the lookup service to obtain the responsible node's IP address]
62
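The layering above can be sketched as a minimal toy DHT, where lookup() is a stand-in (hash mod number of nodes) rather than a real routing protocol:

```python
# Toy layering sketch: put/get at the top, a lookup service underneath
# that maps a key to the responsible node. Node ids stand in for IPs.

import hashlib

NODES = {0: {}, 1: {}, 2: {}}        # node id -> local store

def lookup(key):
    """Lookup service: key -> responsible node (a real DHT routes here)."""
    h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return h % len(NODES)

def put(key, value):
    NODES[lookup(key)][key] = value

def get(key):
    return NODES[lookup(key)].get(key)

put("song.mp3", "stored at peer 1.2.3.4")
print(get("song.mp3"))   # -> stored at peer 1.2.3.4
```

Note that "hash mod N" breaks when N changes — which is exactly the problem consistent hashing (next slides) solves.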
DHT in action
[Figure: nodes in an overlay, each storing a share of the (key, value) pairs]
63
DHT in action: put()
[Figure: put(K1,V1) is issued at one node and routed through the overlay to the responsible node, which stores (K1,V1)]
66
DHT in action: get()
[Figure: get(K1) is routed to the node storing (K1,V1), and the value is returned]
67
Iterative vs. Recursive Routing
[Figure: iterative routing — the querying node contacts each overlay hop itself; recursive routing — each hop forwards the query to the next]
68
Peers vs Infrastructure
Peer:
Application users provide nodes for DHT
Examples: file sharing, etc
Infrastructure:
Set of managed nodes provide DHT
service
Perhaps serve many applications
69
DHT Design Goals
An “overlay” network with:
Decentralization and self-organization, i.e. no central
authority, local routing decisions
Flexibility in mapping keys to physical nodes and routing
Robustness to joining/leaving
Scalability, i.e. low communication overhead
Efficiency, i.e. low latency
Consistent Hashing
Chord
Pastry
Tapestry
Consistent Hashing
Consistent hashing [Karger 97]
- The overlay network is a circle
- Each node has a randomly chosen id (for example, a hash of its IP address)
- Keys are in the same id space (a 4-bit ID space in the example)
- A node's successor on the circle is the node with the next largest id
- Each node knows the IP address of its successor
- A key is stored at its closest successor
Consistent Hashing
Principles:
Node departures
- Each node must track s ≥ 2 successors
- If your successor leaves, take the next one
- Ask your new successor for the list of its successors; update your s successors
Node joins
- You are a new node with id k
- Ask any node n to find n', the successor for id k
- Get the successor list from n'
- Tell your predecessors to update their successor lists
- Thus, each node must also track its predecessor
Average # of messages to find a key: O(N). Can we do better?
Chord
Consistent hashing
- Circular overlay
- 1-dimensional random ID in the hash space
- Covered range: ]previous_ID, own_ID] (mod ID space)
Finger table
- Set of known neighbors
- The ith neighbor (clockwise) of the node with ID n has the closest (larger) ID to n + 2^i (mod ID space), i ≥ 0
Routing
- To reach the node handling ID n', send the message to neighbor # log2(n' − n)
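The finger-table rule can be sketched directly (node ids borrowed from the routing example on the next slide; a 7-bit id space is assumed):

```python
# Chord finger table sketch: the i-th finger of node n is the first node
# whose id is >= n + 2^i (mod 2^m). Node set and m are illustrative.

def finger_table(n, node_ids, m=7):
    space = 2 ** m
    ids = sorted(node_ids)

    def succ(k):
        for x in ids:
            if x >= k:
                return x
        return ids[0]          # wrap around the circle

    return [succ((n + 2 ** i) % space) for i in range(m)]

nodes = [1, 8, 32, 67, 72, 86, 87]     # ids from the routing example
print(finger_table(8, nodes))          # -> [32, 32, 32, 32, 32, 67, 72]
```

Each hop via the largest useful finger at least halves the remaining distance, which is why lookups take O(log N) hops.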
Chord
Routing: a node is reachable from any other node in no more than log2(N) overlay hops
[Figure: lookup(K19) on a ring with nodes 1, 8, 32, 67, 72, 86, 87 — the query for key 19 is forwarded along finger pointers to node 32, the successor of 19]
Chord
Insertion
- The new node (82) contacts the bootstrap node (1)
- (1) finds (82)'s predecessor (72)
- (72) constructs (82)'s finger table
- Other nodes' finger tables are updated to take (82) into account: log2(N) updates
Deletion
[Figure: node 82 joins the ring {1, 8, 32, 67, 72, 86, 87}; a 2^3-finger that pointed to 86 is updated to 82 on insertion, and back to 86 on 82's deletion]
76
CAN
Routing geometry: hypercube
- A hash value = a point in a D-dimensional Cartesian space
- Each node is responsible for a D-dimensional cube
- Neighbors are nodes that "touch" in more than a point
Example: D = 2
- 2, 3, 4, 5 are 1's neighbors
- 6 is a neighbor of 2
- # neighbors: O(D)
[Figure: a 2-D coordinate space partitioned into zones owned by nodes 1–7]
77
CAN
Routing:
- Recursively, from (n1, …, nD) toward (m1, …, mD), choose the neighbor closest to (m1, …, mD)
- Expected # overlay hops: (D/4)·N^(1/D)
Node insertion:
- Find some node already in the CAN (via a bootstrap process) (9)
- Choose a point X in the space uniformly at random
- Using CAN routing, inform the node that currently covers that point (8)
- That node (8) splits its space in half: the 1st split is along the 1st dimension; if the last split was along dimension i < D, the next split is along dimension i+1
- It keeps half the space and gives the other half to the joining node (10)
[Figure: node 10 joins a 2-D CAN by splitting node 8's zone at point X]
78
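The greedy routing rule can be sketched on a grid (one node per unit cell, D = 2 — a simplification of CAN's variable-size zones):

```python
# CAN-style greedy routing sketch: at each step, move to the neighbor
# closest to the destination point (Manhattan distance). The uniform grid
# stands in for CAN's zone structure.

def route(src, dst, width):
    """Greedy routing on a width x width grid; returns the path taken."""
    path, cur = [src], src
    while cur != dst:
        x, y = cur
        nbrs = [(x + dx, y + dy)
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= x + dx < width and 0 <= y + dy < width]
        cur = min(nbrs, key=lambda p: abs(p[0] - dst[0]) + abs(p[1] - dst[1]))
        path.append(cur)
    return path

print(route((0, 0), (2, 1), width=4))   # hop count grows as O(D * N**(1/D))
```

Each hop strictly reduces the distance to the destination, so the path length is the Manhattan distance — the grid analogue of CAN's (D/4)·N^(1/D) bound.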
CAN
Removal:
- Example: leaf (3) is removed
- Find a leaf node that is either a sibling of (3), or a descendant of a sibling whose own sibling is also a leaf node (5)
- (5) takes over (3)'s region (moves to (3)'s position in the tree)
- (5)'s sibling (2) takes over (5)'s previous region
- The cube structure remains intact
[Figure: the zone-split tree and the 2-D coordinate space before and after node 3's departure]
79
Tapestry
Namespace (objects and nodes)
- 160 bits
- f(ObjectID) = RootID: each ObjectID is mapped to a RootID (node ID) via a dynamic mapping function
Routing mesh
- Prefix routing: a message from A to B arrives, at the hth hop, at a node C that shares an h-digit-long prefix with B
Routing table
- At each node, used to route overlay messages to the destination
- Organized into "mapping" levels Li (a neighbor is in Li if its neighborID shares a prefix with the node's ID that is i−1 digits long)
80
Tapestry: NodeId 5230
Routing table of node 5230:
L1: 0482  156A  248D  3342  400F
L2: 50xx  51xx  5230  52xx  53xx
L3: 520x  521x  522x  5230  524x
L4: 5230  5231  5232  5233  5234
Routing:
- 5230 routes to 42AD via 5230 -> 400F -> 4227 -> 42A2 -> 42AD
Publication:
- f(42AE) = 42AD: object 42AE is published at its root node 42AD
Query:
- f(42AE) = 42AD: a query for 42AE is routed toward root node 42AD
[Figure: the routing mesh around nodes 400F, 4227, 42A2, 42A7, 42A9, 42AD, 4112, 4629, AC78]
81
Pastry
Namespace
- 128 bits
- ID: a sequence of digits in base 2^b
- IDs roughly evenly distributed in the namespace
82
Pastry
Routing table:
- log_{2^b}(N) rows, 2^b − 1 entries per row
- Row n: entries share their first n digits with the nodeId
- Entry: IP address of the closest node (by proximity)
Routing:
- Similar to Tapestry: prefix-based routing, with differences:
  - If the key falls within the range of the leaf set, the message is forwarded directly to the node in the leaf set closest to the key
  - If the routing entry is empty, or if the associated node is unreachable, then forward to a node in the leaf set (or neighborhood set) that shares a prefix with the key at least as long as the local node's, and whose id is numerically closer to the key than the local node's id
- Less than log_{2^b}(N) hops
83
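The digit-by-digit routing common to Pastry and Tapestry can be sketched with hex-string ids (the node set is borrowed from the Tapestry example; the fallback is a simplified leaf-set-style rule, not Pastry's exact one):

```python
# Prefix-routing sketch: forward to a node sharing a longer prefix with
# the key than the current node; otherwise fall back to the numerically
# closest candidate (a stand-in for the leaf-set rule).

def shared_prefix(a, b):
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n

def next_hop(current, key, nodes):
    p = shared_prefix(current, key)
    better = [n for n in nodes if shared_prefix(n, key) > p]
    if better:
        return max(better, key=lambda n: shared_prefix(n, key))
    return min(nodes, key=lambda n: abs(int(n, 16) - int(key, 16)))

nodes = ["400F", "4227", "42A2", "42AD", "5230"]
hop, key = "5230", "42AE"
path = [hop]
while hop != "42AD":                       # 42AD is the key's root here
    hop = next_hop(hop, key, [n for n in nodes if n != hop])
    path.append(hop)
print(path)   # -> ['5230', '42A2', '42AD']
```

Each hop fixes at least one more digit of the key, giving the log_{2^b}(N) hop bound stated above.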
DHTs: Issues
Consistency
- Soft-state publication
- Tradeoff between consistency and communication overhead
Performance
- Load balancing
- Latency
- => Replication?
Location awareness
- Neighbors on the ring may be far away in the Internet
- => Reduce latency in routing
Geometry
- A ring topology adds flexibility and helps with resilience
Bootstrapping
- Relies on a known node
84
DHTs: Applications
85
JXTA Platform
86
What is JXTA?
87
Why JXTA?
Uniform Peer Addressing
Each peer has its own and unique ID(128-bits).
Well defined virtual network
It makes application design simple.
Fault tolerance
It works persistently.
Self-Organization
Peer works independently.
Independence from
Hardware platform
Programming language
Network
88
JXTA Protocols
- Peer Discovery Protocol
- Peer Information Protocol
- Pipe Binding Protocol
- Peer Resolver Protocol
- Rendezvous Protocol
- Peer Endpoint Protocol
89
JXTA: API Architecture
90
JXTA :
API Architecture
ID:
- A JXTA ID is a standard URN in the jxta namespace
- Based on 128-bit-long UUIDs
- Example: urn:jxta:uuid-DEADBEEFDEAFBABAFEEDBABE000000010206
- Implements a UUID self-generator
Cache Manager:
- Caches, indexes, and stores soft-state advertisements
- Example: the Apache open-source Xindice XML database (JXTA 2.0)
XML parser:
- Enables the serialization and deserialization of Java objects into output and input XML character streams
Advertisements:
- Allow a Java object to be serialized and deserialized into an XML document representing an advertisement (peer, peerGroup, endpoint, rendezvous, pipe, route, etc.)
91
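Building the URN form of an ID can be sketched with Python's uuid module (this sketch omits the trailing type/group code, e.g. the 0206 suffix in the example above):

```python
# Sketch of a JXTA-style ID: a 128-bit UUID rendered as a URN in the
# jxta namespace. The real encoding appends type/group codes, omitted here.

import uuid

def jxta_urn(u: uuid.UUID) -> str:
    return "urn:jxta:uuid-" + u.hex.upper()

u = uuid.UUID("deadbeef-deaf-baba-feed-babe00000001")
print(jxta_urn(u))   # -> urn:jxta:uuid-DEADBEEFDEAFBABAFEEDBABE00000001
```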
JXTA :
API Architecture
92
JXTA :
API Architecture
Message:
- All messages are represented as XML payloads
- A sequence of elements manipulated by specific services
Virtual Messenger:
- Abstracts all JXTA transports into a common interface for the endpoint service
- Abstracts synchronous/asynchronous transports
93
JXTA :
API Architecture
Relay service:
- Implemented by super-peers
- Stores and forwards messages to peers not directly reachable (behind firewalls, NATs, …)
94
JXTA :
API Architecture
Router service:
For dynamically discovering and maintaining
route information to other peers
Implements the Endpoint Routing Protocol
(ERP), allowing a peer to query and discover
route information
Each message sent and received contains a
routing element used for updating route
information
95
JXTA Core Services:
Endpoint Service
Endpoint service:
- Provides network abstraction and allows peers to communicate independently of the underlying network topology (firewalls, NATs, …) and physical transports
- Provides uniform de-multiplexing of incoming messages
- Delegates network propagation and connectivity establishment to the appropriate messenger transports
96
JXTA Core Services:
Rendezvous Service
Rendezvous service:
- Used for message propagation within the scope of the peerGroup
- Implements the Rendezvous Protocol (RVP)
- Uses the endpoint service for propagating messages
Rendezvous peers are super-peers:
- Maintain indices over resources shared by sub-peers (through the SRDI service)
- Maintain lists of other rendezvous peers (through the RPV service)
97
JXTA Core Services:
Resolver Service
Resolver Service:
Used for sending generic query/response
Uses the endpoint and rendezvous services for
unicasting and propagating requests within the
scope of a peergroup
The resolver service implements the Peer
Resolver Protocol (PRP).
98
JXTA Core Services:
Discovery Service
Discovery service:
Used for discovering and publishing any type of
advertisements (peer, peergroup, pipe, etc.) in a
peergroup
Implements the Peer Discovery Protocol (PDP)
Uses the resolver service for sending and
receiving discovery requests
99
JXTA Core Services:
Pipe Service
Pipe service:
Used for creating and binding pipe ends (input
and output pipes) to peer endpoints within the
scope of a peergroup
Three types of pipes: unicast (one-to-one),
secure, and propagate (one-to-N)
The pipe service uses a pipe resolver service
for dynamically binding a pipe end to a peer
endpoint
The pipe resolver service implements the Pipe
Binding Protocol (PBP).
100
JXTA Core Services:
Peer Info service
101
JXTA Core Services:
Membership service
Membership Service:
Manage PeerGroup membership, and issue
membership credentials
Provides a pluggable authentication framework
to support different authentication mechanisms
102
JXTA Core Services:
PeerGroup Service
PeerGroup Service:
manages a group of peers and enables a peer to
create, advertise, and join new PeerGroups
NetPeerGroup: default group
• basic/core services (discovery, resolver, pipe,
rendezvous, etc.)
• exposed by the PeerGroup service
103
References
K.W. Ross and Dan Rubenstein, "Tutorial on P2P Systems " Infocom 2003,
http://cis.poly.edu/~ross/p2pTheory/P2Preading.htm.
Dejan Milojicic, Vana Kalogeraki, Rajan Lukose, Kiran Nagaraja, Jim Pruyne, Bruno
Richard, Sami Rollins and Zhichen Xu "Peer-to-Peer Computing”, HP Labs
Technical Report, HPL-2002-57 http://www.hpl.hp.com/techreports/2002
Karl Aberer, Philippe Cudré-Mauroux, Anwitaman Datta, Zoran Despotovic,
Manfred Hauswirth, Magdalena Punceva, Roman Schmidt “P-Grid: A Self-
organizing Structured P2P System” ACM SIGMOD Record, 32(3), September
2003
Athens Univ. of Economics and Business, White Paper: A Survey of Peer-to-Peer
File Sharing Technologies. http://www.eltrun.aueb.gr/whitepapers/p2p_2002.pdf
Matei Ripeanu and Ian Foster, “Mapping the Gnutella Network: Macroscopic
Properties of Large-Scale Peer-to-Peer Systems,” Proceedings of the 1st
International Workshop on Peer-to-Peer Systems, March 2002.
J. Liang, R. Kumar, K.W. Ross, “Understanding KaZaA”,
http://cis.poly.edu/~ross/papers/.
A. Crespo and H. Garcia-Molina, “Semantic overlay networks for p2p systems”,
Technical report, Computer Science Department, Stanford University, October
2002.
104
References (cont.)
Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans
Kaashoek, Frank Dabek, Hari Balakrishnan, “Chord: A Scalable Peer-to-peer
Lookup Protocol for Internet Applications”, IEEE/ACM Transactions on
Networking, February 2003.
S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, "A scalable
content-addressable network.” In SIGCOMM, August 2001
Antony Rowstron and Peter Druschel, "Pastry: Scalable, decentralized object
location and routing for large-scale peer-to-peer systems", Proc. of the 18th
IFIP/ACM International Conference on Distributed Systems Platforms,
Heidelberg, Germany, November 2001.
B. Y. Zhao, J. D. Kubiatowicz, and A. D. Joseph., "Tapestry: An infrastructure for
fault-tolerant wide-area location and routing," Technical Report UCB/CSD-01-
1141, UC Berkeley, April 2001.
Bernard Traversat, Ahkil Arora, Mohamed Abdelaziz, “Project JXTA 2.0 Super-
Peer Virtual Network”,
http://www.jxta.org/project/www/docs/JXTA2.0protocols1.pdf
Ekaterina Chtcherbina, Thomas Wieland, “Project JXTA-Guide to a peer-to-peer
framework”, http://www.drwieland.de/jxta-tutorial-part1a.pdf,
http://www.drwieland.de/jxta-tutorial-part2a.pdf
105