A PRESENTATION BY
Consumed about 40% of the world's bandwidth in 2012.
No requirement for a large-capacity server or expensive bandwidth.
AVAILABILITY
CLIENT, HASH/HASH SUM, HIT AND RUN, LEECH, LURKER, SEED, PEER, P2P (PEER TO PEER), SWARM
Popular Terms
TRACKER
METADATA
Used for replication of large amounts of static data.
Scalability: the data transfer speed, or throughput, increases as peers and seeds join a swarm.
Efficiently utilizes a large amount of the available network bandwidth.
Tolerant of dropping peers.
Able to verify data integrity (SHA-1 hashes).
Minimise piece overlap among peers to allow each peer to exchange pieces with as many other peers as possible
File Splitting
The file to be distributed is split into pieces, and a SHA-1 hash is calculated for each piece.
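The splitting and hashing step can be sketched in a few lines. This is a minimal illustration, not the code of any particular client; the function names `piece_hashes` and `verify_piece` and the tiny piece length used in the example are assumptions made for the sketch.

```python
import hashlib


def piece_hashes(data: bytes, piece_length: int) -> list:
    """Split data into fixed-size pieces and return the SHA-1 digest of each.

    The last piece may be shorter than piece_length.
    """
    if piece_length <= 0:
        raise ValueError("piece_length must be positive")
    return [
        hashlib.sha1(data[start:start + piece_length]).digest()
        for start in range(0, len(data), piece_length)
    ]


def verify_piece(piece: bytes, index: int, hashes: list) -> bool:
    """Check a downloaded piece against the expected hash for its index."""
    return hashlib.sha1(piece).digest() == hashes[index]


# Hypothetical 10-byte "file" split into 4-byte pieces: 4 + 4 + 2 bytes.
hashes = piece_hashes(b"a" * 10, piece_length=4)
```

A peer that receives a piece runs the equivalent of `verify_piece` before offering that piece to others, which is how corrupted data is kept out of the swarm.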
Metadata File
A metadata file (commonly known as a .torrent file) is distributed to all the peers in a swarm using a different protocol (generally HTTP or FTP). The metadata contains:
The SHA-1 (Secure Hash Algorithm) hashes of all pieces
A mapping of the pieces to files
A tracker reference (the URL of the tracker)
The name of the file
The length of the file
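As a rough sketch of what such a metadata file holds, the snippet below builds the dictionary for a single-file torrent and serializes it with bencoding, the encoding used by .torrent files. The key names follow the BitTorrent specification; the helper names, the tracker URL and the sample data are hypothetical, and the multi-file piece-to-file mapping is left out.

```python
import hashlib


def bencode(value) -> bytes:
    """Encode ints, strings/bytes, lists and dicts in bencoding."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda pair: pair[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def build_metainfo(name: str, data: bytes, piece_length: int, tracker_url: str) -> dict:
    """Build single-file metadata: tracker URL, name, length and piece hashes."""
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    return {
        "announce": tracker_url,
        "info": {
            "name": name,
            "length": len(data),
            "piece length": piece_length,
            "pieces": pieces,  # concatenated 20-byte SHA-1 digests
        },
    }


def info_hash(metainfo: dict) -> bytes:
    """SHA-1 of the bencoded info dictionary; it identifies the torrent."""
    return hashlib.sha1(bencode(metainfo["info"])).digest()


# Hypothetical tracker URL and file contents, for illustration only.
meta = build_metainfo("demo.bin", b"x" * 10, 4, "http://tracker.example/announce")
torrent_bytes = bencode(meta)
```

Writing `torrent_bytes` to disk would give a file with the same layout as a .torrent file for this toy input.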
Tracker
The tracker is a central server that keeps a list of all peers participating in the swarm.
A swarm is the set of peers participating in distributing the same files.
A peer joins a swarm by asking the tracker for a peer list and connecting to those peers.
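The tracker's bookkeeping can be pictured with a toy in-memory model. This is only a sketch: a real tracker speaks HTTP or UDP and expires stale peers, and the class name `ToyTracker`, the `announce` method and the addresses below are assumptions made for illustration.

```python
class ToyTracker:
    """Keeps one peer set per swarm, keyed by the torrent's identifying hash."""

    def __init__(self):
        self._swarms = {}  # maps swarm key (bytes) -> set of (ip, port)

    def announce(self, swarm_key: bytes, peer: tuple) -> list:
        """Register a peer in a swarm and return the other peers it can contact."""
        swarm = self._swarms.setdefault(swarm_key, set())
        others = sorted(swarm - {peer})
        swarm.add(peer)
        return others


tracker = ToyTracker()
key = b"\x01" * 20                      # hypothetical 20-byte swarm key
tracker.announce(key, ("10.0.0.1", 6881))
peers_for_newcomer = tracker.announce(key, ("10.0.0.2", 6881))
```

The first peer gets an empty list back; the newcomer receives the first peer's address and connects to it directly, so the tracker never carries file data itself.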
Graphic Visualization
The above visualization shows a swarm of peers with the pieces they hold. The node connected with an arrow is a new peer joining the swarm and requesting the list of IP addresses of its peers.
Works as a distributed hash table with SHA-1 hashes as keys. The key is the info-hash, the SHA-1 hash of the info section of the metadata. It uniquely identifies a torrent.
Kademlia Bootstrap
Each node bootstraps by looking for its own ID.
The search is done recursively until no closer nodes can be found.
The nodes passed on the way are stored in the routing table.
The routing table has more room for close nodes than for distant nodes.
Each node knows much more about close nodes than about distant nodes.
The key space that each bucket represents doubles with each step in distance.
Querying a node for a specific ID will, on average, halve the distance to the target ID at each step.
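The notion of distance used above can be made concrete. Kademlia measures the distance between two IDs as their bitwise XOR, and a routing-table bucket covers all distances in a range that doubles from one bucket to the next. The sketch below uses small integers as IDs instead of 160-bit hashes, and the function names are assumptions made for illustration.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two node IDs."""
    return a ^ b


def bucket_index(own_id: int, other_id: int) -> int:
    """Bucket i holds nodes whose distance d satisfies 2**i <= d < 2**(i + 1)."""
    distance = xor_distance(own_id, other_id)
    if distance == 0:
        raise ValueError("a node does not store itself in its routing table")
    return distance.bit_length() - 1


def closest_nodes(known_ids: list, target: int, k: int) -> list:
    """Return the k known IDs closest to the target, nearest first."""
    return sorted(known_ids, key=lambda node_id: xor_distance(node_id, target))[:k]


# Toy 4-bit ID space: distances to target 5 are 4, 1, 3 and 9 respectively.
nearest = closest_nodes([1, 4, 6, 12], target=5, k=2)
```

Because bucket `i` spans twice the key space of bucket `i - 1` yet holds the same number of entries, a node ends up knowing its near neighbourhood densely and the far side of the ID space only sparsely.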
A piece is broken into sub-pieces, typically 16 KB in size.
Until a piece is assembled, only sub-pieces of that piece are downloaded.
This policy lets pieces assemble quickly.
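A minimal sketch of how one piece maps onto sub-piece requests, assuming the typical 16 KB sub-piece size mentioned above. The function name `sub_piece_requests` and the 40,000-byte piece in the example are hypothetical.

```python
SUB_PIECE_SIZE = 16 * 1024  # typical sub-piece size in bytes


def sub_piece_requests(piece_length: int, sub_piece_size: int = SUB_PIECE_SIZE) -> list:
    """Return (offset, length) pairs that together cover one piece."""
    return [
        (offset, min(sub_piece_size, piece_length - offset))
        for offset in range(0, piece_length, sub_piece_size)
    ]


# A hypothetical 40,000-byte piece needs two full sub-pieces and a shorter tail.
requests = sub_piece_requests(40_000)
```

Under the policy above, a client would finish every request in this list before requesting sub-pieces from any other piece.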
Piece overlap
(Figure panels: small overlap, big overlap)
Piece Selection
The order in which pieces are selected by different peers is critical for good performance.
If an inefficient policy is used, peers may end up in a situation where they all hold the same set of easily available pieces and none of the missing ones.
If the original seed is taken down prematurely, the file can then not be downloaded completely!
First priority (commonly in sequential order; can otherwise be reconfigured to our own choice)
General rule (preset in all the torrent clients: BitTorrent, µTorrent, Vuze, Transmission)
Special case at the beginning
Special case for the end of data transmission
Rarest First
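Rarest first means requesting the missing piece that the fewest connected peers hold, which keeps piece overlap among peers small. The sketch below assumes the client knows which pieces each connected peer has; the function name, the peer labels and the random tie-break are assumptions made for illustration.

```python
import random
from collections import Counter
from typing import Optional


def pick_rarest(have: set, peer_pieces: dict, rng=random) -> Optional[int]:
    """Pick a missing piece held by the fewest peers, or None if nothing is available."""
    availability = Counter(
        piece
        for pieces in peer_pieces.values()
        for piece in pieces
        if piece not in have
    )
    if not availability:
        return None
    lowest = min(availability.values())
    candidates = sorted(p for p, count in availability.items() if count == lowest)
    # Break ties randomly so that peers do not all request the same piece.
    return rng.choice(candidates)


# Piece 0 is held by three peers, piece 1 by two, piece 2 by only one.
peers = {"a": {0, 1, 2}, "b": {0, 1}, "c": {0}}
next_piece = pick_rarest(have=set(), peer_pieces=peers)
```

Here `next_piece` is piece 2, the only piece with a single holder, so it gets replicated before its sole holder can leave the swarm.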
Endgame Mode
Choking Algorithm
Choking is a temporary refusal to upload.
It is one of BitTorrent's most powerful ideas for dealing with free riders (those who only download but never upload).
The tit-for-tat strategy is based on game-theoretic concepts.
A good choking algorithm caps the number of simultaneous uploads for good TCP performance.
Optimistic Unchoking
A BitTorrent peer has a single optimistic unchoke, to which it uploads regardless of the current download rate from it. The optimistic unchoke rotates every 30 s. Reasons:
To discover whether currently unused connections are better than the ones being used
To provide minimal service to new peers
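One round of the choking decision can be sketched as follows: keep the peers that currently give us the best download rates, then add one optimistic unchoke picked at random from the rest. The number of regular upload slots (three here), the function name and the sample rates are assumptions for illustration; a real client would rerun the optimistic choice on the 30 s rotation described above.

```python
import random


def choose_unchoked(download_rates: dict, regular_slots: int = 3, rng=random) -> set:
    """Return the peers to upload to in this round.

    download_rates maps a peer to the rate at which we currently download from it.
    """
    by_rate = sorted(download_rates, key=download_rates.get, reverse=True)
    unchoked = set(by_rate[:regular_slots])   # tit-for-tat: reward the best uploaders
    choked = by_rate[regular_slots:]
    if choked:
        # Optimistic unchoke: one extra peer, chosen regardless of its rate.
        unchoked.add(rng.choice(choked))
    return unchoked


# Hypothetical rates in KB/s.
rates = {"a": 50.0, "b": 40.0, "c": 30.0, "d": 20.0, "e": 10.0}
unchoked_now = choose_unchoked(rates)
```

Peers `a`, `b` and `c` are always unchoked in this example, while `d` or `e` gets the optimistic slot, which is how a newcomer with nothing to offer yet still receives its first pieces.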
Transition from a peer to a seed. Once the download is complete, a peer has no download rates to use for comparison, nor any need for them. It uploads to the peers with the best upload rates. This ensures that pieces get replicated faster and that new seeders are created quickly.
A centrally monitored and controlled uplink and downlink of data between two nodes may theoretically provide higher throughput. In a real environment, however, which has to deal with non-responsive nodes and asymmetric upload and download bandwidth, BitTorrent might be the most efficient P2P sharing network among file exchange protocols.
We'll go through a quick revision of what we learnt. You may ask any questions after the summary. Thank you for your attention!