Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web

Sagar Subudhi
Indian Institute of Space Science and Technology, Thiruvananthapuram
Email: subudhi.sagar123@gmail.com
Abstract—We describe a family of caching protocols for distributed networks which are based on a special kind of hashing that we call consistent hashing and can be used to decrease or eliminate the occurrence of hot spots in the network.

I. INTRODUCTION

This report briefly describes caching protocols for distributed networks that can be used to decrease or eliminate the occurrence of hot spots. A hot spot occurs mainly when many clients simultaneously want to access data from a single server.

A. Main contributions

1) MODEL: The Web is classified into three categories: browsers, servers, and caches. The number of caches is denoted C, the set of all pages P, and the latency for a message from machine m1 to arrive at m2 is δ(m1, m2).

2) RANDOM TREES: A simple caching protocol with the following specifications: 1. all machines know about the caches; 2. δ(mi, mj) is known for all i ≠ j; and 3. all requests are made at the same time. This protocol combines the aspects of the two algorithms described in past work: swamping of any server is prevented with high probability while memory requirements are minimized.

II. ANALYSIS

A. Latency

The delay a browser faces in obtaining a page depends on the height of the tree. If a request is forwarded from a leaf to the root, the latency is twice the length of the path, 2 log_d C. If the request is answered within a cache, the latency is less.

B. Swamping

If h is chosen uniformly at random from the space of hash functions, then with probability at least 1 − 1/N, where N is a parameter, the number of requests a given cache receives is bounded by a function of C and N (the ρ and O expressions given in the paper).

C. Storage

The amount of storage each cache must have in order to make this protocol work is simply the number of pages for which it receives more than q requests.

III. CONSISTENT HASHING

Consistent hashing solves the problem of different views, where a view is defined to be the set of caches of which a particular client is aware. A client uses a consistent hash function to map an object to one of the caches in its view.

IV. RANDOM TREES IN AN INCONSISTENT WORLD

Here the only assumption is that each machine knows about a 1/t fraction of the caches, chosen by an adversary. There is no difference in the protocol, except that the mapping h is a consistent hash function. This change does not affect latency; it only affects swamping and storage.

V. FAULT TOLERANCE

The modification to the protocol is quite simple: choose a parameter t and simultaneously send t requests for the page. A logarithmic number of requests is sufficient to give a high probability that one of the requests goes through. This change in the protocol will of course have an impact on the system.

VI. ADDING REAL TIME TO THE METHOD

Until now the main assumption was a static setting. If we have to consider real time, then cache machine size and cache load change according to the theorem given in the paper, which uses the rate at which requests were issued to measure the rate at which connections are established to machines.

VII. CONCLUSION

The ideas have broader applicability. In particular, consistent hashing may be a useful tool for distributing information from name servers such as DNS and label servers such as PICS in a load-balanced manner. These two schemes may together provide an interesting method for constructing multicast trees which can effectively overcome the hot-spot issues.

REFERENCES

[1] David Karger, Eric Lehman, Matthew Levine and Daniel Lewin, "Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web."
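The consistent hashing of Section III can be illustrated with a small hash-ring sketch: each cache is placed at points on a circle, and an object is served by the first cache clockwise from its own point, so adding or removing a cache only remaps the objects adjacent to it. This is an illustrative sketch, not the paper's construction verbatim; the class and function names, the use of SHA-256, and the virtual-node count are all assumptions made here.

```python
import bisect
import hashlib

def _point(key: str) -> int:
    # Map a string to a point on the ring (a 64-bit integer position).
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

class ConsistentHashRing:
    """Maps objects to caches so that adding or removing a cache
    only remaps the objects that cache owned."""

    def __init__(self, caches=(), replicas=100):
        self.replicas = replicas   # virtual nodes per cache, for load balance
        self._points = []          # sorted ring positions
        self._owner = {}           # ring position -> cache name
        for c in caches:
            self.add(c)

    def add(self, cache: str):
        for i in range(self.replicas):
            p = _point(f"{cache}#{i}")
            bisect.insort(self._points, p)
            self._owner[p] = cache

    def remove(self, cache: str):
        for i in range(self.replicas):
            p = _point(f"{cache}#{i}")
            self._points.remove(p)
            del self._owner[p]

    def lookup(self, obj: str) -> str:
        # The object is served by the first cache clockwise from its point.
        i = bisect.bisect(self._points, _point(obj)) % len(self._points)
        return self._owner[self._points[i]]
```

In a client, the view of Section III would simply be the set of caches inserted into the ring; two clients with overlapping views agree on the owner of most objects, which is what limits swamping when views differ.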
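The random-tree protocol of Sections I and II associates each page with an abstract d-ary tree whose nodes are mapped to caches by the hash function h; a request starts at a random leaf and is forwarded along the path to the root, which is where the 2 log_d C latency bound of Section II-A comes from. A minimal sketch under assumptions made here (level-order node numbering with the root at index 0, and a SHA-256 stand-in for h):

```python
import hashlib
import random

def node_to_cache(page: str, node_id: int, num_caches: int) -> int:
    # Stand-in for the paper's hash h, mapping (page, tree node) to a cache.
    digest = hashlib.sha256(f"{page}/{node_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_caches

def request_path(page, d, depth, num_caches, rng=random):
    """Return the caches a request visits: a random leaf of the abstract
    d-ary tree for `page`, then its ancestors up to the root (node 0)."""
    # Level-order indexing: node k's parent is (k - 1) // d; the nodes at
    # level L occupy indices [(d**L - 1)//(d - 1), (d**(L+1) - 1)//(d - 1)).
    leaf = rng.randrange((d**depth - 1) // (d - 1),
                         (d**(depth + 1) - 1) // (d - 1))
    path, node = [], leaf
    while True:
        path.append(node_to_cache(page, node, num_caches))
        if node == 0:
            break
        node = (node - 1) // d
    return path
```

Because different requests for the same page start at different random leaves, load for a hot page is spread over many first-hop caches and only converges near the root, which is the intuition behind the swamping bound of Section II-B.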
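The fault-tolerance claim of Section V is a standard independence calculation: if a single request fails with probability p, then t independent requests all fail with probability p^t, so t = O(log(1/ε)) requests suffice to drive the failure probability below any target ε. A small sketch (the function name and the default target are assumptions made here, not from the paper):

```python
import math

def requests_needed(fail_prob: float, target_failure: float = 1e-6) -> int:
    """Smallest t with fail_prob**t <= target_failure, i.e. the logarithmic
    number of simultaneous requests from Section V."""
    assert 0.0 < fail_prob < 1.0
    return math.ceil(math.log(target_failure) / math.log(fail_prob))
```

For example, with p = 0.5 only 20 simultaneous requests already push the chance that all of them fail below one in a million, which is why a logarithmic t is enough in practice.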