WWW Caching-Trends and Techniques

World Wide Web Caching: Trends and Techniques
Greg Barish Katia Obraczka

USC Information Sciences Institute
4676 Admiralty Way, Marina del Rey, CA 90292, USA
fbarish, katiag@isi.edu
Abstract content provider. In this case, server-side caches reduce re-

quest serving time by balancing server load and improving
Academic and corporate communities have been dedicat- server availability as discussed above.
ing considerable eort to World Wide Web caching, When Second, temporary unavailability of the network can
correctly deployed, Web caching systems can lead to signif- be hidden from users, thus making the network appear to
icant bandwidth savings, server load balancing, perceived be more reliable. Network outages will typically not be
network latency reduction, and higher content availability. as severe to clients of a caching system, since local caches
In this paper, we survey state-of-the-art caching designs can be leveraged regardless of network availability. This
and implementations. We present a taxonomy of architec- will especially hold true for objects1 which are temporal
tures and describe some of their design techniques. in nature (such as isochronous delivery of multimedia data
like video and audio), where consistent bandwidth and re-
1 Introduction sponse times are particularly important.
Though web caching oers much hope for better per-
As the Internet continues to grow in popularity and size, so formance and improved scalability, there remain a number
have the scalability demands on its infrastructure. Expo- of ongoing issues. Perhaps primary among these is object
nential growth without scalability solutions will eventually integrity: if an object is cached, is the user guaranteed that
result in prohibitive network load and unacceptable service the cached copy is up-to-date with the version on the origi-
response times. nating server? Other issues include use of HTTP "cookies"
It is well-known that the primary reason why Inter- to personalize versions of identically named URLs, content
net trac has increased is due to the surge in popularity security, and the ethics of caching. Open issues are not the
of the World Wide Web, specically the high percentage focus of this paper rather, we simply provide a taxonomy
of HTTP trac. To better understand this growth rate, of designs and techniques which are available today.
consider that studies conducted less than ve years ago
revealed that 44% of Internet trac originated from FTP 1.2 The Need for a Survey
requests
5]. However, within the last year, it was shown Web caching is a rapidly evolving eld, which is one of the
that HTTP activity had grown to account for somewhere
between 75-80% of all Internet trac
30]. key factors that motivated us to produce this survey. It is
dicult to keep up with recent advances, since there are a
1.1 The Appeal of Web Caching number of ongoing eorts (both industrial and academic),
many of which contain solutions based on emerging web
The unparalleled growth of the Internet in terms of to- standards (such as persistent connections
22] and XML).
tal bytes transferred among hosts, coupled with the sud- Web caching is also a young industry, in which a number
den dominance of the HTTP protocol suggest much can of commercial vendors pushing new solutions either have
be leveraged through World Wide Web caching (hereafter direct ties to research systems or are architected by individ-
called web caching) technology. Specically, web caching uals with notable research backgrounds. Thus, commercial
becomes an attractive solution because it represents an ef- solutions often contain aggressive and unique architectures.
fective means for reducing bandwidth demands, improving The survey presented here serves to capture a snapshot
web server availability, and reducing network latencies. of current design trends and techniques. Its purpose is not
Deploying caches close to clients can reduce overall back- to compare one solution against another, but instead to
bone trac considerably. Cache hits eliminate the need to identify common designs and put them in context.
contact the originating server. Thus, additional network
communication can be avoided. 1.3 Outline
To improve web server availability, caching systems can This paper is organized as follows. The next section de-
also be deployed in a "reverse" fashion (such as in the re- scribes several distinct web caching architectures. In Sec-
verse proxy server approach, described in section 2), where tion 3, we summarize various cache deployment options
caches are managed on behalf of content providers, who some deployments go hand-in-hand with the caching sys-
want to improve the scalability of their site under existing tem architecture, whereas some architectures allow for a
or anticipated demand. In these cases, caches not only im- variety of deployment options. Section 4 focuses on com-
prove the availability and fault-tolerance of their associated mon design techniques found in many existing architec-
web servers, but also act as load balancers. tures. In Appendix A, we briey discuss the diculty in
Caching can improve user perceptions about network
performance in two ways. First, when serving clients lo- 1 In this paper, web objects are usually the building blocks of a
cally, caches hide wide-area network latencies. On a local web page. A given page may contain several objects: the HTML for
cache miss, the client request will be served by the original the page itself, an embedded picture, a video, a Java applet, etc.
Most Web URLs contain several distinct objects.
identifying metrics of cache eectiveness and specically congure web browsers. Transparent caches work by inter-
address recent work on benchmarking tools. Finally, in cepting HTTP requests (TCP trac destined to port 80)
Appendix B, we elaborate on related networking compo- and redirecting them to web cache servers or cache clus-
nents which augment existing cache deployment strategies ters. This style of caching establishes a point at which
to improve scalability. dierent kinds of administrative control are possible, for
example, deciding how to load balance requests across mul-
2 Caching Architectures tiple caches.
The strength of transparent caching is also its main
2.1 Proxy Caching weakness: it violates the end-to-end argument by not main-
taining constant the end points of the connection. This is
A proxy cache server intercepts HTTP requests from clients, a problem when an application requires that state is main-
and if it nds the requested object in its cache, it returns tained throughout successive requests or during a logical
the object to the user. If the object is not found, the cache request involving multiple objects.
goes to the object's home server - the originating server - The ltering of HTTP requests from all outbound In-
on behalf of the user, gets the object, possibly deposits it ternet trac adds additional latency
12]. For example,
in its cache, and nally returns the object to the user. caches deployed in conjunction with Layer 4 (L4) switches
Proxy caches are usually deployed at the edges of a rely on the fact that these switches intercept all trac
network (i.e., at company or institutional gateway/rewall which is directed at port 80 and send all other trac di-
hosts) so that they can serve a large number of internal rectly to the WAN router.
users. The use of proxy caches typically results in wide- There are two ways to deploy transparent proxy caching:
area bandwidth savings, improved response time, and in- at the switch level and at the router level. Router-based
creased availability of static web-based data and objects. transparent proxy caching (Figure 1b) uses policy-based
Standalone proxy conguration is shown in Figure 1a. routing to direct requests to the appropriate cache(s). For
Notice that one disadvantage to this design is that the example, requests from certain clients can be associated
cache represents a single point of failure in the network. with a particular cache.
When the cache is unavailable, the network also appears In switch-based transparent proxy caching (Figure 1c),
unavailable to users. Also, this approach requires that all the switch acts as a dedicated load balancer. Switches
user web browsers be manually congured to use the appro- are generally less expensive than routers. Switch-based
priate proxy cache. Subsequently, if the server is unavail- transparency is also often viewed as more attractive than
able (due to a long term outage or other administrative router-based transparency because there is not the addi-
reason), all of the users must re-congure their browsers in tional overhead that policy-based routing incurs.
order to use a dierent cache. The use of L4 switches for transparent caching is an
Browser auto-conguration is a recent trend which may example of how related network components play a role in
alleviate this problem. The IETF recently reviewed a pro- the eectiveness of a web caching solution. For transparent
posal for a Web Proxy Autodiscovery Protocol (WPAD) caching, these switches provide a form of local load balanc-
15], a means for locating nearby proxy caches. WPAD ing. There are other switches and solutions for local load
relies on resource discovery mechanisms, including DNS balancing as well as solutions for global load balancing.
records and DHCP, to locate an Automatic Proxy Cong- The use of related network components for this purpose is
uration (APC) le. covered in more detail in Appendix B.
One nal issue related to the standalone approach has
to do with scalability. As demand rises, one cache must 2.2 Adaptive Web Caching
continue to handle all requests. There is no way to dy-
namically add more caches when needed, as is possible with Adaptive web caching
20] views the caching problem as
transparent proxy caching (described below). a way of optimizing global data dissemination. A key
problem adaptive caching targets is the "hot spot" phe-
2.1.1 Reverse Proxy Caching nomenon, where various, short-lived Internet content can,
overnight, become massively popular and in high demand.
An interesting twist to the proxy cache approach is the no- Adaptive caching consists of multiple, distributed caches
tion of reverse proxy caching, in which caches are deployed which dynamically join and leave cache groups (referred to
near the origin of the content, instead of near clients. This as cache meshes) based on content demand. Adaptivity
is an attractive solution for servers or domains which ex- and the self-organizing property of meshes is a response
pect a high number of requests and want to assure a high to those scenarios in which demand for objects gradually
level of quality of service. Reverse proxy caching is also a evolves and those in which demand spikes or is otherwise
useful mechanism when supporting web hosting farms (vir- unpredictably high or low.
tual domains mapped to a single physical site), an increas- Adaptive caching employs the Cache Group Manage-
ingly common service for many Internet service providers ment Protocol (CGMP) and the Content Routing Proto-
(ISPs). col (CRP). CGMP species how meshes are formed and
Note that reverse proxy caching deployment is totally how individual caches join and leave those meshes. In
independent of client-side proxy caching. In fact, they may general, caches are organized into overlapping multicast
co-exist and collectively improve Web performance from groups which use voting and feedback techniques to es-
the user, network, and server perspectives. timate the usefulness of admitting or excluding members
from that group. The ongoing negotiation of mesh forma-
2.1.2 Transparent Caching tion and membership results in a virtual topology.
Transparent proxy caching eliminates one of the big draw- CRP is used to locate cached content from within the
backs of the proxy server approach: the requirement to existing meshes. It can be said that CRP is a more deter-
Figure 1: (a) standalone, (b) router-transparent, and (c) switch-transparent proxy caching
ministic form of hierarchical caching, which takes advan- which are HTTP header elements typically indicating that
tage of the overlapping nature of the meshes as a means a request be personalized. As web servers become more
for propagating object queries between groups, as well as sophisticated and customizable, and as one-to-one market-
propagating popular objects throughout the mesh. This ing e-commerce strategies
23] proliferate the Internet, the
technique relies on multicast communication between cache level of personalization is anticipated to rise.
group members and makes use of URL tables to intel- Active caching uses applets, located in the cache, to cus-
ligently determine to which overlapping meshes requests tomize objects that could otherwise not be cached. When
should be forwarded. a request for personalized content is rst issued, the origi-
One of the key assumptions of the adaptive caching ap- nating server provides the objects and any associated cache
proach is that deployment of cache clusters across adminis- applets. When subsequent requests are made for that same
trative boundaries is not an issue. If the virtual topologies content, the cache applets perform functions locally (at the
are to be the most exible and have the highest chance of cache) that would otherwise (more expensively) be per-
optimizing content access, then administrative boundaries formed at the originating server. Thus, applets enable cus-
must be relaxed so that groups form naturally at proper tomization while retaining the benets of caching.
points in the network.
3 Cache Deployment Options
2.3 Push Caching
As described in
16], the key idea behind push caching is There are three main cache deployment choices: those
to keep cached data close to those clients requesting that which are deployed near the content consumer (consumer-
information. Data is dynamically mirrored as the originat- oriented), near the content provider (provider-oriented),
ing server identies where requests originate. For example, and those which are deployed at strategic points in the
if trac to a west coast based site started to rise because network, based on user access patterns and network topol-
of increasing requests from the east coast (typical of what ogy and conditions. Below we discuss each of these options'
happens on weekdays at 9am EST), the west coast site advantages and disadvantages.
would respond by initiating an east coast based cache. Positioning caches near the client, as is done in proxy
As with adaptive caching, one main assumption of push caching (including transparent proxy caching) has the ad-
caching is the ability to launch caches which may cross vantage of leveraging one or more caches to a user commu-
administrative boundaries. However, push caching is tar- nity. If those users tend to access the same kind of content,
geted mostly at content providers, which will most likely this placement strategy improves response time by being
control the potential sites at which the caches could be de- able to serve requests locally.
ployed. Unlike adaptive caching, it does not attempt to Caches positioned near the content provider, on the
provide a general solution for improving content access for other hand, which is what is done in reverse proxy caching
all types of content, from all providers. and push caching, improve access to a logical set of con-
A recent study
29] regarding the eectiveness of cache- tent. This type of cache deployment can be critical to
hierarchies noted that well-constructed push-based algo- delay-sensitive content such as audio or video. Positioning
rithms can lead to a performance improvement of up to caches near or on behalf of the content provider allows the
1.25%. This study also notes the general dilemma that provider to improve the scalability and availability of con-
push caching encounters: forwarding local copies of ob- tent, but is obviously only useful for that specic provider.
jects incurs costs (storage, transmission) but - in the end Any other content provider must do the same thing.
- performance and scalability are only seen as improved if Of course, there are compromises between provider-
those objects are indeed accessed. oriented and consumer-oriented cache deployments. For
example, resource sharing and security constraints permit-
ting, is may be possible for multiple corporations to to
2.4 Active Caching share the same client-side caches through a common ISP.
The WisWeb project at the University of Wisconsin, Madi- Also, one can envision media hubs such as Broadcast.com
son, is exploring how caching can be applied to dynamic providing content-based caching on behalf of the actual me-
documents
8]. Their motivation is that the increasing dia providers. Finally, the use of both consumer-oriented
amount of personalized content makes caching such infor- and provider-oriented caching techniques is perhaps the
mation dicult and not practical with current proxy de- most powerful and eective approach, since it combines
signs. the advantages of both while lowering the disadvantages of
Indeed, a recent study
6] of a large ISP trace revealed each.
that over 30% of client HTTP requests contained cookies, The last approach, dynamic deployment of caches at
network choke points, is a strategy embraced by the adap- jects among caches also allows load balancing, and per-
tive caching approach. Although it would seem to provide mitting subsequent intercache communication allows the
the most exible type of cache coverage, it still work in overall logical system to eciently resolve requests inter-
progress and, to the best of the author's knowledge, there nally.
have not been any performance studies demonstrating its There are ve well-known protocols
19] for intercache
benets. The dynamic deployment technique also raises communication: ICP, cache digests, CRP, CARP, and WCCP.
important questions about the administrative control of Among these, ICP has the longest history and is the most
these caches what impact would network boundaries have mature. It evolved from the cache communication in Har-
on cache mesh formation? vest and was explored in more detail within Squid. With
Finally, a discussion about cache deployment would not ICP, caches issue queries to other caches to determine the
be complete without noting the capabilities of browsers to best location from which to retrieve the requested object.
do caching on a per-user basis, using the local le system. The ICP model consists of a request-reply paradigm, and
Obviously, while this is perhaps useful for a given user, it is commonly implemented with unicast over UDP.
does not aid in the global reduction of bandwidth or decline Although it was mentioned earlier that multiple dis-
in average network latency for common web objects. tributed caches can improve scalability, they can also im-
pede it, as ICP later revealed. One issue was raised by
2],
4 Design Techniques which identied desirable limits to the depth of the cache
hierarchy. For example, trees deeper than four levels pro-
Caching systems are generally evaluated according to three vided noticible delays. Another scalability concern was the
metrics: speed, scalability, and reliability. There are a number of ICP messages which could be generated as the
variety of design techniques on which many commercial number of cache peers increased. As noted in
11], there
and academic systems rely in order to improve performance is a direct relationship between the number of peer caches
in these respects. and the number of ICP messages sent.
That same study of ICP also raised the issue of mul-
4.1 Hierarchical Caching ticast, which is a key enabling technology of the adaptive
caching design. Its CRP protocol uses multicast to query
Hierarchical caching was pioneered in the Harvest Cache
3], cache meshes. Overlapping nodes are used to forward unre-
9]. The basic idea is to have a series of caches hierarchi- solved queries to other meshes in the topology. To optimize
cally arranged in a tree-like structure and to allow those the path of meshes to query, adaptive caching uses CRP
caches to leverage from each other when an object request to determine the likely direction of origin for the content.
arrived and the receiving cache did not have that object. Although multicast was a proposed solution suggested for
Usually, in hierarchical designs, child caches can query use with ICP when querying peer caches, the scalability
parent caches and children can query each other, but par- implications of this approach are still unclear.
ents never query children. This promotes an architecture Another technique related to cache-to-cache communi-
where information gradually lters down to the leaves of cation is the notion of cache digests, such as those imple-
the hierarchy. In a sense, the adaptive caching approach mented by Squid
11] and the Summary Cache
14]. Digests
also uses cache hierarchies (in the form of cache groups) to can be used used to reduce intercache communication by
diuse information from dynamic hotspots to the outlying summarizing the objects contained in peer caches. Thus,
cache clusters, but these hierarchies are peer-based: the request forwarding can be more intelligent and more e-
parent/child relationships are established per information cient. This is similar to the notion of a URL routing table,
object. Thus, in one case, a cache group might act as a which is used by the adaptive caching approach, also as a
parent for a set of information object X but also act as a way to more intelligently forward a request.
child (or intermediary) node for information object Y. Finally, there are two other proprietary protocols: Cisco's
With hierarchical caches, it has been observed that par- WCCP and Microsoft's CARP method. When used with
ent nodes can become heavily swamped during child query devices such as the Cisco Cache Engine
10], WCCP en-
processing. Commercial caches, such as NetCache, employ ables HTTP trac to be transparently redirected to the
clustering to avoid this swamping eect
12]. Cache Engine from network routers. WCCP has gradu-
ally been integrated into Cisco's Internet Operating Sys-
4.2 Intercache Communication tem (IOS) which Cisco ships with its routers. Recently,
Cisco announced it was licensing WCCP to vendors such
Web caching systems tend to be composed of multiple, dis- as Network Appliance and Inktomi.
tributed caches to improve system scalability, availability, In contrast to ICP-based approaches, where caches can
or to leverage physical locality. In terms of scalability and communicate with each other to locate requested content,
availability, having multiple, distributed caches permits a Microsoft's CARP
25] is deterministic: it uses a hashing
system to deal with a high degree of concurrent client re- scheme to identify which proxy has the requested infor-
quests as well as survive the failure of some caches during mation. When a request comes in from a client, a proxy
normal operation. In terms of physical locality, assuming evaluates the hash of the requested URL with the name of
that bandwidth is constant, simply having caches closer in the proxies it knows about, and the one with the highest
proximity to certain groups of users may be an eective value is realized to be the owner of that content.
way to reduce average network latencies, since there is of- The CARP hash-routing scheme is proposed as a means
ten a correlation between the location of a user and the for avoiding the overhead and scalability issues associated
objects requested. with intercache communication. In that sense, it is very
Regardless of why a logical cache system is composed similar to the CRP protocol used by the adaptive caching
of multiple, distributed caches, it often desirable to al- project, as well as the cache digest approach championed
low these caches to query each other. Distributing ob- by both Summary Cache
6] and Squid2. All of these eorts
attempt to reduce intercache communication. At least ve notable vendors have embraced microkernel
architectures
12, 31, 28, 10, 17] and these systems share a
4.3 Hash-Based Request Routing number of common features. In some, caches are modeled
as nite state machines. They are event driven, allowing
Hash-based routing
24] is used to perform load balancing them to scale better than thread-based approaches, those
in cache clusters. It uses a hash function to map a key, which are commonly used for deployment on general pur-
such as the URL or domain name of originating server, to pose operating systems. These microkernels also optimize
a cache within a cluster. Good hash functions are critical access to disk resources, by increasing the number of le
in partitioning workload evenly among caches or clusters handles for each process as well as creating very fast APIs
of caches. For example, NetCache uses MD5-indexed URL for disk hardware access.
hash routing to access clusters of peer child caches which
do not have overlapping URLs
12]. 4.6 Content Prefetching
Since hashing can be used as the basis for cache selec-
tion during object retrieval, hash-based routing is seen as Prefetching refers to the retrieving data from remote servers
an intercache communication solution. Its use can reduce in anticipation of client requests. Pre-fetching can be clas-
(or eliminate) the need for caches to query each other. For sied as either local-based or server-hint-based
18]. The
example, in the Microsoft CARP method
25], caches never former relies on data such as reference patterns to choose
query each other. Instead, requests are made to caches as what to prefetch next. The latter uses data accumulated
a function of the hashing the URL key. by the server, such as which objects have been frequently
There are also scenarios in which hash-based routing is requested by other clients. One disadvantage of the server-
used only to point the caller in the direction of the content. hint approach is that integration between client and server
This can be the case for very large caching infrastructures, is more complicated, since there needs to be some coor-
such as the type described by the adaptive caching project. dination related to the communication of the hints from
When locating remote content, hash-based routing can be the server to the client. Also, as noted by
8], there are
used as a means to point the local cache in the direction three distinct prefetching "scenarios": prefetching between
of other caches (or cache meshes) which either have the clients and servers, between clients and proxies, and be-
object or can get it (from other caches or the originating tween proxies and servers. Several studies
22, 18, 8] have
server). attempted to quantify the benets of prefetching for caching
for these various scenarios.
4.4 Optimized Disk I/O Kroeger et al.
18] looked at prefetching between proxies
and servers. They found that, without prefetching, proxy
Many systems, especially those which are commercial, have caching with an unlimited cache size resulted in a 26%
spent substantial time tuning their disk I/O, treating the reduction in latency. With a basic prefetching strategy in
object cache as one does a high performance database. Net- place, that reduction in latency could be improved to 41%,
Cache
12] and Novell's ICS
31], for example, either use and with a more sophisticated prefetching strategy (server
APIs provided by their own custom microkernel or use low hints), that number could be further improved to be 57%.
overhead APIs provided by the host operating system. As might be expected, having a good prefetching algorithm
More specic disk I/O optimizations include improving and higher prefetch lead times plays an important role in
the spatial locality of objects and using in-memory data optimizing the prots associated with this technique.
structures to avoid disk I/O altogether, as in
31]. The Padmanabhan and Mogul
22] looked at prefetching be-
former technique takes advantage of logically related con- tween clients and servers, and classied their gains of in
tent in order to determine where on the disk to place that terms of object access times, creating three categories of
content. In-memory data structures (such as hash tables) such gains: zero, small, or large access times. For exam-
can be used to quickly determine if an object has been ple, that study found that, without prefetching, these were
cached if querying these data structures nds that the ob- 20% (zero access time), 0% (small), and 80% (large). With
ject has indeed been cached, disk operations can begin to prefetching, that distribution improved to 42%, 6%, and
actually locate the object (if not already in RAM). Other- 52%. That study also found that, in addition to the sig-
wise, disk access can be avoided altogether and the object nicant improvement in object access time under prefetch-
can be fetched from the originating server. Thus, these ing, there was not an increase in the variability of those
data structures summarize the contents of a cache so that retrievals.
costly I/O operations can be avoided when possible. A study by Cao et. al.
8] explored scenarios in which
proxy caches pushed content out to clients connected through
4.5 Microkernel Operating Systems low-bandwidth links (i.e., modem users). Thus, this ex-
plored the issues of prefetching between clients and proxies.
Microkernel architectures have emerged as a mechanism for That study showed that the average latencies encountered
optimizing cache performance, specically in terms of im- by these types of clients could be reduced by over 23%.
proving resource allocation, task execution, and disk access Their design was based on a proxy that predicted which
and transfer times. The general motivation for microkernel- documents a user might access in the near future and then
based approaches has been that general purpose operating pushed those documents out to the users local cache during
systems, such as UNIX and Windows NT, are not suitable periods of idle network activity. This study acknowledged
for the specialized needs of optimized web caching. These that their results were based on simulations constructed
general purpose operating systems handle resources such from actual modem trace logs, where the average band-
as processes and le handles in a cooperative way, whereas width between clients and proxies was 21K/sec.
caching systems have specic needs about how these re-
sources are managed.
4.7 Cache Consistency Methods practicality of handling dynamic and real-time data, and
Cache consistency is concerned with ensuring that the cached dealing with complex functional objects (such as Java pro-
object does not reect stale or defunct data. For pur- grams). The next few years will likely be exciting ones as
poses of our discussion, "stale or defunct data" refers to web caching researchers and vendors deal with these chal-
locally cached objects which are either (a) no longer are lenges.
equivalent to the object on the originating server (a phe-
nomenon discovered through comparison of object check- 6 Acknowledgements
sums) or (b) have since been obsoleted. As identied by
Dingle and Partl
13], there are four well-known cache We are grateful to Gary Tomlinson, for making his ICS
consistency maintenance techniques to deal with detect- report quickly available to us, and to Peter Danzig, for his
ing such instances: client polling, invalidation callbacks, helpful discussions.
time-to-live, and if-modied-since.
With client polling, caches timestamp each cached ob- A Measuring Performance
ject and periodically compare the cached object timestamp
with that of the original object (at the originating server). Given all the existing caching solutions, how does one de-
Out of date objects are discarded and newer versions are cide which one is the best? Ultimately, the eectiveness of
fetched when necessary. This approach is very similar a web caching system will be largely dependent on the need
to the timestamp-based le system cache consistency ap- and constraints of the deployer. A given architecture might
proach employed by Sun Network File System (NFS)
21]. inherently be more suitable (and thus perform) better for
Instead of having clients periodically check for inconsis- certain scenarios.
tency, invalidation callbacks rely on the originating server For example, corporations may have signicant success
to identify stale objects. The server must then keep track with basic proxy serving, and may not view browser con-
of the proxies that are caching its objects and contact guration as an issue (perhaps the company enforces use
these proxies when objects change. On one hand, callbacks of a particular browser which supports auto-conguration).
improve cache consistency as well as save network band- Other deployers, such as ISPs, may prot more from router
width by not requiring clients to periodically poll servers. or switch-based transparent caching approaches, deployed
On the other hand, there are clear scalability issues and at the edges of their networks. Finally, there are content
privacy/security concerns regarding this approach, since providers who have no control over how their content is
servers need to track caches for each cached object. cached at the client side - they merely want to deliver
Similar to the limited life of packets in transit on com- their content in a scalable manner. Thus, reverse prox-
munication networks, cached objects can also have a time- ying or push caching might be most suitable, given those
to-live (TTL). When expired, these objects are dumped constraints.
and new copies are fetched. The TTL approach does not Regardless of deployment needs, there still is a demand
guarantee that an object which never changes will not be to quantify the performance of various caching systems.
continually refetched, but it does signicantly limit the re- Similar to the standard SPEC benchmarks used to mea-
peated retrieval. Adaptive TTL is a technique whereby sure the performance of multiprocessor architectures and
the TTL of an object is updated within the cache, when a the TPC family of benchmarks used with database sys-
cache hit occurs. tems, the cache developer and user community recently
If-modied-since is a recent variation to TTL-based con- recognized the need for standard benchmarking tools to
sistency (Squid migrated to this approach) which is de- evaluate the performance of cache systems. Cache-specic
mand driven. In this scenario, caches only invalidate ob- performance metrics include amount of bandwidth saved,
jects when they are requested and their expiration date has response time, hit rate, and various scalability metrics.
been reached. In this section, we briey describe two well-known cache
Despite the concerns listed above,
7] reported that in- benchmarks: the Wisconsin Proxy Benchmark and Poly-
validation and adaptive TTL are comparable in terms of graph.
network trac, client response times, and server CPU work-
load. These methods were found to be preferable over the A.1 The Wisconsin Proxy Benchmark
other two approaches, given situations where consistency
is important. Furthermore, it was also found that choosing The Wisconsin Proxy Benchmark (WPB)
1] was one of
to support strong consistency over weak consistency does the rst publicly available cache benchmarking itools. Its
not necessarily result in increased network bandwidth. distinguishing features include support for studying the ef-
fect of adding disk arms and the eect of handling low-
5 Conclusions bandwidth (modem-based) clients. One interesting nding
through use of WPB was that latency advantages due to
Web caching is an important technology which can im- caching were essentially erased when considering the over-
prove content availability, reduce network latencies, and all prot to modem-based clients.
address increasing bandwidth demands. In this paper, we While WPB addresses a number of important bench-
have presented several caching architectures, deployment marking requirements, such as initial support for temporal
options, and specic design techniques. We have shown locality and some ability to generate load on web server
that while there are many approaches to designing and processes, it has some limitations. These include lack of
deploying caches, some issues (such as intercache commu- support for modeling spatial locality, persistent HTTP 1.1
nication) remain common among them. connections, DNS lookups, and realistic URLs. The WPB
There remain a number of open issues in web caching. has been used to benchmark several proxy caches, includ-
Some of the technical issues include content security, the ing some of the research and commercial systems men-
tioned in
1].
A.2 Polygraph
Polygraph
26] is a recently developed, publicly available
cache benchmarking tool developed by NLANR. It can sim-
ulate web clients and servers as well as generate workloads
that try to mimic typical Web access.
Polygraph has a client and a server component, each
of which uses multiple threads to generate or process re-
quests. This allows Polygraph to simulate concurrent re-
quests from multiple clients to multiple servers. Polygraph
can generate dierent types of workload to simulate vari-
ous types of content popularity. For example, requests can
be generated which obey a Zipf-like distribution, which is
largely believed to be a good estimate of real web usage
patterns
4].
More recently, Polygraph has been playing an increas-
ing role in holding open benchmark Web Caching Bake-
o's as a way of inspiring the development community and Figure 2: Request-based Routing via L5 switching
encouraging competition towards good caching solutions.
A summary of their study comparing a number of commer-
cial and academic systems can be found at
27]. ing approaches load-balancing from a more semantic level.
One example is the Arrowpoint Content SmartSwitch (CSS).
B Related Network Components Like a standard L4 switch, the CSS performas load balanc-
ing for both client-based or server-based deployment strate-
The eectiveness of web caching is, in part, aected by the gies. However, CSS operates above the transport layer and
related network components which are deployed in con- balances based on requested content type.
junction with a cache, cache cluster, or web server. We For instance, on the client side, CSS can be congured
now summarize a few types of local and global load bal- to redirect static HTTP requests to a local cache cluster,
ancing solutions which augment existing cache systems. and bypass caching for dynamic HTTP requests. When
balancing load among content servers, CSS can distinguish
B.1 Local Load Balancers among dierent \higher-level" protocols like HTTP, SSH,
and NTTP, and divert them to the appropriate server or
Local load balancing is concerned with improving the scal- group of servers. Furthermore, requests can be redirected
ability, availability, and reliability of web servers or web based on content-based quality-of-service. For example, as
caches. Incoming requests are intercepted and "redirected" Figure 2 shows, administrators can redirect SSH requests
to one member of a group of servers or caches, all of which les to dierent servers. Fulllment of those requests may
exist in a common geographic location. By splitting the per request type (i.e., requests for video les may be slowly
requests among members of a group, one server or cache is lled, whereas text les are handled much quicker, since the
not forced to handle all incoming requests. Thus, such ser- latter is easier to transmit than the former).
vices can scale better and are more robust (no single point Distributed Web serving solutions may combine caching,
of failure). Local load balancing usually involves distribut- L4-, and L5- switching. Take for example the case of a ISP
ing requests according to some load balancing algorithm, that also provides Web hosting services. An L4 switch can
such as round-robin or least-connections. be placed in front each group of replicated servers, each
One way to achieve local load balancing is to use L4 of which serving dierent content. An L5 switch can be
switches in a transparent caching environment. In that placed at the entrance of the ISP so as to divert dierent
case, client outbound requests can be intercepted and redi- content to the appropriate L4-replicated server group.
rected towards members of a cache cluster. Another type of
local load balancing can be achieved by using L5 switches, B.2 Global Load Balancers
which perform a more semantic style of load balancing.
The Alteon ACE-Director and CACHE-Director switches Global load balancers are similar to local load balancers
represent examples of how L4 switching can improve server in that they have the goal of improving performance and
and cache load balancing. Another example switch is the scalability However, instead of distributing requests among
LocalDirector, from Cisco, which is typically deployed to members of a group which are geographically local to one
distribute incoming load upon multiple web servers or re- another, global load balancers usually distribute requests
verse proxy caches. to servers or caches which are near the client, to achieve
Recent alternatives to L4-based switching have been lower average network latencies, as well as to improve scal-
L7-switching and L2 switches which support L4 nd L7 pro- ability.
cessing. The former type provides the same functionality For example, if an organization has web servers de-
as an L4 switch but also includes sophisticated partition- ployed throughout the world, a global load balancer can
ing support and comes with IP lters that normally need determine the appropriate host (based on physical loca-
to be added to L4 switches. Again, Alteon is example of tion) for a given client request and return the IP address
a vendor which makes "advanced L2" switches that also of that server to the client. Since browsers often cache DNS
support L4 and L7 processing. entries (local POPs also do this, etc), repeated requests by
More recently, Layer 5 (L5) switching has emerged as a client to a given logical Internet address will continue to
another load balancing alternative. This type of switch- result in those requests being forwarded to the most local
server. 16] J. Gwertzman and M. Seltzer. The case for geographical push
Cisco's Distributed Director is one example of a global caching. Fifth Annual Workshop on Hot Operating Systems,
load balancer. When a client request is issued, it is initially 1995.
directed towards a primary DNS server. This request even- 17] InfoLibria. Dynacache whitepaper. Available from
tually reaches the local DNS server for that logical site. At www.infolibria.com, 1999.
this point, the local DNS server contacts the Distributed- 18] T.M. Kroeger, D.E. Long, and J.C. Mogul. Exploring the
Director and determines the IP address of the geographi- bounds of web latency reduction from caching and prefetch-
ing. Proceedings of the Symposium on Internet Technologies
cally appropriate server for handling the request. This IP and Systems, 1997.
acdress is then sent back to the client, and will most likely 19] I. Melve. Inter-cache communication protocols. IETF WREC
be cached for futher use. Working Group Draft, 1999.
Radware's Cache Server Director (CSD), is another global 20] S. Michel, K. Nguyen, A. Rosenstein, L. Zhang, S. Floyd, and
load balancing solution, dedicated towards improving cache V. Jacobson. Adaptive web caching: towards a new global
fault tolerance. Consider the case of a content provider caching architecture. Computer Networks & ISDN Systems,
whose replicated servers are located at dierent Internet 1998.
sites. A CSD is placed at each site as a front-end to the 21] Sun Microsystems. NFS: Network File System.
local set of replicated servers. Each CSD runs a local DNS 22] V. Padmanabhan and J.C. Mogul. Improving http latency.
server and exchanges network proximity as well as server Second World Wide Web Conference, 1994.
load information with the other CSDs. A CSD maps a 23] D. Peppers and M. Rogers. The one to one future: Building
client request's host name to one of the ISP site's IP ad- relationships one customer at a time. Doubleday, 1997.
dress. It makes its decision based on where the request 24] K. Ross. Hash routing for collections of shared web caches.
originated and its current network and server load infor- IEEE Network, November/December 1997.
mation (about itself and the other server clusters). 25] K. Ross and V. Valloppillil. Cache array routing protocol v1.1.
IETF WREC Working Group Draft, 1998.
References 26] A. Rousskov. Polygraph. http://polygraph.ircache.net/.
1] J. Almeida and P. Cao. Measuring proxy performance with the 27] A. Rousskov and D. Wessels. First web cache bakeo. Fourth
wisconsin proxy benchmark. Technical Report 1373, Com- International Web Caching Workshop, 1999. Results available
puter Sciences Dept, Univ. of Wisconsin-Madison, 1998. at http://bakeo_ircache_net/bakeo-01.
2] M. Baentsch, L. Baum, G. Molter, S. Rothkugel, and P. Sturm. 28] CacheFlow Technologies. Web caching white paper.
World wide web caching - the application level view of the http://www_cacheow_com.
internet. IEEE Communications Magazine vol. 35 June 1997. 29] R. Tewari, M. Dahlin, H.M. Yin, and J.S. Kay. Beyond hier-
3] C. M. Bowman, P. Danzig, D. R. Hardy, U. Manber, , and archies: Design considerations for distributed caching on the
M. Schwartz. Harvest: A scalable, customizable discovery and internet. Technical Report TR98-04 University of Texas At
access system. Technical Report CU-CS-732-94 University of Austin, 1998.
Colorado Boulder, 1994. 30] K. Thompson, G. Miller, and R. Wilder. Wide-area internet
4] L. Breslau, L. Fan, P. Cao, G. Phillips, and S. Shenker. The trac patterns and characteristics. Proceedings of Third In-
implications of zipfs law for web caching. International Con- ternational Conference on Web Caching, 1998.
ference on Web Caching, 1998. 31] G. Tomlinson, D. Major, and R. Lee. High-capacity internet
5] R. Caceres, P. Danzig, D. Jamin, and D. Mitzel. Characteris- middleware: Internet caching system architectural overview.
tics of Wide-Area TCP/IP Conversations. Proceedings of SIG- Second Workshop on Internet Server Performance, 1999.
COMM, 1991.
6] R. Caceres, F. Douglis, A. Feldmann, and M. Rabinovich. Web
proxy caching: The devil is in the details. Proceedings of the
Workshop on Internet Server Performance, 1998.
7] P. Cao and C. Liu. Maintaining strong cache consistency in
the world wide web. Proceedings of Third International Con-
ference on Web Caching, 1998.
8] P. Cao, J. Zhang, and Kevin Beach. Active cache: Caching dy-
namic contents on the web. Proceedings of IFIP International
Conference on Distributed Systems Platforms and Open Dis-
tributed Processing, 1998.
9] A. Chankhunthod, P. Danzig, C. Neerdaels, M. Schwartz, and
K. Worrell. A hierarchical internet object cache. Proceedings
of the USENIX Technical Conference, 1996.
10] Cisco. Cisco Systems Cache Engine 500 Series Q and A.
http://www.cisco.com/.
11] K. Clay and D. Wessels. ICP and the Squid Web Cache. 1997.
12] P. Danzig. Network Appliance NetCache White Paper: Archi-
tecture and Deployment. http://www.netapp.com.
13] A. Dingle and T. Partl. Web cache coherence. Fifth Interna-
tional World Wide Web Conference, 1996.
14] L. Fan, P. Cao, J. Almeida, and A.Z. Border. Summary
cache: A scalable wide-area web cache sharing protocol.
Computer Sciences Department University of Wisconsin-
Madison, 1998.
15] P. Gauthier, J. Cohen, M. Dunsmuir, and C. Perkins. The web
proxy auto-discovery protocol. http://www.ietf.org/internet-
drafts/draft-ietf-wrec-wpad-00.txt, 1998.

WWW Caching-Trends and Techniques

Hochgeladen von

Dokumentinformationen

Copyright

Verfügbare Formate

Dieses Dokument teilen

Dokument teilen oder einbetten

Freigabeoptionen

Stufen Sie dieses Dokument als nützlich ein?

Sind diese Inhalte unangemessen?

Copyright:

Verfügbare Formate

WWW Caching-Trends and Techniques

Hochgeladen von

Copyright:

Verfügbare Formate

World Wide Web Caching: Trends and Techniques

Greg Barish Katia Obraczka

Abstract content provider. In this case, server-side caches reduce re-

Das könnte Ihnen auch gefallen