
Web Content Distribution

Trends & Techniques

Dr. G P SAJEEV
Asst. Professor

Dept. of Electronics & Communication
Govt Engineering College Kozhikode


gpsajeev@gmail.com

November 29, 2011


Contents
Outline
1 Background
2 Web Content Distribution
    Web Caching
    Content Delivery Networks (CDN)
    P2P Networking
3 Research & Project Topics
4 Summary
5 References


Web Content Distribution


Main ingredients of the Web
  URL, HTML, and HTTP
  HTTP: the protocol and its stateless property

Web Systems Components
  Clients
  Servers
  DNS (Domain Name System)

Interaction with underlying network protocol: TCP

Scalability and performance enhancement
  Server farms
  Web Caching
  Content Distribution Network (CDN)


Main ingredients of the Web


URL
  Denotes the globally unique location of the web resource
  Formatted string, e.g., http://www.princeton.edu/index.html
    Protocol for communicating with server (e.g., http)
    Name of the server (e.g., www.princeton.edu)
    Name of the resource (e.g., index.html)
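
The three parts of the example URL can be pulled apart programmatically. The following is a minimal sketch using Python's standard `urllib.parse`; the helper name `url_parts` and the dictionary keys are illustrative choices, not something the slides define.

```python
from urllib.parse import urlsplit


def url_parts(url: str) -> dict:
    """Split a URL into the three parts named on the slide."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,            # protocol for talking to the server
        "server": parts.hostname,            # name of the server
        "resource": parts.path.lstrip("/"),  # name of the resource
    }


parts = url_parts("http://www.princeton.edu/index.html")
print(parts)
```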

HTML
Actual content of web resource, represented in ASCII

HTTP
Protocol for client/server communication


Main ingredients of the Web


HTML
HyperText Markup Language (HTML)
  Format text, reference images, embed hyperlinks
  Representation of hypertext documents in ASCII format
  Interpreted by Web browsers when rendering a page

Web Page
  Base HTML file
  Referenced objects (e.g., images)
  Each object has its own URL

Straightforward and easy to learn
Simplest HTML document is a plain text file
Automatically generated by authoring programs

Web Access

Client program
  E.g., Web browser
  Running on end host
  Requests service

Server program
  Provides service
  E.g., Web server


Web System Components


Clients
  Send requests and receive responses
  Browsers, spiders, and agents

Servers
  Receive requests and send responses
  Store or generate the responses

DNS (Domain Name System) and the Web
  Distributed network infrastructure
  Transforms site name to IP address
  Directs clients to servers


Web Browser
Functions
Generating HTTP requests
  User types URL, clicks a hyperlink, or selects bookmark
  User clicks reload, or submit on a Web page
  Automatic downloading of embedded images

Layout of response
  Parsing HTML and rendering the Web page
  Invoking helper applications (e.g., Acrobat, PowerPoint)

Maintaining a cache
  Storing recently-viewed objects
  Checking that cached objects are fresh
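
The freshness check can be illustrated with a simple age test. This is only a sketch under an assumed policy: each cached object carries the time it was stored and a maximum lifetime in seconds (similar in spirit to an HTTP `max-age` value). The slides do not say which mechanism the browser uses, and the function name `is_fresh` is hypothetical.

```python
import time
from typing import Optional


def is_fresh(stored_at: float, max_age: float, now: Optional[float] = None) -> bool:
    """Return True while the cached copy is younger than its allowed lifetime."""
    if now is None:
        now = time.time()
    return (now - stored_at) < max_age


# An object stored at t=100 s with a 60 s lifetime (assumed values)
print(is_fresh(100.0, 60.0, now=130.0))  # still within its lifetime
print(is_fresh(100.0, 60.0, now=200.0))  # lifetime exceeded, must be revalidated
```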


Performance & Scalability

Caching & Replication
Content Delivery Networks (CDN)
P2P Networking


Caching
Web Cache
To keep Web services attractive, client-side latency and network congestion must be reduced to a tolerable level. Caching documents at strategic points across the network is one solution, termed Web caching.

Relevance of Cache
  Reduces the cost of connecting to the Internet
  Reduces the latency of today's WWW
  Bandwidth will always have some cost
  Non-uniform bandwidth and latencies
  Bandwidth demands continue to increase


Web Cache

Client-Side Cache

Proxy-Cache
  All requests are routed via the cache server.
  The cache serves the objects that are available in it.
  Cache hit: objects are delivered faster.

Performance Parameters
  Hit Ratio (HR)
  Byte Hit Ratio (BHR)
  Mean Response Time (MRT)
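
The three parameters can be computed from a request log. The sketch below assumes the usual definitions: HR is the fraction of requests served from the cache, BHR is the fraction of requested bytes served from the cache, and MRT is the average response time over all requests. The log format and the sample values are made up for illustration.

```python
def cache_metrics(log):
    """log: iterable of (hit, size_bytes, response_time_s) tuples, one per request."""
    log = list(log)
    total_requests = len(log)
    total_bytes = sum(size for _, size, _ in log)
    hits = sum(1 for hit, _, _ in log if hit)
    hit_bytes = sum(size for hit, size, _ in log if hit)
    return {
        "HR": hits / total_requests,        # hit ratio
        "BHR": hit_bytes / total_bytes,     # byte hit ratio
        "MRT": sum(t for _, _, t in log) / total_requests,  # mean response time
    }


# Hypothetical log: small objects hit, large objects miss
sample_log = [
    (True, 1000, 0.01),
    (False, 9000, 2.0),
    (True, 2000, 0.01),
    (False, 8000, 2.0),
]
metrics = cache_metrics(sample_log)
print(metrics)
```

In this sample half the requests are hits, yet only 15% of the bytes come from the cache, which is why HR and BHR are reported separately.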


Caching Tasks
Replacement
  To decide which object is to be replaced (evicted) from the cache store when the cache storage is full. This is the process of finding the best candidate for eviction.

Admission
  To decide whether an object is to be cached (admitted to the cache store) when it arrives from the origin server.

Consistency Handling
  To decide on an action (invalidate/update/pre-fetch) when an object in the cache store becomes stale.
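
The replacement and admission tasks can be sketched together in a small cache class. The policies here are assumptions chosen for illustration, not ones the slides prescribe: replacement evicts the least recently used (LRU) object, and admission rejects any object above a size threshold. The class and parameter names are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Byte-bounded cache with LRU replacement and size-based admission."""

    def __init__(self, capacity_bytes, max_object_bytes):
        self.capacity = capacity_bytes
        self.max_object = max_object_bytes
        self.store = OrderedDict()  # url -> size, least recently used first
        self.used = 0

    def get(self, url):
        """Return True on a hit and mark the object as recently used."""
        if url in self.store:
            self.store.move_to_end(url)
            return True
        return False

    def admit(self, url, size):
        """Admission decision, followed by replacement if space is needed."""
        if size > self.max_object or size > self.capacity:
            return False  # admission: object refused
        if url in self.store:
            self.used -= self.store.pop(url)
        while self.used + size > self.capacity:
            _, evicted_size = self.store.popitem(last=False)  # evict LRU object
            self.used -= evicted_size
        self.store[url] = size
        self.used += size
        return True


cache = LRUCache(capacity_bytes=100, max_object_bytes=60)
cache.admit("a", 40)
cache.admit("b", 40)
cache.get("a")                      # "a" becomes the most recently used
cache.admit("c", 40)                # evicts "b", the LRU object
rejected = cache.admit("big", 70)   # above the admission threshold
```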


Cache: Example
Assumptions
  average object size = 100,000 bits
  average request rate from the institution's browsers to origin servers = 15/sec
  delay from institutional router to any origin server and back to router = 2 sec

Network

Consequences
  utilization on LAN = 15%
  utilization on access link = 100%
  total delay = Internet delay + access delay + LAN delay = 2 sec + minutes + milliseconds
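
The utilization figures follow from dividing the offered traffic by the link capacity. The link capacities are not stated in the text (they presumably appeared in the network figure), so the 10 Mbps LAN and 1.5 Mbps access link below are assumptions chosen because they reproduce the stated 15% and 100%.

```python
OBJECT_SIZE_BITS = 100_000   # from the assumptions on the slide
REQUEST_RATE = 15            # requests per second, from the slide
LAN_BPS = 10_000_000         # assumed LAN capacity (not stated in the text)
ACCESS_BPS = 1_500_000       # assumed access-link capacity (not stated in the text)


def utilization(request_rate, object_size_bits, capacity_bps):
    """Offered load divided by link capacity."""
    return request_rate * object_size_bits / capacity_bps


lan_util = utilization(REQUEST_RATE, OBJECT_SIZE_BITS, LAN_BPS)
access_util = utilization(REQUEST_RATE, OBJECT_SIZE_BITS, ACCESS_BPS)
print(f"LAN: {lan_util:.0%}, access link: {access_util:.0%}")
```

At 100% utilization the access-link queue grows without bound, which is why the access delay is on the order of minutes.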

Cache: Example
Possible solution
increase bandwidth of access link to, say, 10 Mbps

Consequences
  utilization on LAN = 15%
  utilization on access link = 15%
  total delay = Internet delay + access delay + LAN delay = 2 sec + msecs + msecs
  often a costly upgrade

Network

Cache: Example
Install Cache
  suppose hit rate is 40%

Consequences
  40% of requests will be satisfied almost immediately
  60% of requests satisfied by origin server
  utilization of access link reduced to 60%, resulting in negligible delays (say 10 msec)
  total delay = Internet delay + access delay + LAN delay = 0.6*2 sec + 0.6*0.01 sec + msecs
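
The average delay with a cache can be worked through numerically. In this sketch only the misses (60% of requests) pay the Internet and access-link delays, while every request crosses the LAN. The 1 msec LAN delay is an assumed value, since the slides give it only as "milliseconds".

```python
HIT_RATE = 0.4            # from the slide
INTERNET_DELAY_S = 2.0    # from the slide
ACCESS_DELAY_S = 0.01     # "say 10 msec", from the slide
LAN_DELAY_S = 0.001       # assumed; the slides say only "milliseconds"


def mean_delay(hit_rate, internet_s, access_s, lan_s):
    """Average delay per request when only misses leave the institution."""
    miss_rate = 1.0 - hit_rate
    return miss_rate * (internet_s + access_s) + lan_s


avg_delay = mean_delay(HIT_RATE, INTERNET_DELAY_S, ACCESS_DELAY_S, LAN_DELAY_S)
print(f"average delay = {avg_delay:.3f} sec")
```

Under these assumptions the average is about 1.2 sec, compared with roughly 2 sec after the bandwidth upgrade alone.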

Network with cache

Web Proxy

Web Proxies are Intermediaries
  A server to the client
  A client to the server

Web Proxies as traffic filters


Proxy Caching
The process
Client 1 requests http://www.foo.com/fun.jpg
  Client sends GET fun.jpg to the proxy
  Proxy sends GET fun.jpg to the server
  Server sends response to the proxy
  Proxy stores the response, and forwards to client

Client 2 requests http://www.foo.com/fun.jpg
  Client sends GET fun.jpg to the proxy
  Proxy sends response to the client from the cache

Benefits
  Faster response time to the clients
  Lower load on the Web server
  Reduced bandwidth consumption inside the network
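
The two-client exchange can be mimicked with a dictionary-backed proxy. This is a minimal sketch rather than a real HTTP proxy: `fake_origin` is a hypothetical stand-in for the Web server, and the cache never expires or evicts anything.

```python
class CachingProxy:
    """Forwards misses to the origin and answers repeats from its cache."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin
        self.cache = {}            # url -> response body
        self.origin_requests = 0   # how often the origin server was contacted

    def get(self, url):
        if url in self.cache:
            return self.cache[url]            # cache hit: origin not contacted
        self.origin_requests += 1
        body = self.fetch_from_origin(url)    # cache miss: ask the origin server
        self.cache[url] = body                # store, then forward to the client
        return body


def fake_origin(url):
    """Hypothetical origin server that returns placeholder bytes."""
    return ("contents of " + url).encode()


proxy = CachingProxy(fake_origin)
first = proxy.get("http://www.foo.com/fun.jpg")    # Client 1: miss
second = proxy.get("http://www.foo.com/fun.jpg")   # Client 2: hit
print(proxy.origin_requests)
```

Both clients receive the same bytes, but the origin server is contacted only once.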

Other Functions
Anonymization
Server sees requests coming from the proxy address rather than the individual user IP addresses

Transcoding
Converting data from one form to another. E.g., reducing the size of images for cell-phone browsers

Prefetching
Requesting content before the user asks for it

Filtering
Blocking access to sites, based on URL or content
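
URL-based filtering can be sketched as a host blocklist check. The blocklist entry `blocked.example` and the function name `is_allowed` are hypothetical; content-based filtering, which the slide also mentions, is not covered here.

```python
from urllib.parse import urlsplit

BLOCKED_HOSTS = {"blocked.example"}  # hypothetical blocklist


def is_allowed(url, blocked_hosts=BLOCKED_HOSTS):
    """Allow a request unless its host is a blocked domain or a subdomain of one."""
    host = urlsplit(url).hostname or ""
    return not any(
        host == blocked or host.endswith("." + blocked)
        for blocked in blocked_hosts
    )


print(is_allowed("http://www.foo.com/fun.jpg"))
print(is_allowed("http://ads.blocked.example/banner.gif"))
```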


Content Delivery Networks


Motivation
Providers want to offer content to consumers
  Efficiently
  Reliably
  Securely
  Inexpensively

The server and its link can be overloaded
Peering points between ISPs can be congested
Alternative solution: Content Distribution Networks
  Geographically diverse servers serving content from many sources

CDN


Content Delivery Networks


CDN
Proactively replicate data by caching static pages

Architecture
  Backend servers
  Geographically distributed surrogate servers
  Redirectors (according to network proximity, balancing)
  Clients

Redirector Mechanisms
  Augment DNS to return different server addresses
  Server-based redirection: based on HTTP redirect feature


CDN Architecture
The content providers are the CDN customers.

The CDN company installs hundreds of CDN servers throughout the Internet, in lower-tier ISPs close to users.
The CDN replicates its customers' content in CDN servers.
When a provider updates content, the CDN updates its servers.


CDN Principle
Origin server
  www.foo.com distributes HTML
  Replaces http://www.foo.com/sports/ruth.gif
  with http://www.cdn.com/www.foo.com/sports/ruth.gif

CDN company
  cdn.com distributes GIF files
  Uses its authoritative DNS server to route redirect requests
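
The URL replacement performed by the origin server can be sketched as a small rewrite function. The helper name `to_cdn_url` is hypothetical; the rewrite rule (prefix the original host and path with the CDN host) is read off the example URLs above.

```python
from urllib.parse import urlsplit

CDN_HOST = "www.cdn.com"  # CDN host taken from the example URL


def to_cdn_url(url, cdn_host=CDN_HOST):
    """Rewrite an origin URL so that the object is fetched from the CDN."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{cdn_host}/{parts.netloc}{parts.path}"


rewritten = to_cdn_url("http://www.foo.com/sports/ruth.gif")
print(rewritten)
```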

More about CDN


Routing requests
  CDN creates a map indicating distances between leaf ISPs and CDN nodes
  When a query arrives at the authoritative DNS server:
    server determines the ISP from which the query originates
    uses the map to determine the best CDN server
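
The map lookup can be sketched as picking the CDN node with the smallest distance for the querying ISP. The ISP names, node names, and distance values below are invented for illustration, and "best" is assumed to mean "closest"; a real CDN may also weigh server load.

```python
# Hypothetical distance map: ISP -> {CDN node -> distance}
DISTANCE_MAP = {
    "isp-a": {"cdn-1": 10, "cdn-2": 40},
    "isp-b": {"cdn-1": 35, "cdn-2": 5},
}


def best_cdn_server(client_isp, distance_map=DISTANCE_MAP):
    """Return the CDN node with the smallest distance to the client's ISP."""
    distances = distance_map[client_isp]
    return min(distances, key=distances.get)


print(best_cdn_server("isp-a"))
print(best_cdn_server("isp-b"))
```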

Not just Web pages
  streaming stored audio/video
  streaming real-time audio/video
  CDN nodes create application-layer overlay network


P2P Network
P2P
  no always-on server
  arbitrary end systems directly communicate
  peers are intermittently connected and change IP addresses

Three topics:
  File distribution
  Searching for information
  Case Study: Skype

P2P


P2P and Client-Server model

P2P


File Distribution Time

P2P


Client-Server Vs. P2P

P2P


Bit Torrent

P2P: BT


Bit Torrent
Centralized Directory
  single point of failure
  performance bottleneck
  copyright infringement: target of lawsuit is obvious

File transfer is decentralized, but locating content is highly centralized



Bit Torrent

Query flooding


Bit Torrent

Hierarchical Overlay


Bit Torrent

Skype


Project & Research Areas


Web Cache
  Class based
  Intelligent approaches
  Admission Control
  Cache Placement

Overview

Literature
  Cache models and architecture [Dolgikh2002], [Rodriguez2002], [Starobinski2001]
  Traffic and workload characterization [Bai2004], [Breslau1999], [Dill2002]


Project & Research Areas


Web Cache
  General cache replacement algorithms based on recency, frequency, rank of the object, etc. [Jin2001], [Romano2008], [Psounis2002]
  Caching schemes based on traffic characteristics [Bahn2002], [Kayari2006], [Kumar2009], [He2009]
  Caching scheme based on object properties [Koskela2003]
  Adaptive and intelligent caching [Boonchi2001], [Cobb2008], [Foong1999], [Sulaiman2008]

CDN & P2P


  CDN: Content Classification & Caching [?], [?]
  P2P: Video Delivery, YouTube Traffic Characterization, Data Management [?], [?], [?]

Simulation Tools & Frameworks


Web Cache
  NS2: http://isi.edu/nsnam/ns/
  Web Cache Simulator: http://pages.cs.wisc.edu/cao/
  Data Source: http://ircache.net/

CDN
  CDNSim: http://sourceforge.net/projects/cdnsim/
  Data Source: http://cdn.novell.com/cached/video/bs08/LLK9.iso

P2P
  Peersim: http://peersim.sourceforge.net
  P2PSim: http://pdos.csail.mit.edu/p2psim/
  D-P2P-Sim: http://www.ohloh.net/p/d-p2p-sim
  Data Source: http://nsl.cs.sfu.ca/wiki/index.php/P2PTrac

Concluding Remarks

Web content distribution has a history of more than a decade.
Web caching, CDN, and P2P all co-exist.
Each plays an important role in scalability and performance enhancement.


References I


End

Thanks

