
An Architecture for Distributed Content Delivery Network

A minor thesis submitted in partial fulfilment of the requirements for the degree of Master of Applied Science (Information Technology)

Jaison Paul Mulerikkal School of Computer Science and Information Technology Science, Engineering, and Technology Portfolio, Royal Melbourne Institute of Technology Melbourne, Victoria, Australia July 17, 2007

Declaration
This thesis contains work that has not been submitted previously, in whole or in part, for any other academic award and is solely my original research, except where acknowledged. This work has been carried out since January 2007, under the supervision of Dr. Ibrahim Khalil.

Jaison Paul Mulerikkal School of Computer Science and Information Technology Royal Melbourne Institute of Technology July 17, 2007

Acknowledgment
I would like to thank Dr. Ibrahim Khalil for his continuous support and guidance throughout the course of this minor thesis. It is his constant inspiration and encouragement that helped me to complete this task successfully. I especially thank him for his painstaking efforts in proofreading the drafts of this work. I also thank Dr. Jiankun Hu for his valuable suggestions and contributions towards this project.


Contents
1 Introduction

2 Background
  2.1 CDN Main Concepts
    2.1.1 Surrogate Servers
    2.1.2 DNS Lookup and Redirection
    2.1.3 DNS Load Balancing
    2.1.4 Replication
    2.1.5 Selection of Content
    2.1.6 Cached Delivery
    2.1.7 Outsourcing Content
    2.1.8 Accounting and Billing Mechanism
  2.2 Conventional CDN Architectures
    2.2.1 Commercial (Client-Server) Architecture
    2.2.2 Academic (Peer-to-Peer) Architecture
    2.2.3 Limitations of Existing CDN Architectures
  2.3 Distributed Content Delivery Network - An Effective Alternative

3 Architecture - Distributed Content Delivery Network
  3.1 DCDN Framework
    3.1.1 Distribution of Content - The Process
    3.1.2 Content Delivery to a User
  3.2 DCDN Design Challenges
    3.2.1 Security
    3.2.2 Effective Redirection and Load-balancing Algorithm
    3.2.3 Billing and SLA (Service Level Agreement) Verification Software
  3.3 Business Model
    3.3.1 Network Marketing (NM)/ Multi Level Marketing (MLM)
    3.3.2 Special Scenarios of DCDN Advantage

4 Performance Analysis and Load Balancing Algorithm
  4.1 Performance Parameters and Assumptions
  4.2 Queuing Metrics
  4.3 Queuing Theory Modeling for Different Scenarios
  4.4 Load Balancing Algorithm for DCDN Servers

5 Simulations and Results
  5.1 Goals
  5.2 Assumptions
  5.3 Overview of Simulation Setup
  5.4 Simulation Results
    5.4.1 Page Response Time
    5.4.2 DCDN Surrogate - CPU Utilization vs. CDN Server - CPU Utilization
    5.4.3 DCDN Server - CPU Utilization vs. CDN Load Balancer - CPU Utilization
  5.5 Discussion

6 Conclusion and Future work
  6.1 Future Work

A Softwares Used
B Abbreviations
C Symbols
D Simulation Snapshots

List of Figures
1.1 CDNs and Web Content Distribution
3.1 DCDN Content Distribution Architecture
3.2 DCDN Content Delivery
3.3 DCDN Basic Transition Diagram
3.4 DCDN Transition Diagram - Including Contingency Plans
3.5 Local DCDN Server Zones - Contingency Plan
3.6 DCDN Transition Diagram - Including Security Solutions
3.7 Pyramid Scheme
3.8 MLM Architecture
4.1 Utilization v/s Total System Delay
4.2 Utilization v/s Rejection Rate
5.1 Page Response Time
5.2 DCDN Surrogate (Server) Utilization
5.3 DCDN Server (Load Balancer) Utilization
D.1 Simulation Snapshot - CDN
D.2 Simulation Snapshot - DCDN
D.3 Application Configuration
D.4 Profile Configuration

List of Tables
1.1 Commercial Content Delivery Networks
1.2 Academic Content Delivery Networks
5.1 Simulation Setup

Abstract
Commercial Content Delivery Networks create their own networks of servers around the globe to effectively deliver Web content to end-users. The peering of Content Delivery Networks (CDNs) increases the efficiency of commercial CDNs. Still, the high rental rates resulting from huge infrastructure costs make them inaccessible to medium and low profile clients. Academic models of peer-to-peer CDNs aim to reduce the financial cost of content distribution by forming volunteer groups of servers around the globe, but their efficiency is at the mercy of the volunteer peers, whose commitment is not ensured in their design. We propose a new architecture that will make use of the existing resources of common Internet users in terms of storage space, bandwidth and Internet connectivity to create a Distributed Content Delivery Network (DCDN). The profit pool generated by the infrastructure savings will be shared among the participating nodes (DCDN surrogates), which will function as an incentive for them to support DCDN. Since the system uses the limited computing resources of common Internet users, we also propose a suitable load balancing (LB) algorithm so that DCDN surrogates are not burdened with heavy load and requests are fairly assigned to them. Simulations have been carried out, and the results show that the proposed architecture (with LB) can offer the same or even better performance than that of a commercial CDN.

Chapter 1

Introduction
The growth of the World Wide Web and new modes of Web services have triggered an exponential increase in Web content and Internet traffic [Molina et al., 2004; Vakali and Pallis, 2003; Presti et al., 2005]. Web content consists of static content (e.g. static HTML pages, images, documents, software patches), streaming media (e.g. audio, real-time video) and varying content services (e.g. directory service, e-commerce service, file transfer service) [R. Buyya and Tari, 2001]. As Web content and Internet traffic increase, individual Web servers find it difficult to cater to the needs of end-users. In order to store and serve huge quantities of Web content, Web server farms - clusters of Web servers functioning as a single unit - were introduced [Burns et al., 2001]. Even those Web server farms find it difficult to deal with flash crowds - large numbers of simultaneous requests for a popular content - that are frequently experienced in Web traffic [Pan et al., 2004]. Moreover, those server farms are geographically distant from the end-users in most cases. The non-proximity of the Web servers to the end-users badly affects the response time of Web requests, resulting in undesirable delays [Pan et al., 2004]. Replication of the same Web content around the globe in a net of Web servers is a solution to the above issue. However, it is not financially viable for individual content providers to set up their own server networks. An answer to this challenge is the concept of the Content Delivery Network (CDN), which was initiated in 1998 [Douglis and Kaashoek, 2001; Vakali and Pallis, 2003]. The basic idea is to improve the performance and scalability of content retrieval by geographically distributing a network of Web servers around the globe and allowing several content providers to host their content in those servers.

Figure 1.1: CDNs and Web Content Distribution

It allows a number of content providers to upload their Web content into the same network of Web servers (also called CDN servers) and thereby reduce the cost of content replication and distribution. In a typical CDN environment (Figure 1.1), the replicated Web server clusters are located at the edge of the network to which the end-users are connected. The end-users interact with the CDN, specifying their content-service requests through a cell phone, PDA, laptop, desktop, etc. The requested Web content is fetched from the origin server, and a user is served with the content from a nearby replicated Web server. Thus the users end up communicating with a replicated CDN server close to them and retrieve files from that server. From the very inception of the concept, CDN has gone through a dramatic evolution. There are a number of CDNs available around the globe [Douglis and Kaashoek, 2001; Vakali and Pallis, 2003; Pathan, 2007], and they are collectively called Conventional CDN architectures in this minor thesis. They can be mainly classified into two:

1. Commercial CDNs

2. Academic CDNs

The commercial networks are owned by corporate companies and generally follow a centralized client-server architecture. Some of them have more than 20,000 servers around the globe to support their network. A list of prominent commercial CDN providers is given in Table 1.1 [Pathan, 2007].

Akamai: Founded in 1998 in Massachusetts, USA, Akamai is considered to be the pioneer of the CDN business. It reported a net income of 283.115 million USD in 2005.

Mirror Image Web, Inc: Founded in 1999 in Massachusetts, USA. Besides content distribution, streaming and content access services are provided.

Local Mirror: A U.S.-based privately held corporation, incorporated in 2005, that offers a Content Delivery Network service. It is a provider for static content, audio and video streaming and distribution.

Limelight Networks: Founded in 2001 in Tempe, Arizona, USA, Limelight Networks provides a network for bandwidth-intensive rich media applications over the Web.

Table 1.1: Commercial Content Delivery Networks

The academic CDNs are non-profit in nature and generally follow a peer-to-peer architecture. These peer-to-peer Content Delivery Network models allow content providers to organize themselves together and to operate within their own hosting platforms. Some of the important academic CDN providers are given in Table 1.2 [Pathan, 2007].

Coral: A free peer-to-peer CDN designed to mirror Web content. It uses an architecture very similar to a distributed Web proxy. To access a Website through the Coral cache, one simply adds .nyud.net:8080 to the hostname in the site's URL.

Globule: An open-source CDN developed at Vrije University in Amsterdam. It is introduced as a third-party module for the Apache HTTP server.

FCAN: Flash Crowd Alleviation Network is an adaptive CDN that dynamically optimises between peer-to-peer and client-server architectures to alleviate flash crowds.

Table 1.2: Academic Content Delivery Networks

Conventional CDN architectures - commercial CDNs and academic CDNs - have their own advantages, but their major pitfalls are:

- High rental rates of commercial CDN services, resulting from huge infrastructural cost.

- The efficiency of academic CDNs is at the mercy of the volunteer peers, whose commitment is not ensured in their design.

The huge financial cost involved in setting up a commercial CDN compels the commercial CDN providers to charge high remuneration for their service from their clients (the content providers). Usually this cost is so high that only large firms can afford it. On the other hand, the academic CDNs do not provide a built-in network of independent servers around the globe. That means the risk and responsibility of running the content distribution network ultimately goes back to the content providers themselves. The content providers, who are generally not interested in taking such big risks and responsibility, do not find academic CDNs attractive alternatives to commercial CDNs.

Objectives

The above brief discussion (which will be further explained in 2.3) suggests that there is a need for a more reliable and scalable CDN architecture without fresh infrastructural investment. A unique CDN architecture is required to address these issues. A lot of work has been done in this area aimed at these ends. An academic CDN, Globule, which is envisaged as a Collaborative Content Delivery Network (CCDN) [Pierre and van Steen, 2006a], aims to provide performance and availability through Web servers that cooperate across a wide area network. Coppens et al. [2006] propose the idea of a self-organizing Adaptive Content Distribution Network (ACDN), where they introduce a less centralized replica placement algorithm (COCOA - Cooperative Cost Optimization Algorithm) which pushes the content closer to the clients. Though most of these works seem theoretically sound, they have not challenged the efficiency and reliability of the commercial client-server architecture, for they are purely peer-to-peer architectures that are effective only at the mercy of the participating peers, whose performance is not under the control of the suggested architecture. A successful alternative to commercial CDNs with comparable performance and reliability can be assured only by ensuring proportionate incentives to the participating nodes, which will function as a driving force for those peers to stay alive with minimum service rates.

Involving Web users with a comparatively high-bandwidth Internet connection (broadband or higher) to form a Distributed Content Delivery Network (DCDN) for proportionate remuneration evokes curiosity and challenges. Clusters of DCDN surrogates (participating Web users) will replace the conventional CDN servers in this architecture, pushing the content very near to the end-users. The objectives of this thesis can be summed up as follows:

- Suggest a practical and viable architecture for DCDN and discuss its possible challenges.

- Suggest a load balancing algorithm for DCDN servers based on a queuing theory analysis of a DCDN surrogate.

- Compare the performance of the DCDN architecture against the commercial CDN architecture using simulation techniques.

Contribution

This work aims to propose a new architecture for CDN that will make use of the limited but readily available resources of common Internet users. To achieve this objective, the thesis makes the following contributions.

- Suggests a unique DCDN architecture and proposes a workable business model to successfully implement it in a real-world scenario.

- Suggests an appropriate load balancing algorithm for DCDN Local servers by analyzing the performance of a DCDN surrogate in terms of average system delay and rejection rate.

- Discusses the performance of the DCDN architecture in comparison with a commercial CDN using simulation results.

Organization

The origin of CDN and the need and scope of DCDN are given in Chapter 1. The main concepts and the evolution of conventional CDN architectures in the light of previous work are discussed in Chapter 2. Chapter 3 discusses the proposed DCDN architecture in detail. It is followed by an analysis of the major performance parameters of DCDN surrogates - average system delay and rejection rate - in Chapter 4. In the light of those results, a probable load balancing algorithm for DCDN servers is suggested in the same chapter. Simulations and their results, comparing the DCDN architecture against the commercial CDN architecture, constitute Chapter 5. Finally, the thesis is concluded with a discussion about future work.

Chapter 2

Background
In this chapter we discuss the different entities that constitute the technical backbone of a Content Delivery Network (CDN) in the light of previous works. Further, the conventional architectures of CDN - commercial (client-server) and academic (peer-to-peer) - are evaluated, and the need for a new architecture is discussed.

2.1 CDN Main Concepts

2.1.1 Surrogate Servers

These are the collection of (non-origin) servers that attempt to offload work from origin servers by delivering content on their behalf. Surrogate servers are to be placed all around the globe, according to various needs and business considerations. Since the location of surrogate servers is closely related to the content delivery process, extra emphasis is put on the issue of choosing the best location for each surrogate. Many approaches (e.g. theoretical, heuristic) have been developed to model the surrogate server placement problem [Telematica Institute, 2007].

2.1.2 DNS Lookup and Redirection

The first step taken by a client to retrieve the content for a URL from the Web is to resolve the server name portion of the URL to the IP address of a machine containing the URL content. The client does this resolution with a Domain Name System (DNS) lookup. The resolution causes a DNS request to be sent to a local DNS server. If the local DNS server does not have the address mapping already in its cache, it sends a query to the authoritative DNS server for the given server name. Servers in a CDN are located at different locations on the Web. A primary issue for a CDN is how to direct client requests for an object served by the CDN to a particular server within the network. DNS redirection and URL rewriting are two of the commonly used techniques for directing client requests to a particular server in a distributed network of content servers [Krishnamurthy et al., 2001; Vakali and Pallis, 2003]. In the DNS redirection technique, the authoritative DNS name server is controlled by the CDN. The technique is termed DNS redirection because, when this authoritative DNS server receives the DNS request from the client, it redirects the request by resolving the CDN server name to the IP address of one content server. This resolution is done based on factors such as the availability of resources and network conditions [Molina et al., 2004].
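To make the redirection step concrete, the following Python sketch shows, under stated assumptions, how a CDN-controlled authoritative DNS server might resolve a CDN-hosted hostname to the IP address of one content server using resource availability and network conditions. The server addresses, load values, distances and weighting are invented for illustration; this is not the mechanism of any particular CDN.

    # Minimal sketch of DNS redirection: the CDN-controlled authoritative DNS server
    # maps a CDN-hosted hostname to the "best" content server for a client.
    # All server data and weights below are hypothetical.
    CONTENT_SERVERS = {
        "203.0.113.10": {"load": 0.35, "distance_ms": 12},
        "203.0.113.20": {"load": 0.80, "distance_ms": 8},
        "198.51.100.5": {"load": 0.20, "distance_ms": 40},
    }

    def resolve(hostname, client_ip):
        """Return the IP address of one content server for a CDN-hosted hostname."""
        def cost(item):
            # Combine server load and estimated proximity into a single cost value.
            info = item[1]
            return 0.7 * info["load"] + 0.3 * (info["distance_ms"] / 100.0)
        best_ip, _ = min(CONTENT_SERVERS.items(), key=cost)
        return best_ip

    print(resolve("www.example-cdn.net", "192.0.2.55"))   # prints the selected server IP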

2.1.3 DNS Load Balancing

In order to achieve Web server scalability, DNS redirection has to distribute requests evenly among the surrogate servers in a CDN. When multiple Web servers - the surrogate servers - are present in the server group of a CDN, these servers appear as one Web server to the Web client. So the Web traffic needs to be evenly distributed among these servers. Such a load distribution among these servers is known as load balancing. The load balancing mechanism used for spreading Web requests is known as IP spraying [Dilley et al., 2002]. The equipment used for IP spraying is also called the load dispatcher or the load balancer. In this case, the IP sprayer intercepts each Web request and redirects it to a server in the server cluster of the CDN. Depending on the type of sprayer involved, the architecture can provide scalability, load balancing and fail-over requirements [Dilley et al., 2002].

For example, Akamai's DNS-based load balancing system continuously monitors the state of services and their servers and networks. To monitor the entire system's health end-to-end, Akamai uses agents that simulate end-user behaviour by downloading Web objects and measuring their failure rates and download times [aka, 2007]. Akamai uses this information to monitor overall system performance and to automatically detect and suspend problematic data centres or servers. Each of the content servers frequently reports its load to a monitoring application, which aggregates and publishes load reports to the local DNS server. That DNS server then determines which IP addresses to return when resolving DNS names. If a certain server's load exceeds a pre-defined threshold, the DNS server simultaneously assigns some of that server's allocated content to additional servers. If the server's load exceeds another threshold, the server's IP address is no longer made available to clients. The server can thus shed a fraction of its load when it experiences moderate to high load. The monitoring system in Akamai also transmits data centre load to the top-level DNS resolver to direct traffic away from overloaded data centres. In addition to load balancing, Akamai's monitoring system provides centralized reporting on content service for each customer and content server. This information is useful for network operational and diagnostic purposes [Wikipedia, 2007].
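The two-threshold behaviour described above can be sketched in a few lines. The following Python fragment is an illustration only: the threshold values and load reports are assumptions made for the example and do not describe Akamai's actual implementation.

    # Illustrative two-threshold policy: above SHED_THRESHOLD a server should give up
    # part of its allocated content; above REMOVE_THRESHOLD its IP address is no
    # longer returned in DNS answers. All numbers are invented.
    SHED_THRESHOLD = 0.75
    REMOVE_THRESHOLD = 0.90

    def dns_candidates(load_reports):
        """load_reports: {ip: utilization in [0, 1]} -> IPs the DNS server may return."""
        return [ip for ip, load in load_reports.items() if load < REMOVE_THRESHOLD]

    def needs_shedding(load_reports):
        """IPs whose allocated content should be partly reassigned to other servers."""
        return [ip for ip, load in load_reports.items()
                if SHED_THRESHOLD <= load < REMOVE_THRESHOLD]

    reports = {"198.51.100.1": 0.55, "198.51.100.2": 0.82, "198.51.100.3": 0.95}
    print(dns_candidates(reports))   # ['198.51.100.1', '198.51.100.2']
    print(needs_shedding(reports))   # ['198.51.100.2']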

2.1.4 Replication

Commercial CDNs (e.g. Akamai) replicate content across the globe for large organizations like CNN or Apple that need to deliver large volumes of data in a timely manner. Using replication techniques, one or more copies of a single Web content item (e.g. a streaming media asset) can be maintained on one or more surrogate servers. Context-aware heuristics are proposed by Thomas Buchholz and Linnhoff-Popien [2005] for content replication to increase the monetary value of replicated content, where a replica's profit depends on the number of requests it receives per time interval. The clients discover an optimal replica origin server to communicate with. Here, optimality is a policy-based decision which is based upon proximity or other criteria such as load [Telematica Institute, 2007].

2.1.5 Selection of Content

The choice of content to be delivered to the end-users is important for content selection. Content can be delivered to the customers in full or in part. In full-site content delivery the surrogate servers perform entire replication in order to deliver the total content of a site to the end-users. In contrast, partial content delivery provides only embedded objects, such as Web page images, from the corresponding CDN.

2.1.6 Cached Delivery

A surrogate server may be equipped with a streaming media cache. This enables on-demand content to be dynamically replicated locally, perhaps in an encrypted format. The surrogate may attempt to store all cacheable media files upon first request. When a surrogate receives a client request for on-demand media, it determines whether the content is cacheable. Then it checks whether the requested media already resides in its local cache. If the media is not already in the cache, the surrogate acquires the media file from the source server and simultaneously delivers it to the requesting client. Subsequent requests for the same media clip can be served without repeatedly pulling the clip across the network from the source server [Telematica Institute, 2007].
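The pull-through behaviour of such a cache can be summarised with a short sketch. The Python fragment below is a simplified illustration in which fetch_from_origin stands in for the real transfer from the source server; it does not describe any particular CDN's implementation.

    # Simplified pull-through cache at a surrogate: serve from the local cache when
    # possible, otherwise fetch from the origin once and remember the object.
    cache = {}

    def fetch_from_origin(url):
        return "<media bytes for %s>" % url      # stand-in for the actual download

    def serve(url, cacheable=True):
        if not cacheable:
            return fetch_from_origin(url)        # non-cacheable media is never stored
        if url not in cache:
            cache[url] = fetch_from_origin(url)  # first request: pull and store
        return cache[url]                        # later requests: local copy only

    serve("http://origin.example.com/clip.mp4")  # pulled across the network
    serve("http://origin.example.com/clip.mp4")  # served from the local cache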

2.1.7 Outsourcing Content

Given a set of properly placed surrogate servers in a CDN infrastructure and a chosen content for delivery, it is crucial to decide which content outsourcing practice to follow. There are basically three content outsourcing schemes, enumerated below (a small sketch of the cooperative pull-based scheme follows this list).

1. Cooperative push-based approach: Content is pushed to the surrogate servers from the origin, and each request is directed to the closest surrogate server or otherwise to the origin server [Zhiyong Xu and Bhuyan, 2006].

2. Non-cooperative pull-based approach: Client requests are directed (by DNS redirection) to their closest surrogate servers. If there is a cache miss, surrogate servers pull content from the origin server [Dilley et al., 2002].

3. Cooperative pull-based approach: It differs from the non-cooperative approach in that surrogate servers cooperate with each other to get the requested content in case of a cache miss. Using a distributed index, the surrogate servers find nearby copies of the requested content and store them in the cache [Zhiyong Xu and Bhuyan, 2006].
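The cooperative pull-based scheme can be illustrated with a toy model. In the Python sketch below the distributed index is simply an in-memory dictionary, and the peer and origin fetch functions are hypothetical stand-ins; real systems would use a distributed data structure and network transfers.

    # Toy model of cooperative pull: on a cache miss, consult an index of nearby
    # surrogates that hold the object before falling back to the origin server.
    local_cache = {}
    nearby_index = {"/img/logo.png": ["surrogate-7", "surrogate-12"]}   # hypothetical

    def fetch_from_peer(peer, path):
        return "%s from %s" % (path, peer)   # stand-in for a surrogate-to-surrogate transfer

    def fetch_from_origin(path):
        return "%s from origin" % path       # stand-in for contacting the origin server

    def handle_request(path):
        if path in local_cache:
            return local_cache[path]
        peers = nearby_index.get(path, [])
        content = fetch_from_peer(peers[0], path) if peers else fetch_from_origin(path)
        local_cache[path] = content          # store the copy for subsequent requests
        return content

    print(handle_request("/img/logo.png"))   # "/img/logo.png from surrogate-7"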

2.1.8 Accounting and Billing Mechanism

CDN providers charge their customers according to the content delivered by their surrogate servers to the clients. There are technical and business challenges in pricing CDN services, and the average cost of CDN services is quite high. The most influential factors affecting the price of CDN services include: bandwidth cost, variation of traffic distribution, size of content replicated over surrogate servers, number of surrogate servers, reliability and stability of the whole system, and security issues of outsourcing content delivery [Krishnamurthy et al., 2001]. CDNs support an accounting mechanism that collects and tracks information related to request routing, distribution and delivery. This mechanism gathers information in real time and collects it for each CDN component. This information can be used in CDNs for accounting, billing and maintenance purposes.


2.2 Conventional CDN Architectures

2.2.1 Commercial (Client-Server) Architecture

The classical example is that of Akamai. Akamai offers content delivery services to content providers by offering a worldwide distributed platform to host their content. This is done by installing a worldwide network of more than twenty thousand Akamai surrogate servers [Dilley et al., 2002]. Akamai represents the centralized approach of CDN, where the customers (the content providers) hire their share of space on Akamai servers to support the distribution and easy download of their Web content (Web pages or dynamic streaming content). A typical approach by which Akamai provides this service is as follows:

1. The client's browser requests the default Web page at the content provider's site. The site returns the Web page index.html.

2. The HTML code contains links to some content (e.g. images) hosted on an Akamai-owned server.

3. As the Web browser parses the HTML code, it pulls the content from the Akamai server [Wikipedia, 2007].

Akamai offers its customers a simple tool called Free Flow Launcher that they use to Akamaize their pages [Mahajan, 2004]. The users specify what content they want to be served through Akamai and the tool will go ahead and Akamaize the URLs. This way the customers still have complete control over what gets served through Akamai and what they remain in charge of. The customer is then responsible only for the content he chooses to serve himself and for the first few hits of other content, until the Akamai caches warm up [Reitz, 2000].
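The effect of Akamaizing a page can be illustrated with a toy URL-rewriting step. The Python sketch below uses an invented CDN hostname and path convention purely for illustration; it does not reproduce Free Flow Launcher or Akamai's actual URL scheme.

    # Toy illustration of URL rewriting: embedded object URLs chosen by the customer
    # are rewritten to point at a CDN hostname, while the base page stays at the
    # origin site. The "cdn.example-net.com" prefix is invented.
    def akamaize(html, origin_host, cdn_host="cdn.example-net.com"):
        return html.replace("http://%s/images/" % origin_host,
                            "http://%s/%s/images/" % (cdn_host, origin_host))

    page = '<img src="http://www.provider.com/images/banner.gif">'
    print(akamaize(page, "www.provider.com"))
    # <img src="http://cdn.example-net.com/www.provider.com/images/banner.gif">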

Peering of Commercial CDNs

The commercial CDNs are owned and operated by individual companies. Although there are many commercial CDN providers, they do not cooperate in delivering content to end-users in a scalable manner. In addition, content providers are typically subscribed to one CDN provider and are unable to utilize the services of multiple CDN providers at the same time. Such a closed, non-cooperative model results in the creation of islands of CDNs.


To reduce expense and to ensure better service to the clients, CDN providers need to partner together so that each can supply and receive services in a cooperative and collaborative manner that one CDN cannot provide to content providers otherwise. The objective of a CDN is to satisfy its customers with competitive services. If a particular CDN provider is unable to provide quality service to the end-user requests, it may result in Service Level Agreement (SLA) violation and adverse business impact. In such scenarios, one CDN provider partners with other CDN provider(s), which have caching servers located near the end-users, and serves the users' requests, meeting the Quality of Service (QoS) requirements [Lazar and Terrill, 2001]. This is called peering of CDNs. A Virtual Organization (VO) model for forming Content and Service Delivery Networks (CSDN) and a policy framework within the VO model are suggested for the peering of CDNs by R. Buyya and Tari [2001]. Delivery of content in such an environment will meet QoS requirements of end-users according to the negotiated SLA.

2.2.2 Academic (Peer-to-Peer) Architecture

Distributed computer architectures labelled peer-to-peer are designed for the sharing of computer resources (content, storage, CPU cycles) by direct exchange, rather than requiring the intermediation or support of a centralized server or authority. Peer-to-peer architectures are characterized by their ability to adapt to failures and accommodate transient populations of nodes while maintaining acceptable connectivity and performance [Androutsellis-Theotokis and Spinellis, 2004]. The same technique has been proposed and adopted for creating reliable CDNs for the propagation of Web content. A peer-to-peer (P2P) CDN is a system in which the users get together to forward content so that the load at a server is reduced. In its most basic form, a peer-to-peer content distribution system creates a distributed storage medium that allows for the publishing, searching and retrieval of files by members of its network. So, instead of delegating content delivery to an external company (like Akamai), content providers can organize together to trade their (relatively cheap) local resources against (valuable) remote resources.

A classical example is the academic peer-to-peer CDN Globule, developed by Vrije University in Amsterdam. It is implemented as a third-party module for the Apache HTTP server that allows any given server to replicate its documents to other Globule servers. This improves the site's performance, keeps the site available to its clients even if some servers are down, and to a certain extent helps to resist flash crowds [Pierre and van Steen, 2003]. A user participating in the Globule network is offered a distributed set of servers in which his/her Web content can be replicated. Globule is designed in the form of an add-on module for the Apache Web server. To replicate their content, content providers only need to compile an extra module into their Apache server and edit a simple configuration file. Globule automatically replicates the site's content and redirects clients to a nearby replica. Servers also monitor each other's availability, so that client requests are not redirected to a failing replica [Halderen and Pierre, 2006; Guillaume Pierre, 2006; Pierre and van Steen, 2006b]. S. Sivasubramanian, B. Halderen and G. Pierre rightly observe that a peer-to-peer CDN aims to allow Web content providers to come together and operate their own worldwide hosting platform [S. Sivasubramanian and Pierre, 2004].

2.2.3 Limitations of Existing CDN Architectures

Despite the many advantages of commercial CDNs, they suffer from some major limitations. Commercial CDN providers compete with each other and are forced to set up costly infrastructure around the globe. Since they want to meet the QoS standards agreed with their clients, they are constantly in a process of installing and updating new infrastructure. This process gives rise to the following issues:

1. Network cost: Increase in total network cost in terms of new sets of servers and the corresponding increase in network traffic.

2. Economic cost: Increase in cost per service rate for the distribution of Web content, resulting from the increase in initial investment and running cost of each commercial CDN.

3. Social cost: Content distribution is centralized to a couple of CDN providers, with possible issues of monopolization of revenue in this area.

The huge financial cost involved in setting up a commercial CDN compels the commercial CDN providers to charge high remuneration from their clients (the content providers). Usually this cost is so high that only large firms can afford it. As a result, Web content providers of medium and small size are not in a position to rent the services of commercial CDN providers. Moreover, the revenue from content distribution is monopolized: only large CDN providers with huge infrastructure around the world are destined to amass revenue from this big business. At the same time, the resources in terms of processing power, storage capacity and network availability of a large number of common Internet users, who would support a content delivery network for proportionate remuneration, are ignored.

On the other hand, the academic CDNs are non-profit initiatives in a peer-to-peer fashion. They serve only content providers who own their own network of servers around the globe, or who become part of a voluntary net of servers. The academic CDNs do not provide a built-in network of independent servers around the globe. That means the risk and responsibility of running the content distribution network ultimately goes back to the content providers themselves. The content providers, who are generally not interested in taking such big risks and responsibility, do not find academic CDNs attractive alternatives.

2.3 Distributed Content Delivery Network - An Effective Alternative

The above discussion shows that there is a need for a more reliable, responsible and scalable CDN architecture which can make use of the resources of a large number of general Web users. A unique architecture of a Distributed Content Delivery Network (DCDN) is proposed in this thesis to meet these ends. DCDN aims at involving general Web users with a comparatively high bandwidth of Web connection (broadband or higher) to form a highly distributed content delivery network. Those who become part of the DCDN network are called DCDN surrogates. Clusters of those DCDN surrogates, which are distributed down to the local levels around the globe, will replace the conventional CDN server, pushing the content very near to the end-users. Since the content is pushed down to the local levels, the efficiency of content retrieval in terms of response time is expected to increase considerably. It will also reduce network traffic, since clients can access the content from locally placed surrogates. A local DCDN server, which is mainly a redirector and load balancer, is designed to redirect the client requests to the appropriate DCDN surrogate servers. Since DCDN is aimed at using the available storage space and Web connectivity of existing Web users, it will not demand the installation of fresh new infrastructure. This approach is expected to reduce the economic cost considerably. This acquired new value (profit pool) could be shared between the DCDN surrogates through a proper accounting and billing mechanism and through highly attractive business models. It will serve as an incentive for the DCDN surrogates to share their resources to support the DCDN network.

Chapter 3

Architecture - Distributed Content Delivery Network


In order to provide a highly distributed network of DCDN surrogates, the basic structure of a commercial client-server CDN is adopted together with novel peer-to-peer concepts. Therefore the DCDN architecture is a hybrid architecture which integrates some of the major features of a conventional client-server CDN and an academic peer-to-peer CDN. A single surrogate server in the conventional client-server CDN model is replaced with lightweight DCDN servers (which are basically redirectors) and a number of DCDN surrogates associated with them. The content is distributed among the DCDN surrogates in a peer-to-peer fashion and retrieved at a client's request with the help of DCDN Local servers.

3.1 DCDN Framework

A collection of Local DCDN servers and innumerable DCDN surrogates are networked together to deliver requested Web content to the clients. The main elements of the DCDN architecture - content providers, DCDN servers and DCDN surrogates - are arranged in a hierarchical order as depicted in Figure 3.1.

Content Provider: The entity that requests to distribute its Web content through DCDN.

DCDN Administrators: Rather than a technical entity, this is a managerial/business entity. The entire DCDN network is managed, supported and run by a team of administrators. They do so by controlling and franchising the Master DCDN servers.


Figure 3.1: DCDN Content Distribution Architecture

DCDN Servers: DCDN servers are basically redirectors that only have knowledge about the location of the content; they do not store any content as such. They may function as a buffer system, which helps to push the content provided by the content providers to the DCDN surrogates. They monitor, keep a log of, and regulate the content flow from providers to the surrogates. In the proposed architecture, DCDN servers are of two types: Master and Local.

1. DCDN Master Servers: Master DCDN servers are the first point of contact of a content provider. A global network of Master DCDN servers is set up in such a way that every network region will have at least one Master DCDN server. A network region can be a geographical region (like the Americas, Europe, Asia and Asia Pacific, and Africa) or a network region identified on the basis of a number of other criteria, such as network traffic and network volume. Content providers deal with the administrators through Master DCDN servers and reach terms and conditions with the DCDN administrators for the service provided by DCDN. Master servers monitor, regulate and control the content flow into DCDN servers and surrogates.


2. DCDN Local Servers: They are placed very near to the end-users (virtually, they reside among the end-users). A number of Local servers can come under the service of a single Master server. They have two major functions. Firstly, they decide where to place the content (among the surrogates) and keep a log of it; so, Local DCDN servers have more local and specific knowledge about a particular Web content. Secondly, they find out and return the IP address of the best available surrogate to a client that requests a particular content under the care of DCDN. In doing so, they also function as load balancers that protect the surrogates in the network from being overloaded. These Local DCDN servers are networked together to form a globally distributed, massive DCDN architecture. The distinction between Master and Local servers refers only to the role a given server plays within DCDN; the same server can act both as a Master and as a Local server, if it is assigned to do so.

DCDN Surrogates: As explained before, DCDN surrogates are the large number of Web users who offer resources in terms of storage capacity, bandwidth and processing power to store and make available DCDN Web content. A requested Web content is ultimately fetched from DCDN surrogates.

DCDN Client: The client refers to an end-user who makes a request for a particular Web content using a Web browser. The assumption is that the client uses a standard Web browser, without the use of any special component such as plugins or daemons.

3.1.1 Distribution of Content - The Process

The aim is to place replicas of the content as close as possible to the clients. In this process, the content providers first approach the DCDN administrators. Once the Service Level Agreement is reached, content providers can upload their content to the DCDN net. This can be done either through the Master DCDN servers or through the Local DCDN servers assigned by the Master DCDN servers. If they upload the content through the Master servers, the Master servers push it to the Local servers. The Local servers push replicas to the surrogates in their own region and keep track of these records. The Master servers have more universal knowledge about the content (for example, the network areas in which a particular content is distributed) and the Local servers have more local knowledge of the location of the content (that is, which surrogates actually hold a particular content).

Figure 3.2: DCDN Content Delivery

On request from a Local server, a surrogate may share the replicas with other surrogates in a peer-to-peer fashion. This will offload the Local servers from additional workload. The process makes sure that the Local server still has the knowledge about the replicated content in the new surrogate(s). However, the content providers need not choose to distribute their content in a truly global manner. If they want DCDN support only for some region(s), they can request regional support too. In that case, the administrators (with the help of Master servers) choose only those Local DCDN servers which are set by the parameters given in the QoS (Quality of Service) agreement between the content provider and the DCDN administration. (For example, if the content is to be distributed in the Asia and Asia Pacific region, it is sent to the Local DCDN servers in those regions only.) In order to keep in sync with updates and modifications, or in the event of termination of service to a specific content provider, the Master DCDN server, through the Local DCDN servers, requests the DCDN surrogates to update or delete the content.


DCDN does not expect individual surrogates to host a huge volume of content, for they are only general Web users with low storage capacity. Moreover, they may not be connected to the Web all the time, but they make themselves online for a considerable period of time every day. DCDN relies on the magnitude of storage space and bandwidth expected from the innumerably large number of surrogates participating in the DCDN net and their absolute proximity to the clients.

Partial Replication

Because of the unlikelihood of a specific surrogate being online at the time a specific content is requested, the same content is replicated in a large number of surrogates. It is not suggested that the whole content of a Website should be stored in an individual surrogate. Partial replication of a Website is allowed, because the storage space of surrogates is expected not to be very big. In the case of partial replication, the knowledge about the remaining content is kept in the respective surrogate to facilitate HTTP redirection in case of a query for the rest of the content. The content is updated, deleted or added dynamically in a regular manner, in sync with the Local server updates. The Local DCDN server assesses the demand for a particular content in a particular locality and increases or decreases the number of replicas within its locality according to this assessment. That is, if there is higher demand for a particular content in a particular locality, the number of replicas in that locality is increased, and vice versa. This will allow an efficient content delivery service with optimum use of resources, as sketched below.
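The demand-driven adjustment of the replica count could, for instance, follow a simple rule such as the one sketched below in Python. The thresholds, bounds and step size are illustrative assumptions, not part of the proposed design.

    # Illustrative rule a Local DCDN server might use to scale the number of replicas
    # of one content item inside its locality. All numbers are invented.
    HIGH_DEMAND = 50     # requests per interval above which a replica is added
    LOW_DEMAND = 5       # requests per interval below which a replica is removed
    MIN_REPLICAS, MAX_REPLICAS = 2, 20

    def adjust_replicas(current_replicas, requests_last_interval):
        if requests_last_interval > HIGH_DEMAND:
            return min(current_replicas + 1, MAX_REPLICAS)
        if requests_last_interval < LOW_DEMAND:
            return max(current_replicas - 1, MIN_REPLICAS)
        return current_replicas

    print(adjust_replicas(4, 80))   # 5 : growing demand, add a replica
    print(adjust_replicas(4, 2))    # 3 : falling demand, drop a replica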

3.1.2 Content Delivery to a User

The DCDN Local server, which is envisaged as a redirector, will follow the DNS protocol. It will take care of the queries related to the Websites under the care of DCDN. This information is shared with other DNS servers too. So, when there is a request for a Website under the care of DCDN, the DNS redirectors will redirect it to the nearest available Local DCDN server. The DCDN Local server searches the log of active surrogates holding those files using a suitable technique (e.g. a Distributed Hash Table (DHT) algorithm). It will then make a decision based on the other relevant parameters (availability of full or partial replica content, bandwidth, physical/online nearness, etc.) and will return the IP address of the best suitable surrogate to the client. Now the client fetches the content from the respective surrogate. The participating surrogates will have a client daemon program running on their machines, which will handle the requests from the clients and the parent DCDN server.

Figure 3.3: DCDN Basic Transition Diagram

If the surrogate holds only partial content of the requested Website, it has to get the rest from other surrogates. The surrogate may use HTTP redirection to fetch the content from other surrogates. A diagrammatic representation of the above process is given in Figure 3.2, and the following interactions between different entities of DCDN are identified.

1. Local DCDN Server - DNS Server Interaction: The Local DCDN server updates the DNS server with the list of content providers under DCDN care and requests the DNS server to map corresponding URL requests to the IP address of the Local DCDN server. The DNS server queries the Local DCDN server from time to time to update its library.

2. Client - DNS Server Interaction: The client requests a particular content (Website) under DCDN care. The DNS server directs the request to the Local DCDN server, using the DNS protocol.

3. Client - Local DCDN Server Interaction: The Local DCDN server finds out the best possible surrogate to cater to the request of the client and returns the IP address of that particular surrogate.

4. Local DCDN Server - Surrogate Interaction: There is a constant interaction going on between the Local server and the surrogates. The content from the content providers is stored in the surrogates through the Local DCDN servers. The surrogates inform the Local server of their availability or non-availability as and when they come online or go offline, and the Local servers keep track of it. Local DCDN servers direct the surrogates to add, delete, update or modify the content according to the decisions made from time to time.

5. DCDN Surrogate - Client Interaction: Once the Local DCDN server returns the IP address of the most suitable surrogate, the client contacts that surrogate to fetch the requested content. On request from the client, the surrogate delivers the content to the client.

The transition diagram (Figure 3.3) combines the two major flows of interactions in DCDN, namely content distribution and content delivery. The sequences of interactions have already been discussed in 3.1.1 and 3.1.2.

Figure 3.4: DCDN Transition Diagram - Including Contingency Plans


Figure 3.5: Local DCDN Server Zones - Contingency Plan

Contingency Plans

The special design of DCDN suggests the possibility of a number of surrogates being unavailable at any instance, so it becomes a high priority to assess the availability of surrogates at every moment. DCDN achieves this end by asking the surrogates to notify the Local server as and when they come online or go offline. At the same time, the Local servers issue ping commands at regular intervals to verify the availability of surrogates, in case they fail without notifying the Local server. The sequence diagram is therefore modified as in Figure 3.4.

Another scenario is when specific Web content is not available within a local DCDN network. In order to cope with this scenario, each Local DCDN server classifies the nearby Local DCDN servers into zones in order of network proximity (Figure 3.5). That is, the nearby Local DCDN servers with least-cost accessibility fall in zone 1, and so on. When a specific content is not found in a local DCDN net, the DCDN server first searches for its availability in the nearby zone 1 DCDN servers. If it is found, the request is redirected to that specific Local DCDN server. If it is not found in the lower zones, the search is extended to the higher zones, until the specific content is found.
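The zone-based escalation can be expressed compactly. The Python sketch below is a hypothetical rendering of the search order (zone 1 first, then progressively higher zones); the has_content query is a stand-in for whatever inter-server lookup mechanism is actually used.

    # Hypothetical sketch of zone escalation: ask the nearest zone of Local DCDN
    # servers first and widen the search only if the content is not found there.
    def find_content(content_id, zones, has_content):
        """zones: lists of Local DCDN servers ordered by network proximity.
        has_content(server, content_id): stand-in query returning True or False."""
        for zone_number, zone in enumerate(zones, start=1):
            for server in zone:
                if has_content(server, content_id):
                    return server, zone_number   # redirect the request to this server
        return None, None                        # not available anywhere in the DCDN net

    zones = [["local-A", "local-B"], ["local-C"], ["local-D", "local-E"]]
    holdings = {"local-C": {"video-42"}}
    server, zone = find_content("video-42", zones,
                                lambda s, c: c in holdings.get(s, set()))
    print(server, zone)   # local-C 2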


3.2 DCDN Design Challenges

In spite of all its advantages, the DCDN architecture raises its own unique set of challenges. The major challenges are:

- Security.

- An efficient algorithm for effective load balancing and DNS redirection.

- The development of efficient software for quantifying the service of DCDN servers and peers.

3.2.1 Security

The security requirements for a DCDN service environment will be driven by general security issues such as:

1. Authentication of a content provider (who is recognized by the administrators to use the service of DCDN) while uploading its content to DCDN through Master/Local servers.

2. Authentication of Master and Local DCDN servers when they contact each other (for sharing/updating content information and so on).

3. Authentication of Local servers by the surrogates, to authenticate pushed content.

In addition to the above issues, maintaining the integrity of the content provided by the content provider throughout the DCDN surrogate replicas becomes a crucial criterion in the business success of DCDN. This is because the large number of surrogates suggests a possible vulnerability of the content being manipulated by vicious surrogates or hackers. On the other hand, content providers will be keen to see that their original content is not tampered with within the DCDN network. The DCDN daemon running on the surrogates is supposed to ensure the security of the content stored in it. The DCDN surrogate daemon authenticates the injected content from the Local DCDN server and makes sure that it receives original replicas. Different security measures can be employed to block any attack from hackers, or even from the surrogate owner itself, to access or tamper with the content within the DCDN daemon. One of the solutions is to track down the anomalies when tampered content is delivered to the end-users. If that can be identified, the respective surrogate can be put on alert, corrected or even eliminated from DCDN.

Figure 3.6: DCDN Transition Diagram - Including Security Solutions

This can be achieved by stamping all content injected into the surrogate with a digital stamp such as an md5 digest. The Local server will keep a record of these digital stamps. On each delivery of content, the surrogate daemon calculates the digital stamp of the delivered content and sends it back to the Local server. The Local server compares it with its database and makes sure that there is no anomaly. If an anomaly is found, content manipulation is identified and the Local server takes appropriate action. Verification of the digital stamp for each and every transaction can create a huge volume of traffic between the surrogates and the Local server; in order to moderate this traffic, this security measure can be applied on a random sampling basis, as illustrated in the sketch below. The final transition diagram incorporating the contingency and security issues is shown in Figure 3.6. Further discussion of the security of the DCDN architecture is out of the scope of this minor thesis.
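The digital-stamp check can be made concrete as follows. This Python fragment uses the MD5 digest from the standard hashlib module; the sampling probability and the record-keeping structure are illustrative assumptions, not a fixed part of the design.

    import hashlib
    import random

    # Stamps recorded by the Local server when content is injected into a surrogate.
    recorded_stamps = {}

    def stamp(content):
        return hashlib.md5(content).hexdigest()

    def inject(content_id, content):
        recorded_stamps[content_id] = stamp(content)      # Local server keeps the stamp

    def verify_on_delivery(content_id, delivered, sample_rate=0.1):
        """The surrogate reports the stamp of what it delivered; the Local server checks it.
        Only a random fraction of deliveries is checked, to limit the extra traffic."""
        if random.random() > sample_rate:
            return True                                    # this delivery is not sampled
        return stamp(delivered) == recorded_stamps[content_id]

    inject("page-1", b"<html>original replica</html>")
    print(verify_on_delivery("page-1", b"<html>original replica</html>", sample_rate=1.0))  # True
    print(verify_on_delivery("page-1", b"<html>tampered copy</html>", sample_rate=1.0))     # False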

3.2.2 Effective Redirection and Load-balancing Algorithm

The key to the success of DCDN relies on an effective redirection algorithm. DCDN will have multiple replicas of the same content within a local DCDN setup to ensure the scalability of the system. This replication may increase exponentially as the number of local DCDN networks increases throughout the globe. A combination of DNS and HTTP address redirection, as mentioned in 3.1.2, has to be developed as a possible solution in this regard. The DCDN server has to distribute the load within a local system. It should also take care of the availability or non-availability of peer nodes. If the requested content is not within the local DCDN system, the DCDN server should be able to make the right decision to get it from other local DCDN systems without causing network congestion. Effective load-balancing algorithms have to be developed in this regard. Based on the results of the queuing delay analysis, a basic algorithm for DCDN servers is proposed in the next chapter.
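As a first intuition for such an algorithm, a Local DCDN server could pick, among the online surrogates that hold the requested content, the one with the lowest current load. The Python sketch below only illustrates that idea with invented data structures; the actual algorithm, derived from the queuing analysis, is presented in the next chapter.

    # Illustrative redirection step at a Local DCDN server: among the surrogates that
    # are online and hold the requested content, return the least-loaded one.
    # The surrogate records are invented for the example.
    surrogates = {
        "10.0.0.11": {"online": True,  "content": {"site-A"}, "active_requests": 3},
        "10.0.0.12": {"online": True,  "content": {"site-A"}, "active_requests": 1},
        "10.0.0.13": {"online": False, "content": {"site-A"}, "active_requests": 0},
    }

    def pick_surrogate(content_id):
        candidates = [(info["active_requests"], ip)
                      for ip, info in surrogates.items()
                      if info["online"] and content_id in info["content"]]
        if not candidates:
            return None                   # escalate to other local DCDN systems instead
        return min(candidates)[1]         # IP address returned to the client

    print(pick_surrogate("site-A"))       # 10.0.0.12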

3.2.3 Billing and SLA (Service Level Agreement) Verification Software

DCDN has to provide content providers with accounting and access-related information. This information has to be provided in the form of aggregate or detailed log files. In addition, DCDN should collect accounting information to aid in operation, billing and SLA verification. The DCDN Master servers deal with these content-provider-related issues. At the same time, DCDN has to quantify proper remuneration for surrogates according to their availability, performance, storage space, etc. There is a need for generalized systems or protocols for calculating the contributions of surrogates and local DCDN servers, on the basis of the business model adopted for DCDN.

3.3 Business Model

The success of the DCDN architecture depends upon building up a global DCDN tree consisting of Master/Local DCDN servers and a considerably large number of DCDN surrogates. There should be a strong incentive for individuals to become a part of the DCDN tree. The incentive is the shared monetary benefit from the bonus pot, which is filled with the money saved by not paying the middlemen, that is, the commercial CDNs. According to their share of service - online availability, storage, bandwidth, processing power and other relevant factors - the surrogates are to be offered proportionate remuneration. A possible business model for DCDN could be that of Network Marketing/Multi-level Marketing, which is based on the pyramid scheme.



Figure 3.7: Pyramid Scheme

3.3.1 Network Marketing (NM)/ Multi Level Marketing (MLM)

Wikipedia defines it as follows: Multi-level marketing (MLM) (also called network marketing or NM) is a business model that combines direct marketing with franchising [Wikipedia, b]. In a typical multi-level marketing or network marketing arrangement, individuals associate with a parent company as an independent contractor or franchisee and are compensated based on their sales of products or services, as well as the sales achieved by those they bring into the business. This is like many franchise companies, where royalties are paid from the sales of individual franchise operations to the franchisor as well as to an area or region manager.

MLM is inspired by the mathematical model of the pyramid scheme. If a pyramid were started by a human being at the top with just 10 people beneath him, and 100 beneath them, and 1,000 beneath them, etc., the pyramid would involve everyone on earth in just ten layers of people with a single man on top. The human pyramid would be about 60 feet high and the bottom layer would have more than 4.5 billion people [Skeptic Dictonary, 2007]. Figure 3.7 helps us to see this. The scheme is effectively used by MLM giants such as Amway, Big Planet, Excel Communications, Mary Kay, etc. [Wikipedia, a]. A general business model of the NM/MLM distributor hierarchy (Figure 3.8), which resembles the DCDN hierarchy, shows the scope of adopting the NM/MLM model for the effective creation of the DCDN tree of surrogates. In the DCDN model, the Distributor is replaced with the content provider, the first level is the net of Master DCDN servers, and the second level is that of the Local DCDN servers.
(Ref. for Figure 3.8: http://www.mlmknowhow.com/articles/startup/getpaid.htm)


Figure 3.8: MLM Architecture

In other words, the DCDN administrators will be franchising the concept to the Master/Local server levels. They in turn recruit the final level of the hierarchy - the surrogates - to store the content at local levels. According to the expansion needs, more and more levels could be envisaged in the long run. This can be achieved by adding different layers of Master DCDN servers at different hierarchical levels. Eventually, an active DCDN server develops a hierarchical substructure known as a downline, which looks like an organization chart in a company with a lot of employees. Each DCDN server gets commission/remuneration on the service of the surrogates in its downline. There are also likely to be performance bonuses available for reaching certain service levels. The profit earned from the commission over its surrogates becomes the driving force for the DCDN servers (Master/Local) to maintain their technological infrastructure (both hardware and software) and to add more and more surrogates to their hierarchical structure. This will finally improve the scalability and the efficiency of the DCDN network. With this kind of business model, there are no big capital requirements, no geographical limitations and no special education or skills needed for its participants. Since the revenue collected from the content providers is proportionately shared among the surrogates, it can become a low-overhead, home-based business for the participating surrogates. Network marketing is a people-to-people business, which goes very well with the idea of the near peer-to-peer architecture of DCDN.


3.3.2 Special Scenarios of DCDN Advantage

The new architectural model suggested for DCDN and the corresponding MLM business model will open up whole new possibilities in content distribution. The DCDN architecture is expected to be more effective in distributing static content than dynamic content. The most important beneficiaries of DCDN will be the popular streaming media sharing websites like youtube.com and photo-sharing websites (e.g. Picasa Web Albums, Orkut pictures), which have to support millions of media files uploaded throughout the world and deliver them effectively to end-users in a more distributed manner. The popular music sharing services will also find DCDN an effective and cheaper means of delivering their services to customers.


Chapter 4

Performance Analysis and Load Balancing Algorithm


The performance of the DCDN architecture can be expressed in terms of the total delay in retrieving Web content. Here, the DCDN surrogates are expected to be the bottlenecks, since they are common Internet users with limited resources. So, we can say that the success of the DCDN architecture will depend upon the performance of the DCDN surrogates. This chapter analyzes the performance of a DCDN surrogate using queuing theory techniques. On the basis of this analysis, a load balancing algorithm for the DCDN server is suggested.

4.1 Performance Parameters and Assumptions

The total delay in retrieving a content is the sum of the propagation delay, the processing delay at the DCDN surrogates and the transmission delay. Transmission delay is the time required by a DCDN surrogate to transmit all data packets of the requested content onto the transmission link. In our case, it is inversely proportional to the available upstream bandwidth of the DCDN surrogate. Once the packets are pushed onto the link, they need to be propagated to the client; the time taken for propagation is called the propagation delay. Propagation delay is left out of consideration in analyzing the performance of the DCDN architecture, because DCDN assumes replication of content much nearer to the clients than the conventional architectures do. The processing speed of the surrogates is assumed to be high enough that the processing delay is negligible compared to the transmission delay.


In a nutshell, the efficiency of the DCDN network can be expressed in terms of the transmission delay at a DCDN surrogate. This is termed the total system delay at the surrogate. In order to ensure better performance, a truncated buffer system is suggested at the DCDN surrogates. We also assume a Poisson process of randomly spaced requests in time and an exponential distribution of service times. This results in an M/M/c/k queuing model, where c is the number of servers (or server daemon programs engaged) and k is the total buffer size.

4.2 Queuing Metrics

The total system delay in a DCDN surrogate for different M/M/c/k models and the corresponding rejection rates are found using the following formulas, as explained by D. Gross and C. M. Harris [Gross and Harris, 1998]. We replicate the effect of multiple servers in a single surrogate by running more than one DCDN surrogate daemon (as multiple processes or threads) within the same machine.

Service time ($S$): in our case, it is the transmission time, which is equal to
$$S = \frac{\mathrm{FileSize}}{\mathrm{UpstreamCapacity}}$$
Therefore, the service rate of the DCDN surrogate is $\mu = 1/S$.

Server utilization ($\rho$):
$$\rho = \frac{\lambda}{\mu} \ \text{for the M/M/1/k model}, \qquad \rho = \frac{\lambda}{c\mu} \ \text{for the M/M/c/k model},$$
where $\lambda$ is the arrival rate of requests to the DCDN surrogate.

Effective arrival rate ($\lambda_e$):
$$\lambda_e = \lambda(1 - P_k),$$
where $P_k$ is the probability of $k$ requests in the system.

Probability of zero requests in the system ($P_0$):
$$P_0 = \left[\sum_{n=0}^{c-1}\frac{(\lambda/\mu)^n}{n!} + \frac{(\lambda/\mu)^c}{c!}\cdot\frac{1-\rho^{\,k-c+1}}{1-\rho}\right]^{-1}$$

Probability of $n$ requests in the system for $0 \le n \le c$:
$$P_n = \frac{(\lambda/\mu)^n}{n!}\,P_0$$

Probability of $n$ requests in the system for $c \le n \le k$:
$$P_n = \frac{(\lambda/\mu)^n}{c!\,c^{\,n-c}}\,P_0$$

Average number of requests in the queue ($L_q$):
$$L_q = \sum_{n=c}^{k}(n-c)\,P_n$$

Average number of requests in the system ($L$):
$$L = L_q + \frac{\lambda(1-P_k)}{\mu}$$

Average system delay ($W$): the average system delay is the time a request has to wait from the moment it enters a server queue until it is served by one of the servers available to take a request. For M/M/c/k,
$$W = \frac{L}{\lambda_e}$$

Rejection rate: the number of requests that will be lost due to congestion per unit time is given by $\lambda P_k$. If the mean arrival rate of requests is greater than the service rate of the surrogates, it will choke the surrogates. In order to avoid this scenario, the mean arrival rate of requests ($\lambda$) is to be kept less than the service rate of the surrogates. In other words, the server utilization ($\rho$) is kept below one in all queuing models. At the same time, we have to be cautious about the probability of blocking (loss) of requests. Since we cannot afford the loss of requests beyond a very minimal level, the rejection rate of requests for the different models is to be taken into account in the design of a load balancing algorithm for the DCDN server.
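The metrics above can be computed directly for any choice of λ, μ, c and k. The Python sketch below implements the M/M/c/k formulas exactly as written here; the function and variable names are our own, and the example call merely reuses the surrogate parameters discussed in the next section rather than reproducing any result reported in this thesis.

from math import factorial

def mmck_metrics(lam, mu, c, k):
    """Queuing metrics for an M/M/c/k system (k = total capacity, k >= c >= 1).

    lam -- mean arrival rate of requests (lambda)
    mu  -- service rate of a single server daemon (1 / transmission time)
    """
    a = lam / mu                  # offered load, lambda / mu
    rho = lam / (c * mu)          # server utilization
    # Geometric factor (1 - rho^(k-c+1)) / (1 - rho), with the rho = 1 limit handled.
    geo = (k - c + 1) if rho == 1.0 else (1.0 - rho ** (k - c + 1)) / (1.0 - rho)
    p0 = 1.0 / (sum(a ** n / factorial(n) for n in range(c)) +
                (a ** c / factorial(c)) * geo)

    def pn(n):
        # Probability of n requests in the system.
        if n <= c:
            return (a ** n / factorial(n)) * p0
        return (a ** n / (factorial(c) * c ** (n - c))) * p0

    pk = pn(k)                                    # probability that the buffer is full
    lam_e = lam * (1.0 - pk)                      # effective arrival rate
    lq = sum((n - c) * pn(n) for n in range(c, k + 1))
    big_l = lq + lam_e / mu                       # average number of requests in the system
    w = big_l / lam_e                             # average system delay
    return {"P0": p0, "Pk": pk, "Lq": lq, "L": big_l, "W": w,
            "rejection_rate": lam * pk}

if __name__ == "__main__":
    # Example only: one daemon (c = 1), no extra buffer (k = 1), and mu of roughly
    # 0.53 requests/s, i.e. a 30 KB page pushed over a 128 Kbps upstream link.
    print(mmck_metrics(lam=0.3, mu=0.53, c=1, k=1))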

4.3 Queuing Theory Modeling for Different Scenarios

Different queuing theory models are analyzed for different cases to find the average system delay and the rejection rate described in the previous section. The following assumptions are made in analyzing the queuing parameters:

1. The surrogates are assumed to have a minimum of DSL/Cable Web access.
2. The minimum capacity of a DSL/Cable line is rated at 768 Kbps downstream and 128 Kbps upstream.

Using a Web Page Analyzer¹ it is found that the average size of the Web pages of medium-size content providers (example: www.rajagiritech.ac.in) is about 30 KB. However, the upstream capacity will not be the same for all surrogates in a real-time scenario. We can reasonably assume that there will also be some surrogates with a higher level of connectivity who would be able to give better performance. In order to reflect this scenario, we have also analyzed the queuing delay for a doubled service rate of the DCDN surrogates; that is, the upstream capacity of the surrogates is raised from 128 Kbps to 256 Kbps.

A surrogate originally intending to serve a single request at a time may actually end up serving 2 or 3 in a real-time scenario. This may happen if multiple requests arrive for a particular content that is available only at a single surrogate. In that case, the surrogate has to serve those requests at reduced service rates, i.e., for M/M/1/1 the service rate is μ, for M/M/2/2 it is μ/2, and for M/M/3/3 it is μ/3. The values are found using QtsPlus, the queuing theory analysis software provided by D. Gross and C. M. Harris² [Gross and Harris, 1998]. The results of these analyses are compiled in Figure 4.1 for 128 Kbps and 256 Kbps upstream capacity. The rejection rates of the different queuing models are presented in Figure 4.2.

¹ Available at: http://www.websiteoptimization.com/services/analyze/
² Available at: http://www.geocities.com/qtsplus/

Figure 4.1: Utilization v/s Total System Delay (average delay in a DCDN surrogate, in seconds, against utilization in percentage (access rate/service rate) for the M/M/1/1, M/M/1/2, M/M/1/3, M/M/2/2, M/M/2/3 and M/M/3/3 models, with 128 Kbps and with 256 Kbps upstream capacity)
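As a quick arithmetic check of these assumptions (our own calculation, not a QtsPlus output), the per-request service time and service rate implied by a 30 KB page are

$$S = \frac{30 \times 8\ \mathrm{Kb}}{128\ \mathrm{Kbps}} \approx 1.88\ \mathrm{s}, \qquad \mu = \frac{1}{S} \approx 0.53\ \mathrm{requests/s},$$

and with the doubled 256 Kbps upstream capacity $S \approx 0.94$ s and $\mu \approx 1.07$ requests/s, which is consistent with the observation below that the average delay falls almost in proportion to the increase in upload capacity.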

4.4 Load Balancing Algorithm for DCDN Servers

Figure 4.2: Utilization v/s Rejection Rate (loss of requests per unit time against utilization in percentage (access rate/service rate) for the M/M/1/1, M/M/1/2, M/M/1/3, M/M/2/2, M/M/2/3 and M/M/3/3 models)

Many load-balancing algorithms have been proposed in the past to ensure scalable Web servers [Bryhni et al., 2000; Godfrey et al., 2004; Aweya et al., 2002; Wolf and Yu, 2001; Chen et al., 2005]. The stateless property of the HTTP protocol, by which requests can be routed separately to different servers, is widely used to achieve load sharing in a cluster of Web servers [Bryhni et al., 2000]. The canonical name (CNAME) associated with a Web link can be mapped to the IP addresses of a number of replicated servers that hold the same content. Bryhni et al. [2000] suggest that this mapping can be done in the network to achieve the best performance. The same technique can be adopted for DCDN, but customized for its highly distributed nature. An algorithm for load balancing in a highly heterogeneous and dynamic P2P environment is suggested by Godfrey et al. [2004]. It uses the concept of a virtual server, where a physical node hosts one or more virtual servers; load balancing is done by moving virtual servers from heavily loaded physical nodes to lightly loaded ones. However, it is proposed on the assumption that the load balancer has very little control over where the objects are stored, whereas in the DCDN environment the DCDN server has more control over the content within its surrogates. Moreover, load balancing algorithms in P2P systems generally do not consider the difference in capacity of the peers. In DCDN we cannot discard this difference, as we want to offer a service as efficient as that of a commercial CDN. The formulation of a simple but efficient load balancing algorithm that ensures an almost equal server load among the surrogates, by making use of the information and control residing with the DCDN server, therefore becomes an inevitability.

By carefully analyzing the Utilization v/s Total System Delay graphs (Figure 4.1) and the Utilization v/s Rejection Rate graph (Figure 4.2) in the previous section, the following inferences can be made:

1. The best performance is expected from DCDN when the surrogates follow M/M/c/c queuing models.
2. The reduction in average delay time is almost directly proportional to the increase in upload capacity, in all scenarios.
3. The loss of requests can be reduced by increasing the number of requests allowed in the whole system, i.e., by increasing the value of k in the M/M/c/k queuing model.

Based on the above inferences, we can suggest a load balancing algorithm for the DCDN servers. An algorithm based on the M/M/c/c model is expected to be more scalable and comparatively more efficient than the other models. This choice is made by considering an optimum balance between total system delay and rejection rate. The real-time scenario also suggests that there may be cases where multiple content items have to be served to different clients simultaneously from a single surrogate. In the light of the above discussion, we make the assumption that the surrogates will be designed to support an M/M/c/c queuing model of request streams, where the value of c will be proportional to the processing capacity of the surrogate. The following optimum server load algorithm is proposed to ensure reasonable load sharing between the surrogates.

Load Balancing Algorithm for DCDN Server
1: let the DCDN server have knowledge of the service rate (μ) of each of its surrogates;
2: let the DCDN server be aware of the requests sent to (λ) each of its surrogates;
3: let the DCDN surrogates support only M/M/c/c queuing models;
4: c is the maximum number of requests allowed in a particular surrogate;
5: Web requests arrive at the Local DCDN server;
6: if the requested content is available in the Local DCDN surrogate network then
7:     if there are surrogates with P0 (probability of no requests) equal to 100% (i.e., idle surrogates) then
8:         send the request to the idle surrogate with the highest c value (i.e., to the surrogate with the highest service capacity);
9:     else
10:        while the search does not exceed the Max Trial Number do
11:            find the surrogate with the lowest server utilization (ρ = λ/(cμ));
12:        end while
13:        send the request to that surrogate;
14:    end if
15: else
16:    redirect the request to another Local DCDN server that has the requested content;
17: end if
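A compact Python sketch of this optimum server load algorithm is given below. It is an interpretation of the listing above rather than the thesis implementation: the per-surrogate bookkeeping (active request count, μ and c) and the way the Max Trial Number is applied are assumptions made for illustration.

import random

class Surrogate:
    # Minimal bookkeeping a Local DCDN server is assumed to keep per surrogate.
    def __init__(self, sid, mu, c):
        self.sid = sid      # surrogate identifier
        self.mu = mu        # service rate of one daemon (requests/s)
        self.c = c          # daemons, i.e. max simultaneous requests (M/M/c/c)
        self.active = 0     # requests currently being served

    def utilization(self, lam):
        # rho = lambda / (c * mu) for the M/M/c/c model
        return lam / (self.c * self.mu)

def choose_surrogate(surrogates, arrival_rates, max_trials=5):
    """Pick a surrogate following the optimum server load algorithm.

    surrogates:    list of Surrogate objects in the Local DCDN network that
                   hold the requested content (empty if unavailable locally).
    arrival_rates: dict sid -> observed arrival rate lambda at that surrogate.
    Returns the chosen surrogate, or None to signal redirection to another
    Local DCDN server (step 16 of the listing).
    """
    if not surrogates:
        return None
    # Steps 7-8: prefer an idle surrogate (P0 = 100%) with the highest capacity c.
    idle = [s for s in surrogates if s.active == 0]
    if idle:
        return max(idle, key=lambda s: s.c)
    # Steps 9-13: otherwise examine up to Max Trial Number surrogates and take
    # the one with the lowest server utilization rho.
    trials = random.sample(surrogates, min(max_trials, len(surrogates)))
    return min(trials, key=lambda s: s.utilization(arrival_rates.get(s.sid, 0.0)))

if __name__ == "__main__":
    pool = [Surrogate("s1", mu=0.53, c=1), Surrogate("s2", mu=1.07, c=2)]
    pool[0].active = 1                     # s1 is busy, s2 is idle
    chosen = choose_surrogate(pool, {"s1": 0.4, "s2": 0.2})
    print(chosen.sid if chosen else "redirect to another Local DCDN server")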

We expect that this algorithm will distribute the workload reasonably well between the surrogates. However, this can only be validated by conducting extensive simulations which reproduce the highly distributed DCDN environment. The next chapter presents those simulations and their results.


Chapter 5

Simulations and Results


In the previous chapter, the two major metrics of interest, namely queuing delay and rejection rate at the DCDN surrogates, were discussed. The results were compiled in the form of graphs. On the basis of those results, a probable load balancing algorithm for DCDN servers was suggested. Various scenarios are created using a simulation tool to replicate the DCDN as well as the commercial client-server CDN architecture. Simulations are conducted to compare the performance of the DCDN architecture with the client-server CDN architecture using the optimum server load balancing algorithm. The simulations are conducted using the Opnet IT Guru network simulator. The main reason to use Opnet IT Guru is its user-friendliness in picking the predefined models and objects using drag-and-drop functionality. The Opnet predefined models and objects are validated and hence require no further validation. The devices, links and nodes in Opnet IT Guru use reasonable assumptions and enable us to carry out a sound data analysis. This chapter presents the goals, assumptions and setup of the simulations. The performance comparison between the different scenarios of the DCDN and commercial CDN architectures using the optimum server load balancing algorithm is then discussed.

5.1 Goals

The objective of the simulations is to evaluate the feasibility of the DCDN architecture, that is, to check the performance of the DCDN architecture in comparison with that of the commercial client-server CDN architecture in terms of page response time, utilization of the DCDN server (the load balancer, in the case of a conventional CDN) and utilization of the DCDN surrogate (the CDN server, in the case of a conventional CDN). The simulation scenarios were designed to achieve the following goals:

- The simulations should be able to provide some reasonable data to show that the DCDN architecture will be able to give better, or at least the same, performance as the commercial client-server CDN architecture.
- The technologies and the protocols used in the simulation environment should reproduce the standard protocols used in the industry.
- The simulation should allow the addition, deletion and modification of the clients, DCDN servers (load balancers) and the surrogates (servers) for easy comparison of the different parameters used for the evaluation.

5.2 Assumptions

The simulations are designed to simulate a commercial client-server CDN environment in the first place and then to simulate the DCDN setup. The following assumptions are made to create those environments:

- The DCDN server lies within the IP cloud, unlike in the case of a commercial CDN (where it is the load balancer of the CDN server farm). The use of an additional IP cloud between the DCDN server and the DCDN surrogates is assumed to represent this environment (Figure D.2).
- The simulations are conducted on a standard PC and the results are expected to be only suggestive. However, we assume that similar scenarios of the commercial CDN and DCDN setups are comparable, since both are conducted in similar environments.
- The data obtained from the simulations can be scaled with an appropriate value so as to have a reasonable approximation of the parameters assessed.

5.3 Overview of Simulation Setup

The experiment was conducted by choosing a standard commercial CDN setup serving 150 clients. The performance of the setup was measured in terms of page response time, load balancer utilization and server utilization. The environment was then reset to represent the DCDN architecture and the above performance parameters were measured again. The experiment was repeated, altering the critical parameters of the simulation environment, until the DCDN setup could replicate the performance of the commercial CDN setup.

Table 5.1: Simulation Setup

                                    Commercial CDN   DCDN: Scenario 1   DCDN: Scenario 2   DCDN: Scenario 3
Number of Clients                              150                150                 75                 30
Number of Surrogates (or Servers)                3                  6                  6                  6
Link Capacity (Mbps)                           100                 10                 10                 10
Load Balancing Algorithm               round robin        server load        server load        server load

The critical parameters that defined the different simulation scenarios were:

Number of Clients: The clients were all HTTP clients with requests of medium file size. The number of requests in the system is directly proportional to the number of clients.

Number of Surrogates (or Servers): The Ethernet servers in the commercial CDN setup were replaced with a larger number of Ethernet workstations acting as surrogates.

Link Capacity: The link capacity of the DCDN surrogates is kept significantly lower than that of the commercial CDN setup to reflect the DCDN architecture in the simulation.

The four scenarios created by altering the above parameters are given below:

Commercial CDN: This is the standard CDN setup with 150 medium HTTP clients. They were served by a server farm of 3 CDN servers. The link capacity of the CDN servers was set to 100 Mbps to reflect the fact that a commercial CDN can afford more resources. The round robin algorithm was used for load balancing.

DCDN - Scenario 1: The scenario is changed to a DCDN setup serving the same number (150) of medium HTTP clients. The three CDN servers in the previous setup were replaced with six DCDN surrogates (workstations). The DCDN surrogate link capacities were reduced to 10 Mbps to simulate the fact that they will have lower link capacity than the commercial CDN servers. The optimum server load algorithm is used for load balancing.

DCDN - Scenario 2: In this DCDN setup, the number of clients was reduced to 75, keeping all other parameters the same as in the previous setup. The optimum server load algorithm is used for load balancing.

DCDN - Scenario 3: The number of clients was further reduced to 30, keeping all other parameters intact in this DCDN setup. The optimum server load algorithm is used for load balancing.
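For convenience, the setup of Table 5.1 can also be restated as a small configuration sketch. The dictionary below is simply a transcription of the table, and the derived clients-per-surrogate figure is our own bookkeeping rather than a value reported by the simulator.

# Transcription of Table 5.1; values are taken directly from the simulation setup.
scenarios = {
    "Commercial CDN":  {"clients": 150, "servers": 3, "link_mbps": 100, "algorithm": "round robin"},
    "DCDN Scenario 1": {"clients": 150, "servers": 6, "link_mbps": 10,  "algorithm": "server load"},
    "DCDN Scenario 2": {"clients": 75,  "servers": 6, "link_mbps": 10,  "algorithm": "server load"},
    "DCDN Scenario 3": {"clients": 30,  "servers": 6, "link_mbps": 10,  "algorithm": "server load"},
}

for name, cfg in scenarios.items():
    # A rough measure of the request load each server or surrogate has to carry.
    print(f"{name}: {cfg['clients'] / cfg['servers']:.1f} clients per server/surrogate")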

5.4 Simulation Results

The simulations were conducted using the setups described in the previous section. The number of clients, the number of surrogates (or servers) and the link capacity of the surrogates used for the simulations are given in Table 5.1. The different numbers of clients in the different cases produced different volumes of requests that were handled by the DCDN surrogates (or servers, in the case of the commercial CDN). The simulations were run long enough to achieve a steady system state. The page response time, surrogate (or server) utilization and load balancer utilization were recorded for each case. The simulation results and their implications are explained in the subsequent sections.

5.4.1 Page Response Time

Page response time is the interval between the instant at which an end-user at a terminal enters a request for a Website and the instant at which the Web page is received at the terminal. This parameter is very critical in our comparison of DCDN with the commercial CDN, for it is the most visible experience of the end-user regarding the performance of a CDN. The average page response times obtained during the simulations are compiled in Figure 5.1.

Figure 5.1: Page Response Time (average page response time in seconds against simulation time for the commercial CDN and DCDN scenarios 1-3)

The commercial CDN architecture produces an excellent result for 150 clients using a server farm of 3 servers connected through a hub. It confirms that a commercial CDN provides better service for the end-users.

DCDN scenario 1, where the same number of clients were allowed to fetch content from six surrogate workstations (double the number of servers in the previous case), produced an average page response time of around 15 seconds, compared to the less-than-2-second average page response time of the similar commercial CDN setup. It shows that a DCDN with a setup similar to that of the commercial CDN is inefficient. Though the result seems discouraging, it was perfectly in line with our early assumptions: since the powerful servers of the commercial CDN were replaced with workstations and the link capacity of the surrogates was reduced to 1/10 of that of the CDN setup, the efficiency of the system was bound to be reduced considerably. The aim of the experiment was to find out whether DCDN could replicate the performance of the commercial CDN in any scenario. Because of the limitations of the Academic Version of Opnet IT Guru, we could not increase the number of surrogates. So, we reduced the number of clients to half, namely 75, in DCDN scenario 2. By decreasing the number of clients we decrease the volume of requests. The result was promising: there was a considerable improvement in the page response time, which was reduced to nearly half, but it was still above the commercial CDN performance. The number of clients was further reduced to 30 in DCDN scenario 3. The graph shows that DCDN scenario 3 gives a curve as good as that of the commercial CDN. This information is very vital to our discussions.


Figure 5.2: DCDN Surrogate (Server) Utilization (server/surrogate CPU utilization in percent against simulation time for the commercial CDN and DCDN scenarios 1-3)

5.4.2 DCDN Surrogate - CPU Utilization vs. CDN Server - CPU Utilization

CPU utilization is the measure of how much of the available processing power of a machine is used. In our case, it gives us an idea of the amount of resources a single surrogate needs to contribute in terms of processing power. Though we cannot specify the exact configurations of the commercial CDN servers and the DCDN surrogate workstations in the Academic Version of Opnet IT Guru, the simulations do give us comparable outcomes. The results are compiled in Figure 5.2. The commercial CDN server CPU utilization is very low (less than one percent), as expected. The DCDN surrogates in scenarios 1 and 2 use nearly double the processing power of the commercial CDN servers. Scenario 3 of the DCDN setup gives a surrogate CPU utilization of less than one percent.

5.4.3 DCDN Server - CPU Utilization vs. CDN Load Balancer - CPU Utilization

The CPU utilization of the DCDN server, which is basically a load balancer and redirector, is critical in the design of the DCDN architecture, for it has to be run by the DCDN administrators or their franchisees. It gives us an idea of how powerful those machines should be. The results shown in Figure 5.3 suggest that DCDN scenario 3 produces outcomes similar to those of the commercial CDN setup. However, the CPU utilization in all cases is negligibly small.

Figure 5.3: DCDN Server (Load Balancer) Utilization (load balancer CPU utilization in percent against simulation time for the commercial CDN and DCDN scenarios 1-3)

5.5 Discussion

The whole exercise was to find out the feasibility of the DCDN architecture in comparison with the commercial CDN architecture. The average page response time was the key parameter for our evaluation. DCDN scenarios 1 and 2 could not reproduce the performance of the commercial CDN, but left enough indications that the DCDN setup could do so if the request load were further reduced. In DCDN scenario 3, the request load was further reduced by reducing the number of clients to 30. The results show that this scenario could produce the same performance as the commercial CDN, which handled 150 clients. This gives us a valuable inference: the DCDN architecture can replicate the performance of a commercial CDN if the critical parameters are adjusted well enough.

When we scale the simulation results, as we assumed in the beginning, we can say that to replicate the performance of a commercial CDN using the DCDN architecture, a server in the commercial CDN setup has to be replaced with 2 x 5 = 10 surrogates in the DCDN setup (since the number of surrogate servers was doubled and the number of clients was reduced to 1/5), when the link capacity of a DCDN surrogate is reduced to 1/10 of that of the commercial CDN. Further research and experiments have to be done to verify this exact ratio. However, the above results suggest that the ratio between the number of commercial CDN servers and DCDN surrogates will fall within a reasonable limit. It also suggests that the DCDN architecture could fall well within financial viability limits, for the number of surrogates needed to support the system will not be that high.

Surrogate CPU utilization in the DCDN setup also shows that it will not take a big chunk of the processing capacity of the surrogate workstations. This is important because the participating Internet users in the DCDN architecture may not have to sacrifice a huge chunk of the processing power of their machines. It will allow them to use their machines for all their regular work, while the DCDN surrogate daemon runs behind the scenes. The DCDN server CPU utilization is as negligible as that of the load balancer in the commercial CDN setup. It suggests that running Local as well as Master DCDN servers within the Internet cloud will not be a bottleneck.

In general, the results show that by using an efficient equal server load algorithm we could run an effective DCDN for fast content retrieval that will not choke the surrogates, provided we keep a proper balance between the volume of requests and the number of surrogates available to handle it.


Chapter 6

Conclusion and Future work


The thesis was an attempt to address some of the major disadvantages of existing architectures for CDNs. The commercial client-server architecture is expensive in terms of its network as well as its financial costs. Medium and small scale content providers find it difficult to access these services because of the resulting higher financial liability. At the same time, in order to meet the QoS agreement criteria with the clients, commercial CDNs tend to expand their network infrastructure constantly, even while the surplus resources of common Web users remain unexplored. This poses an unsustainable model of development. The academic architectures try to provide effective content delivery in a peer-to-peer fashion, but the efficiency of the system depends upon the reliability and commitment of the peers, and there are no effective measures built into the design of academic CDNs to address this issue.

The thesis proposes a new architecture which will empower common Web users to create a global net of CDN, helping content providers to distribute their Web content in a more local and distributed manner. The proposed architecture of the Distributed Content Delivery Network (DCDN) offers proportionate remuneration to the participating nodes for the service they provide in terms of processing power, bandwidth and Web connectivity. This remuneration works as an incentive to the surrogates to maintain and support DCDN and ensures a better service to the content providers and end-users. The minor thesis describes the DCDN procedures for content delivery and explores the major challenges of this unique architecture. Based on queuing delay analysis, the thesis assesses the performance of DCDN surrogates. The thesis then suggests a possible load balancing algorithm for DCDN servers. It also proposes a business model for the widespread implementation of DCDN. The simulation results show that the DCDN architecture can produce the same level of performance as that of a commercial CDN if we can keep a proper balance between the expected volume of requests and the number of DCDN surrogates. The success of DCDN would rely heavily on the magnitude in which it grows among Internet users. The business model suggested in the thesis could address this issue. In this architecture, Internet users will take their share of responsibility in maintaining DCDN and will inherit their share of revenue.

6.1 Future Work

This work suggests only an architectural prototype of DCDN. The majority of the design and implementation issues are yet to be explored and worked out. The content distribution and delivery processes dealt with in this minor thesis cover only the most common scenarios. Special cases, like contingency plans in the wake of the non-availability of a particular content in a Local DCDN surrogate tree and the delivery of partially replicated content, are yet to be dealt with in detail. The development of a fully fledged algorithm for the DCDN server, considering different traffic distributions such as heavy-tailed and Pareto distributions, is also an important task before us. The protocol development to incorporate DCDN servers into the DNS server net would be another crucial milestone in the development of the DCDN architecture. Protecting the integrity and security of the content being distributed through DCDN is of utmost importance, since the content is heavily decentralized and vulnerable to attacks from hackers and malicious surrogates. Unique and effective protocols and security measures are to be developed to ensure the integrity of content in the DCDN architecture. The accounting and billing mechanism, which will measure the service of the DCDN surrogates in terms of processing power, bandwidth, Web connectivity and availability in order to award proportionate remuneration to the surrogates, is another important research area. Furthermore, the development of client-server program modules for the surrogates as well as for the DCDN servers will constitute an essential part of the real-time success of the DCDN business model.


Appendix A

Software Used

1. OPNET IT Guru Network Simulator - Academic Edition. Opnet IT Guru Academic Edition is free software from Opnet Technologies. It is a complete network simulation package with a wide range of network devices from different vendors and across different platforms. However, the academic edition has limited functionality compared to the purchased software, called the Opnet Modeler. Further information on Opnet IT Guru and other Opnet products can be obtained from their Website: http://opnet.com/services/

2. QtsPlus Queuing Theory Software. QtsPlus is queuing software that is distributed with Fundamentals of Queueing Theory, 3rd Edition, by Gross and Harris. This software allows easy analysis of metrics like server utilization, queuing delay, service delay and server idle time after the required inputs are provided. The software and information about it can be obtained from the following link: http://www.geocities.com/qtsplus/

Appendix B

Abbreviations
CDN    Content Delivery Network
CPU    Central Processing Unit
DNS    Domain Name System
DSL    Digital Subscriber Line
HTTP   Hyper Text Transfer Protocol
IP     Internet Protocol
ISP    Internet Service Provider
MLM    Multi Level Marketing
NM     Network Marketing
P2P    Peer-to-Peer
QoS    Quality of Service
SLA    Service Level Agreement

Appendix C

Symbols
c      Number of servers
k      Number of requests in the system (including queue)
λ      Mean arrival rate
λe     Effective arrival rate
Lq     Average number of requests in the queue
M      Exponential
μ      Mean service rate
P0     Probability of zero requests in the system
ρ      Server utilization
W      Average system delay

Appendix D

Simulation Snapshots
The simulation snapshots showing the commercial CDN setup and the DCDN setup - Figure D.1 and Figure D.2 - are given on the next pages. The application configuration (Figure D.3) and profile configuration (Figure D.4) are given on the following page.

Figure D.1: Simulation Snapshot - CDN


Figure D.2: Simulation Snapshot - DCDN

Figure D.3: Application Configuration

Figure D.4: Profile Configuration

Bibliography
Akamai technologies, May 2007. URL www.akamai.com.
S. Androutsellis-Theotokis and D. Spinellis. A survey of peer-to-peer content distribution technologies. ACM Computing Surveys, 36(4), December 2004.
J. Aweya, M. Ouellette, D. Y. Montuno, B. Doray, and K. Felske. An adaptive load balancing scheme for web servers. International Journal of Network Management, 12(1):3-39, 2002. URL http://dx.doi.org/10.1002/nem.421.
H. Bryhni, E. Klovning, and O. Kure. A comparison of load balancing techniques for scalable web servers. IEEE Network, 14(4):58-64, July-August 2000. URL http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=855480.
R. Burns, R. Rees, and D. Long. Efficient data distribution in a web server farm. IEEE Internet Computing, 5(5):56-65, September-October 2001.
J. Chen, B. Wu, M. Delap, B. Knutsson, H. Lu, and C. Amza. Locality aware dynamic load management for massively multiplayer games. ACM Press, New York, NY, USA, 2005.
J. Coppens, T. Wauters, F. D. Turck, B. Dhoedt, and P. Demeester. Design and performance of a self-organizing adaptive content distribution network. IEEE/IFIP Network Operations and Management Symposium 2006, Vancouver, Canada, April 2006.
J. Dilley, B. Maggs, J. Parlkh, H. Prokop, R. Sitaraman, and B. Welhl. Globally distributed content delivery. IEEE Internet Computing, pages 50-56, September-October 2002.
F. Douglis and M. Kaashoek. Scalable internet services. IEEE Internet Computing, 5(4):36-37, 2001.
B. Godfrey, K. Lakshminarayanan, S. Surana, R. Karp, and I. Stoica. Load balancing in dynamic structured p2p systems. INFOCOM 2004, Twenty-third Annual Joint Conference of the IEEE Computer and Communications Societies, 4(4):2253-2262, March 2004. URL http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1354648.
D. Gross and C. M. Harris. Fundamentals of queueing theory. Wiley, London, 1998.
M. v. S. Guillaume Pierre. A Trust Model for Peer-to-Peer Content Distribution Networks. Research paper archives, Vrije University, Amsterdam, 2006. URL citeseer.ist.psu.edu/578834.html.
B. Halderen and G. Pierre. Globule User Manual Version 1.3.1, February 2006. URL http://www.globule.org.
B. Krishnamurthy, C. Wills, and Y. Zhang. On the use and performance of content distribution networks. Technical report, ACM SIGCOMM Internet Measurement Workshop, 2001.
I. Lazar and W. Terrill. Exploring content delivery networking. IT Professional, 3(4):47-49, July-August 2001.
R. Mahajan. How akamai works, 2004. URL www.cs.washington.edu/homes/ratul/akamai.html.
B. Molina, V. Ruiz, I. Alonso, C. Palau, J. Guerri, and M. Esteve. A closer look at a content delivery network implementation. In Electrotechnical Conference, 2004. MELECON 2004. Proceedings of the 12th IEEE Mediterranean, volume 2, pages 685-688, Dept. of Commun., Univ. Politecnica de Valencia, Spain, May 2004.
C. Pan, M. Atajanov, T. Shimokawa, and N. Yoshida. Design of adaptive network against flash crowds. In Proc. Information Technology Letters, pages 323-326, September 2004.
M. Pathan. Content distribution networks (CDN) - research directory, May 2007. URL http://www.cs.mu.oz.au/~apathan/CDNs.html.
G. Pierre and M. van Steen. Globule: A collaborative content delivery network. IEEE Communications Magazine, pages 127-133, August 2006a.
G. Pierre and M. van Steen. Globule: a collaborative content delivery network. IEEE Communications Magazine, 44(8):127-133, August 2006b. URL http://www.globule.org/publi/GCCDN_commag2006.html.
G. Pierre and M. van Steen. Design and implementation of a user-centered content delivery network, 2003. URL http://www.citeseer.ist.psu.edu/pierre03design.html.
F. Presti, N. Bartolini, and C. Petrioli. Dynamic replica placement and user request redirection in content delivery networks. In IEEE International Conference on Communications, volume 3, pages 1495-1501, Dubrovnik, Croatia, May 16-20 2005.
J. B. R. Buyya, A. Pathan and Z. Tari. A case for peering of content delivery networks. IEEE Distributed Systems Online, 5(5):56-65, October 2001. URL http://portal.acm.org/citation.cfm?id=1187679.
H. Reitz. Cachet Technologies. Harward Business School Publishing, Boston, MA 02163, March 2000. URL http://www.hbsp.harward.edu.
B. H. S. Sivasubramanian and G. Pierre. Globule: a user-centric content delivery network. Poster presented at the 4th International System Administration and Network Engineering Conference, September-October 2004.
Skeptic Dictionary. Multi-level marketing (a.k.a. network marketing and referral marketing), April 2007. URL http://skepdic.com/mlm.html.
Telematica Institute. Content distribution networks - state of the art, April 2007. URL https://doc.telin.nl/dscgi/ds.py/Get/File-15534.
I. H. Thomas Buchholz and C. Linnhoff-Popien. A profit maximizing distribution strategy for context-aware services. Proceedings of the 2005 Second IEEE International Workshop on Mobile Commerce and Services, 2005.
A. Vakali and G. Pallis. Content distribution networks - status and trends. IEEE Internet Computing, pages 68-74, November-December 2003.
Wikipedia. Content delivery networks (CDN), April 2007. URL http://en.wikipedia.org/wiki/Content_Delivery_Network.
Wikipedia. List of multi-level marketing companies, a. URL http://en.wikipedia.org/wiki/List_of_multi-level_marketing_companies.
Wikipedia. Multi-level marketing, b. URL http://en.wikipedia.org/wiki/Multi-level_marketing.
J. L. Wolf and P. S. Yu. On balancing the load in a clustered web farm. ACM Trans. Inter. Tech., 1(2):231-261, 2001. URL http://doi.acm.org/10.1145/502152.502155.
Y. H. Zhiyong Xu and L. Bhuyan. Efficient server cooperation mechanism in content delivery network. In Performance, Computing, and Communications Conference, page 8. IPCCC 2006, IEEE, 2006.
