Published by Moris Dorsey. Modified over 8 years ago.
1
September 2008 Josilene Aires Moreira
2
Overview
CDN Topology
CDNs nowadays
Constructing a CDN
  ◦ Basic model
  ◦ Modules
  ◦ Characteristics
References
3
Origin Servers
Surrogate Servers (mirror servers)
4
First evolved in 1998 to replicate content over many web servers and deal with flash crowds
Geographically distributed web servers to improve performance and scalability
Improves network performance through content replication
  ◦ Maximizing bandwidth
  ◦ Improving accessibility
  ◦ Maintaining correctness
Moves content to edge servers located close to the users
5
Host third-party content
Any digital content
Static
  ◦ HTML pages
  ◦ Images
  ◦ Documents
  ◦ Software
Streaming media
  ◦ Audio
  ◦ Video
Content services
  ◦ E-commerce service
Source of the content: large enterprises, Web service providers, media companies, news broadcasters
6
Origin Server
Surrogate Servers
  ◦ A set of replica servers that cache the origin server's content
Routers and network elements
  ◦ Deliver content requests to the optimal location and the optimal surrogate server
Accounting mechanism
  ◦ Provides logs and information to the origin server
7
Advantages
  ◦ Reducing the customer's need to invest in website infrastructure
  ◦ Decreasing the operational effort and costs of managing such infrastructure
  ◦ Bypassing traffic jams: data is closer to the user, so there is no need to traverse congested pipes
  ◦ Improving content delivery quality, speed and reliability
  ◦ Reducing the load on origin servers
8
Akamai Technologies (www.akamai.com)
  ◦ The market leader in providing content delivery services (80% of the overall CDN market). It owns more than 12,000 servers over 1,000 networks in 62 countries.
Mirror Image Internet, Inc. (www.mirror-image.com)
  ◦ Supports surrogate servers located in 22 cities around the world (North America, Europe, and Asia), which provide a range of value-added services, from content distribution to media streaming and managed caching.
Inktomi, a Yahoo Company (www.inktomi.com)
  ◦ Provides managed services for global load balancing, failover, content delivery, and streaming media using more than 1,000 surrogate servers worldwide.
Limelight Networks (www.limelightnetworks.com)
  ◦ Provides a suite of services (for music download and subscription services, video game developers and distributors, movie/video download services, and so forth) and supports surrogate servers in 72 locations around the world (Asia, the U.S., and Europe).
9
CoDeeN
  ◦ CoDeeN was developed at Princeton University, USA. It provides caching of Web content and redirection of HTTP requests.
  ◦ It is an academic testbed content distribution network built on top of PlanetLab. The testbed consists of a network of high-performance proxy servers.
  ◦ Each proxy server acts as both a request redirector and a surrogate server. Together they provide fast and robust Web content delivery through cooperation and collaboration.
Coral CDN
  ◦ Coral is a free, peer-to-peer content distribution network designed to mirror web content.
  ◦ Coral uses the bandwidth of volunteers to avoid congestion and to reduce the load on websites and other web content providers in general.
Globule
  ◦ Globule is an open-source collaborative content delivery network developed at the Vrije Universiteit in Amsterdam. It aims to let Web content providers organize together and operate their own worldwide hosting platform. It provides replication of content, monitoring of servers, and redirection of client requests to available replicas.
10
[Figure: basic CDN model showing Origin, Surrogate, Content manager, Redirector, Clients, and Access network. Source: "On Content Delivery Network Implementation", Molina Moreno et al., 2006]
11
Replica servers
  ◦ Act as proxy/cache servers
  ◦ Store and deliver content
Components
  ◦ Portal (HTTP-based web server): provides access to the content stored in the CDN
  ◦ Streaming server: a media streaming server to distribute multimedia content
  ◦ Surrogate Database (DB): contains a list of all available streaming sessions, the objects stored in the surrogate, and information for the management of the CDN
Research Questions
  ◦ Number of Surrogate Servers
  ◦ Surrogate Placement Algorithms (Greedy, Hot-spot, Tree-based, …)
12
Research Questions
  ◦ Number of Surrogate Servers
  ◦ Surrogate Placement Algorithms
Placement Algorithms
Content distribution includes placing surrogate servers at strategic positions close to the clients. Some theoretical approaches model this as the “center placement problem”: given a number of centers to place, minimize the maximum distance between a node and its nearest center.
  ◦ Greedy
  ◦ Hot spot
  ◦ Tree-based
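The center placement objective can be illustrated with a small greedy heuristic. This is only a sketch under assumptions: the slide does not say which greedy variant is meant, and the function name `greedy_placement` and the distance matrix `dist` are hypothetical. At each step it adds the site that most reduces the worst-case distance from any node to its nearest chosen site.

```python
def greedy_placement(dist, k):
    """Pick up to k surrogate sites greedily.

    dist[i][j] is an assumed distance (hops, RTT, ...) between nodes i and j.
    Returns the chosen site indices and the resulting worst-case distance.
    """
    n = len(dist)
    chosen = []
    # nearest[i] = distance from node i to its closest chosen site so far
    nearest = [float("inf")] * n
    for _ in range(min(k, n)):
        best_site, best_cost = None, float("inf")
        for cand in range(n):
            if cand in chosen:
                continue
            # Worst-case distance if cand were added to the chosen set
            cost = max(min(nearest[i], dist[i][cand]) for i in range(n))
            if cost < best_cost:
                best_site, best_cost = cand, cost
        chosen.append(best_site)
        nearest = [min(nearest[i], dist[i][best_site]) for i in range(n)]
    return chosen, max(nearest)


# Six nodes on a line, forming two groups far apart
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in positions] for a in positions]
sites, radius = greedy_placement(dist, 2)
```

With two sites, the heuristic ends up placing one in each group, so no node is far from a surrogate. A greedy choice is not guaranteed to be optimal in general.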
13
Controls the media objects stored in each Surrogate
Provides this information to the Redirector, so that each client is served by the most suitable Surrogate or Peer
Content Locator determines
  ◦ Number of replicas of a media object
  ◦ In which surrogate servers a new object has to be stored
  ◦ Elimination of non-popular objects from the surrogates
  ◦ Interaction of the CDN with the origin servers
  ◦ Update of media objects in the surrogates
  ◦ Movement of media objects among surrogates
DB Content
  ◦ Stores the information managed by the Content Locator
14
Research Questions: Content Outsourcing
Related to the relationship between surrogates
  ◦ Cooperative push-based
      The content is initially pushed from the Origin Server
      The surrogate servers cooperate to reduce replication
      This approach has not yet been adopted by a CDN provider
  ◦ Uncooperative pull-based
      Clients' requests are directed to their closest Surrogate Server
      If there is a cache miss, the request is redirected to another Surrogate or to the Origin Server
      Akamai and Mirror Image use this approach
  ◦ Cooperative pull-based
      Client requests are directed to their closest surrogate
      In case of a cache miss, the surrogate servers use a distributed index to find nearby copies of the requested object
      Coral (academic CDN) has implemented this approach
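The two pull-based variants can be contrasted in a few lines. This is a simplified sketch, not how Akamai or Coral are implemented: the `Surrogate` class is hypothetical, a plain dict stands in for the origin server, and a shared dict stands in for the distributed index. In the uncooperative mode the sketch falls back directly to the origin on a miss.

```python
class Surrogate:
    def __init__(self, name, origin, index=None):
        self.name = name
        self.origin = origin  # dict url -> content, stands in for the origin server
        self.index = index    # shared dict url -> set of holders; None = uncooperative
        self.cache = {}

    def get(self, url):
        """Return (content, source), where source is 'hit', 'peer' or 'origin'."""
        if url in self.cache:
            return self.cache[url], "hit"
        content, source = None, "origin"
        if self.index is not None:
            # Cooperative pull: look for a copy held by another surrogate
            for peer in self.index.get(url, ()):
                if peer is not self and url in peer.cache:
                    content, source = peer.cache[url], "peer"
                    break
        if content is None:
            # No peer copy found (or uncooperative mode): pull from the origin
            content = self.origin[url]
        self.cache[url] = content
        if self.index is not None:
            self.index.setdefault(url, set()).add(self)
        return content, source


origin = {"/video.mp4": "bytes"}
index = {}
s1 = Surrogate("s1", origin, index)
s2 = Surrogate("s2", origin, index)
first = s1.get("/video.mp4")   # miss everywhere, pulled from the origin
second = s2.get("/video.mp4")  # miss locally, found at s1 through the index
third = s2.get("/video.mp4")   # now cached locally
```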
15
Research Questions (cont.): Content Selection
Related to how the content is replicated at the surrogates
  ◦ Entire replication
      High disk prices
      Web objects are increasing
      Updating is a problem
  ◦ Partial site content selection and delivery
      Empirical: uses heuristics to decide which content is replicated at the edge servers (uncertainty in choosing the right heuristics)
      Popularity-based: the most popular objects are replicated
      Object-based: content is replicated in units of objects that give the maximum performance gain (high complexity to implement in real applications)
      Cluster-based: content is grouped based on correlation or access frequency and replicated in units of clusters (high complexity)
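Popularity-based selection can be sketched as follows. The function name `select_popular`, the access log format, and the per-object sizes are assumptions made for illustration: objects are ranked by request count and replicated in that order for as long as they fit in the surrogate's capacity.

```python
from collections import Counter


def select_popular(access_log, sizes, capacity):
    """Choose objects to replicate, most requested first, within a size budget.

    access_log: iterable of requested object names (one entry per request)
    sizes:      dict name -> size, in the same unit as capacity
    """
    counts = Counter(access_log)
    selected, used = [], 0
    for obj, _ in counts.most_common():
        # Skip objects that no longer fit; smaller, less popular ones may still fit
        if used + sizes[obj] <= capacity:
            selected.append(obj)
            used += sizes[obj]
    return selected


log = ["a", "b", "a", "c", "a", "b"]
sizes = {"a": 5, "b": 4, "c": 2}
replicated = select_popular(log, sizes, capacity=8)
```

Here "a" is replicated first, "b" is skipped because it would exceed the budget, and "c" still fits.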
16
Research Questions (cont.): Caching Techniques
Related to how the caches manage a cache miss
  ◦ Intra-cluster caching
      Query-based: on a cache miss, a CDN server broadcasts a query to the other cooperating CDN servers (significant query traffic)
      Digest-based: each CDN server has a map of the content held by the other cooperating surrogates (update traffic overhead)
      Directory-based: a centralized server keeps information about all the cooperating surrogates in a cluster (potential bottleneck and single point of failure)
      Hashing-based / semi-hashing-based for streaming: cooperating CDN servers maintain the same hash function; more efficient, with the highest content sharing efficiency
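The hashing-based idea is that every cooperating server applies the same hash function, so all of them agree on which server holds an object without exchanging queries or digests. The sketch below uses rendezvous (highest-random-weight) hashing as one possible scheme; the slide does not name a specific one, and `owner` and the server names are hypothetical.

```python
import hashlib


def owner(url, servers):
    """Return the server responsible for url.

    Every server computes the same scores, so they all pick the same owner.
    """
    def score(server):
        digest = hashlib.sha256(f"{server}|{url}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return max(servers, key=score)


servers = ["edge-1", "edge-2", "edge-3", "edge-4"]
home = owner("/images/logo.png", servers)
# On a local miss, a surrogate forwards the request to `home`
# instead of broadcasting a query to the whole cluster.
```

A property of this scheme is that removing a server other than the owner leaves the mapping for that object unchanged.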
17
Research Questions (cont.): Cache update
  ◦ Periodic update: the most common method
  ◦ Update propagation: an updated version is delivered to all caches whenever a change is made at the origin server
  ◦ On-demand update: the latest copy of a document is propagated to the surrogate cache server based on a prior request for that content; the content is not updated until it is requested
  ◦ Invalidation approach: an invalidation message is sent to all surrogate caches when there is a change at the origin server, and the caches need to fetch an updated version later
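Two of these policies can be contrasted in a short sketch. It makes simplifying assumptions: periodic update is approximated by a time-to-live check on each request, the clock is passed in explicitly, and `SurrogateCache` with a dict standing in for the origin server is hypothetical.

```python
class SurrogateCache:
    def __init__(self, origin, ttl):
        self.origin = origin  # dict url -> content, stands in for the origin server
        self.ttl = ttl        # seconds a cached copy is considered fresh
        self.entries = {}     # url -> (content, fetched_at)

    def invalidate(self, url):
        """Invalidation approach: drop the copy and refetch lazily on the next request."""
        self.entries.pop(url, None)

    def get(self, url, now):
        entry = self.entries.get(url)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]  # still fresh, may be out of date at the origin
        content = self.origin[url]  # expired or missing: fetch the latest copy
        self.entries[url] = (content, now)
        return content


origin = {"/page": "v1"}
cache = SurrogateCache(origin, ttl=10)
a = cache.get("/page", now=0)
origin["/page"] = "v2"
b = cache.get("/page", now=5)    # within the TTL: the stale copy is served
c = cache.get("/page", now=10)   # TTL expired: refreshed from the origin
origin["/page"] = "v3"
cache.invalidate("/page")        # origin signals the change
d = cache.get("/page", now=11)   # fetched again despite the fresh TTL
```

The trade-off is visible in `b`: with periodic refresh alone, a surrogate can serve outdated content until the period elapses, whereas invalidation costs one message per change.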
18
Provides intelligence to the system
  ◦ Estimates the most adequate surrogate server for each client request
CDN DNS
  ◦ Accepts requests from the client's local DNS and routes the client to the most adequate surrogate (several approaches exist; Akamai uses servers in two hierarchies)
  ◦ DB DNS: stores information about addresses and names of the surrogate servers, and some additional information
19
Redirector (Request-routing)
Monitor Module (main knowledge gatherer of the system)
  ◦ Gets statistical information from different key elements of the CDN architecture
  ◦ Conducts a variety of measurements to obtain information about the network and the CDN components
  ◦ Uses SNMP to get data from the surrogates to estimate the RTT between Clients and Surrogates
  ◦ DB Monitor: stores all information
Redirection Algorithm
  ◦ Selects the optimal surrogate server on the basis of the information gathered by the Monitor
20
Research Questions
Which one is the appropriate surrogate server?
  ◦ The replica server ‘closest’ to the client
Decision based on metrics (‘closest’)
  ◦ Network proximity
  ◦ Client perceived latency
  ◦ Distance
  ◦ Replica server load
Request Routing Algorithm
  ◦ Non-adaptive: round robin, prediction on load of servers (number of requests), client-server distance
  ◦ Adaptive: network proximity (path length), client perceived latency
      Akamai uses a complex combination of server load, reliability of paths, and bandwidth available to a replica server
      Cisco DistributedDirector uses inter-AS distance, intra-AS distance, and end-to-end latency
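The difference between the two families can be sketched as below. The weighting is an arbitrary assumption chosen for illustration and is not the formula used by Akamai or Cisco; `round_robin`, `pick_adaptive`, and the metric values are hypothetical.

```python
from itertools import cycle


def round_robin(surrogates):
    """Non-adaptive: hand out surrogates in a fixed rotation, ignoring current conditions."""
    return cycle(surrogates)


def pick_adaptive(metrics, w_latency=1.0, w_load=1.0):
    """Adaptive: choose the surrogate with the lowest cost from current measurements.

    metrics: dict name -> (latency_ms, load), with load in [0, 1]
    """
    def cost(name):
        latency_ms, load = metrics[name]
        # Load is scaled by 100 so that both terms have a comparable range (assumption)
        return w_latency * latency_ms + w_load * 100 * load

    return min(metrics, key=cost)


rr = round_robin(["s1", "s2"])
rotation = [next(rr) for _ in range(3)]

measured = {"s1": (20, 0.9), "s2": (40, 0.1)}
balanced = pick_adaptive(measured)                # s1 is closer but heavily loaded
latency_only = pick_adaptive(measured, w_load=0)  # ignore load, pick by latency
```

In this example the balanced policy prefers the farther but lightly loaded surrogate, while a latency-only policy picks the nearest one.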
21
Pallis, G. & Vakali, A. (2006), 'Insight and Perspectives for Content Delivery Networks', Communications of the ACM 49(1), ACM Press, NY, USA.
Moreno, B. M.; Salvador, C. P.; Domingo, M. E.; Peña, I. A. & Extremera, V. R. (2006), 'On content delivery network implementation', Computer Communications 29(12), 2396--2412.
Pathan, A. K. & Buyya, R. (2007), 'A Taxonomy and Survey of Content Delivery Networks', Technical report, The University of Melbourne, Australia.