A Taxonomy and Survey of Content Delivery Networks Meng-Huan Wu 2011/10/26 1.


Outline Introduction Request-routing mechanisms Content selection and delivery Content routing and delivery Caching techniques Conclusion & Future work References 2

Introduction A CDN is a collection of network elements arranged for more effective delivery of content to end-users. Goals: reduce the network's impact on the response time of user requests, and absorb flash crowds (the Slashdot effect). 3

The three key components of a CDN architecture A content provider or customer is one who delegates the URI name space of the Web objects to be distributed. The origin server of the content provider holds those objects. A CDN provider is a proprietary organization or company that provides infrastructure facilities to content providers in order to deliver content in a timely and reliable manner. End-users or clients are the entities who access content from the content provider’s website. 4

Servers Origin server: The server where the definitive version of a resource resides is called the origin server. Replica server (or surrogate server): A server is called a replica server when it holds a replica of a resource but may act as an authoritative reference for client responses. 5

Relationships 6

Abstract architecture of a Content Delivery Network (CDN) 7

Request-routing in a CDN environment 8

Content selection and delivery 9

Full-site content selection and delivery 10 [Figure: client, CDN surrogate server, and origin server. The client's requests (GET index.html; GET image1.gif, image2.gif) go to the surrogate server, which delivers index.html together with its embedded objects image1.gif and image2.gif.]

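Partial site content selection and delivery 11 [Figure: client, origin server, and CDN surrogate server. The client requests index.html (GET index.html) from the origin server and the embedded objects (GET image1.gif, image2.gif) from the surrogate server, which delivers image1.gif and image2.gif.]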

Empirical-based approach In the empirical-based approach, the Web site administrator empirically selects the content to be replicated to the edge servers. Heuristics are used in making such an empirical decision. The main drawback of this approach lies in the uncertainty in choosing the right heuristics. 12

Popularity-based approach In the popularity-based approach, the most popular objects are replicated to the surrogates. This approach is time-consuming, and reliable object request statistics are not guaranteed because the popularity of each object varies considerably. Moreover, such statistics are often not available for newly introduced content. 13

Cluster-based approach In the cluster-based approach, Web content is grouped based on either correlation or access frequency and is replicated in units of content clusters. 14

Content routing and delivery If the local CDN server accepts a user’s request but does not have the requested content, it will perform content routing to locate and then deliver the content to the user. 15

The steps the CDN takes to serve a user’s request Step 1. Try to satisfy the user’s request using the local CDN server. Step 2. If step 1 fails, try to satisfy the user’s request using a CDN server inside the cluster including the local CDN server. Step 3. If step 2 fails, try to satisfy the user’s request using a CDN server inside a nearby cluster. Step 4. If step 3 fails, try to satisfy the user’s request using the origin server. 16
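The four steps above amount to an ordered fallback chain. A minimal Python sketch, assuming each tier can be modelled as a lookup that returns the content or None on a miss; the tier names and the in-memory dictionaries are hypothetical stand-ins, not part of the original slides:

```python
from typing import Callable, Optional, Sequence

# A lookup takes a URL and returns the content, or None on a miss.
Lookup = Callable[[str], Optional[bytes]]


def serve(url: str, tiers: Sequence[tuple[str, Lookup]]) -> tuple[str, bytes]:
    """Try each tier in order; return (tier_name, content) for the first hit."""
    for name, lookup in tiers:
        content = lookup(url)
        if content is not None:
            return name, content
    raise LookupError(f"{url} not found in any tier")


# Hypothetical in-memory stand-ins for the four tiers (steps 1 to 4).
local = {}
same_cluster = {}
nearby_cluster = {"/video.mp4": b"cached-video"}
origin = {"/video.mp4": b"origin-video", "/index.html": b"<html>"}

tiers = [
    ("local", local.get),
    ("same-cluster", same_cluster.get),
    ("nearby-cluster", nearby_cluster.get),
    ("origin", origin.get),
]
```

With these stand-ins, a request for /video.mp4 is answered by the nearby cluster, while /index.html falls through to the origin server.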

17

Caching techniques 18

Query-based scheme The most straightforward scheme is the query-based scheme, in which a CDN server broadcasts a query for the requested content to other CDN servers inside the same cluster if it does not have the content. 19

Digest-based scheme In order to avoid flooding queries, the digest-based scheme was proposed. Each CDN server maintains a content digest that includes the content information of other CDN servers inside the same cluster. Once a CDN server has cached or deleted content, it notifies other CDN servers to update their content digests. Hence, a CDN server knows where to locate the content by checking its content digest. 20
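A minimal Python sketch of the digest bookkeeping, assuming a digest can be modelled as a map from URL to the set of peers believed to hold it. The class and method names are hypothetical, and a real digest would typically be a compact summary rather than an exact map:

```python
class DigestServer:
    """A CDN server that tracks which peers hold which URLs (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.cache = {}    # url -> content held locally
        self.digest = {}   # url -> names of peers believed to hold it
        self.peers = []    # other DigestServer objects in the same cluster

    def store(self, url, content):
        self.cache[url] = content
        for peer in self.peers:   # notify peers that this server now has url
            peer.digest.setdefault(url, set()).add(self.name)

    def evict(self, url):
        self.cache.pop(url, None)
        for peer in self.peers:   # notify peers that url was deleted here
            peer.digest.get(url, set()).discard(self.name)

    def locate(self, url):
        """Return the names of servers believed to hold url, without querying peers."""
        if url in self.cache:
            return {self.name}
        return set(self.digest.get(url, set()))


def make_cluster(names):
    """Build servers that all know each other as peers."""
    servers = [DigestServer(n) for n in names]
    for server in servers:
        server.peers = [p for p in servers if p is not server]
    return servers
```

The cost shows up in store and evict: every local change sends one notification to every peer, which is the update traffic the directory-based scheme tries to reduce.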

Directory-based scheme A centralized version of the digest-based scheme is the directory-based scheme, in which a directory server maintains the content information of the CDN servers inside the cluster. A CDN server only needs to notify the directory server when local updates occur, and queries the directory server when there is a local miss. Compared to the digest-based scheme the update traffic is greatly reduced, but the directory server is a potential bottleneck and a single point of failure because it must handle the update and query messages from all the cooperating CDN servers. 21
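A minimal Python sketch of the directory server, assuming it keeps an exact map from URL to the servers holding it. The class, its methods, and the message counter are hypothetical and only illustrate that all update and query load concentrates on one node:

```python
class Directory:
    """Central index of which CDN server holds which URL (illustrative only)."""

    def __init__(self):
        self.index = {}     # url -> names of servers holding it
        self.messages = 0   # every update and every query lands here

    def update(self, server, url, present):
        """Record that server has cached (present=True) or deleted url."""
        self.messages += 1
        holders = self.index.setdefault(url, set())
        if present:
            holders.add(server)
        else:
            holders.discard(server)

    def query(self, url):
        """Answer a local miss: which servers currently hold url?"""
        self.messages += 1
        return set(self.index.get(url, set()))
```

Each cache change costs one message to the directory instead of one per peer, but the messages counter grows with the activity of the whole cluster.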

Hashing-based scheme A more efficient scheme is the hashing-based scheme. The CDN servers inside a cluster maintain the same hashing function. Each content is assigned to a designated CDN server based on the content’s URL (or other unique identification), unique IDs (e.g., IP addresses) of the CDN servers, and the hashing function. All requests for the same content are redirected to the designated CDN server for that content. 22
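The slide does not name a specific hashing function. One way to combine the content's URL with the servers' unique IDs is highest-random-weight (rendezvous) hashing, sketched below in Python; the function name and the choice of SHA-256 are assumptions made for illustration:

```python
import hashlib


def designated_server(url: str, server_ids: list[str]) -> str:
    """Pick the server whose hash of (server_id, url) is highest."""

    def score(server_id: str) -> int:
        digest = hashlib.sha256(f"{server_id}|{url}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(server_ids, key=score)
```

Every server that runs this function over the same list of IDs reaches the same answer without exchanging messages, and removing a server that was not the designated one leaves the assignment for that URL unchanged.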

Semi-hashing-based scheme Under the semi-hashing-based scheme, a local CDN server allocates a certain portion, P_local, of its disk space to cache the most popular contents for its local users, and the remaining portion to cooperate with other CDN servers via a hashing function. 23
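A minimal Python sketch of the semi-hashing decision, assuming the locally popular set is already known and that the cooperative portion uses a highest-random-weight hash. All function names, and the hash choice itself, are hypothetical:

```python
import hashlib


def designated_server(url, server_ids):
    """Shared hash step: the server with the highest (id, url) hash wins."""

    def score(server_id):
        digest = hashlib.sha256(f"{server_id}|{url}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(server_ids, key=score)


def split_disk(disk_bytes, p_local):
    """Return (local_bytes, cooperative_bytes) for a fraction p_local in [0, 1]."""
    if not 0.0 <= p_local <= 1.0:
        raise ValueError("p_local must be between 0 and 1")
    local_bytes = int(disk_bytes * p_local)
    return local_bytes, disk_bytes - local_bytes


def route(url, local_id, local_popular, server_ids):
    """Serve locally popular content locally; hash everything else."""
    if url in local_popular:
        return local_id
    return designated_server(url, server_ids)
```

Setting p_local to 0 reduces this to the pure hashing-based scheme, while larger values trade cluster-wide cache capacity for more local hits.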

Cache update taxonomy 24

Periodic update The most common cache update method is the periodic update. To ensure content consistency and freshness, the content provider configures its origin Web servers to provide instructions to caches about what content is cacheable, how long different content is to be considered fresh, when to check back with the origin server for updated content, and so forth. With this approach, caches are updated at regular intervals. However, this approach generates a significant amount of unnecessary update traffic at each interval. 25
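The slide does not say how the origin's instructions are encoded. A minimal Python sketch, assuming they arrive as an HTTP Cache-Control header value with max-age and no-store directives; the helper names are hypothetical and the parsing is deliberately simplified:

```python
import re


def parse_max_age(cache_control):
    """Extract max-age (in seconds) from a Cache-Control value, or None if absent."""
    match = re.search(r"max-age=(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(fetched_at, now, cache_control):
    """True while a cached copy may be served without checking the origin."""
    max_age = parse_max_age(cache_control)
    if max_age is None or "no-store" in cache_control.lower():
        return False
    return now - fetched_at < max_age
```

Once is_fresh returns False, the cache checks back with the origin even if the content has not changed, which is where the unnecessary traffic comes from.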

Update propagation Update propagation is triggered by a change in content. It performs active content pushing to the CDN cache servers. In this mechanism, an updated version of a document is delivered to all caches whenever a change is made to the document at the origin server. For frequently changing content, this approach generates excess update traffic. 26

On-demand update On-demand update is a cache update mechanism where the latest copy of a document is propagated to the surrogate cache server based on a prior request for that content. This approach follows an assume-nothing structure, and content is not updated unless it is requested. The disadvantage of this approach is the back-and-forth traffic between the cache and origin server in order to ensure that the delivered content is the latest. 27

Invalidation Another cache update approach is invalidation, in which an invalidation message is sent to all surrogate caches when a document is changed at the origin server. The surrogate caches are blocked from accessing the document while it is being changed. Each cache needs to fetch an updated version of the document individually later. The drawback of this approach is that it does not make full use of the distribution network for content delivery, and belated fetching of content by the caches may make it inefficient to manage consistency among cached contents. 28
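A minimal Python sketch of invalidation, assuming the origin knows every surrogate cache and that an invalidation message carries only the URL. The classes and the fetch counter are hypothetical, and the blocking of access during a change is not modelled:

```python
class Origin:
    """Origin server that invalidates cached copies when a document changes."""

    def __init__(self):
        self.docs = {}
        self.caches = []
        self.fetches = 0   # number of documents pulled by caches

    def publish(self, url, body):
        self.docs[url] = body
        for cache in self.caches:   # send the invalidation, not the new body
            cache.invalidate(url)

    def fetch(self, url):
        self.fetches += 1
        return self.docs[url]


class Cache:
    """Surrogate cache that refetches a document only when it is next requested."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}
        origin.caches.append(self)

    def invalidate(self, url):
        self.store.pop(url, None)

    def get(self, url):
        if url not in self.store:   # each cache refetches individually
            self.store[url] = self.origin.fetch(url)
        return self.store[url]
```

Publishing a new version costs no document transfers, but every cache later pays its own round trip to the origin, which is the drawback the slide describes.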

Taxonomy of request-routing mechanisms 29

DNS based Request-Routing 30 [Figure: the client sends a DNS query to its local DNS server, which sends a DNS query to the Akamai DNS. The DNS responses name a surrogate in the Akamai CDN (delaware.cnn.akamai.com or california.cnn.akamai.com), and the client opens a session with that surrogate.]

DNS based Request-Routing 31 [Figure: DNS query and response between the client, its local DNS server, and the Akamai DNS, followed by a session with a surrogate in the Akamai CDN. The surrogates measure the path to the client's DNS server ("Measure to Client DNS") and report the measurement results to the Akamai DNS.]
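A minimal Python sketch of the selection step in measurement-driven DNS request-routing, assuming the CDN's DNS keeps a table of round-trip times from each surrogate to each local DNS server and answers with the lowest one. The function, the table layout, and the fallback rule are hypothetical and do not describe Akamai's actual algorithm:

```python
def pick_surrogate(local_dns, rtt_ms, surrogates):
    """Return the surrogate hostname with the lowest measured RTT to local_dns."""
    measured = [s for s in surrogates if (s, local_dns) in rtt_ms]
    if not measured:
        return surrogates[0]   # no measurements yet: fall back to the first surrogate
    return min(measured, key=lambda s: rtt_ms[(s, local_dns)])


surrogates = ["delaware.cnn.akamai.com", "california.cnn.akamai.com"]

# Hypothetical measurements: (surrogate, local DNS server) -> RTT in milliseconds.
rtt_ms = {
    ("delaware.cnn.akamai.com", "dns.east.example"): 12.0,
    ("california.cnn.akamai.com", "dns.east.example"): 75.0,
    ("delaware.cnn.akamai.com", "dns.west.example"): 80.0,
    ("california.cnn.akamai.com", "dns.west.example"): 9.0,
}
```

Because the measurement targets the client's local DNS server rather than the client itself, the choice is only as good as the assumption that the two are close to each other.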

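URL rewriting 32 [Figure: client, origin server, CDN's authoritative DNS server, and CDN server near the client. Steps shown: HTTP request to the origin server; URL rewritten (=>); DNS query to the CDN's authoritative DNS server; HTTP request to the CDN server near the client.]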
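A minimal Python sketch of URL rewriting, assuming the origin rewrites the src attributes of embedded objects so that they point at a CDN hostname while the page itself stays on the origin. The hostnames www.example.com and cdn.example.net and the rewritten path layout are hypothetical; the URLs used on the slide are not reproduced here:

```python
import re


def rewrite_embedded(html, origin_host, cdn_host):
    """Point src attributes that reference origin_host at cdn_host instead."""
    pattern = re.compile(r'(src=")https?://' + re.escape(origin_host) + "/")
    # Keep the origin host in the path so the CDN knows where to pull from.
    return pattern.sub(
        lambda m: f"{m.group(1)}http://{cdn_host}/{origin_host}/", html
    )


page = (
    '<a href="http://www.example.com/page">'
    '<img src="http://www.example.com/image1.gif"></a>'
)
rewritten = rewrite_embedded(page, "www.example.com", "cdn.example.net")
```

Only the embedded image reference changes, so the client's later DNS query for the CDN hostname is what hands the request to the CDN's authoritative DNS server.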

Content outsourcing Cooperative push-based: – This approach is based on the pre-fetching of content to the surrogates. Non-cooperative pull-based: – In this approach, client requests are directed to their closest surrogate servers. Cooperative pull-based: – The cooperative pull-based approach differs from the non-cooperative approach in the sense that surrogate servers cooperate with each other to get the requested content in case of a cache miss. 33

Conclusion & Future work Conclusion – CDNs offer fast and reliable applications and services – Reduce network impact on the response time – Enhance QoE Future work – Find a better approach to content placement 34

References [1] A. K. Pathan, and R. Buyya, “A Taxonomy and Survey of Content Delivery Networks,” Tech Report, Univ. of Melbourne, 2007 [2] J. Ni, and D. H. K. Tsang, “Large Scale Cooperative Caching and Application-level Multicast in Multimedia Content Delivery Networks,” IEEE Communications, Vol. 43, Issue. 5, pp , May

Q&A 36