Presentation is loading. Please wait.

Presentation is loading. Please wait.

Content Distribution Networks Costin Raiciu Advanced Topics in Distributed Systems Fall 2012.

Similar presentations


Presentation on theme: "Content Distribution Networks Costin Raiciu Advanced Topics in Distributed Systems Fall 2012."— Presentation transcript:

1 Content Distribution Networks Costin Raiciu Advanced Topics in Distributed Systems Fall 2012

2 Problem: making the web go faster Use case: type ebay.com in your browser – The site takes a while to load – How long do you wait before you give up and try another site? 40% of customers will wait no more than 3 seconds for a webpage to load [forrester consulting]

3 Why would a site take long to load? Let’s assume the content is generated quickly by the servers A webpage will have many small objects and perhaps a few larger ones What network conditions will affect download time?

4 The Internet www.site.com Autonomous System (AS)AS Origin server Authoritative DNS Lookup www.site.com Download webpage

5 How do we fix the web? Change the Internet architecture – rather difficult to deploy More pragmatic choice: target static web content – Basic idea: bring content closer to the clients Design principles – Design for reliability and scalability (Akamai has 120K servers today) – Limit the need for human management

6 CDNs are really popular Quick check

7 The Basic Idea of CDNs: Cache static content near the user www.site.com Autonomous System (AS)AS Origin server

8 CDN Operation Summary www.site.com Autonomous System (AS)AS Origin server Authoritative DNS CDN Nameserver Lookup www.site.com Lookup www.site.com 1.www.site.com delegates zone to CDN namerserver DNS Resolver 2. CDN nameserver uses resolver’s IP to find edge server that is nearest to customer (geo-location) E 3. Edge server E will serve file from local storage if it has it 4. Otherwise will fetch file from origin server

9 Overview of the Akamai CDN [Akamai]

10 Edge server functionality Located at ISPs worldwide - close to customers Maintain metadata for each served file Metadata specified by: – XML files delivered using Akamai infrastructure – Akamai specific HTTP Response Headers

11 Metadata stored for each file Origin server location Content path Cache control – how long should we store replica, how is it invalidated? Cache indexing – how should an URL be used to create a key for that object? Access control – who is allowed to view this file? Response to origin server failure HTTP Header alteration – rewrite headers including cookies to deal with different browsers etc.

12 Mapping System A scoring system creates an up-to-date topological map of Internet – Divides IP addresses into equivalence classes – Computes connectivity between these classes Implementation: – Run and parse ping and traceroutes in real time – Parse BGP data and logs

13 Mapping System The Real Time Mapping system creates the actual maps used to direct users to the best Akamai edge servers Runs in two steps: a)Map to cluster: select a preferred edge server cluster for each equivalence class of users b)Map to server: a low level map sends the user to a specific server within the cluster goal: maintain locality within clusters

14 Implementing the mapping system using DNS 1.The first request goes to generic TLD servers, which return Akamai Top Level Name Servers (TLNS) as authorities, generally with long DNS TTLs. The Akamai TLNS are globally distributed, using a mixture of IP Anycast and large clusters. 2.The next query, to an Akamai TLNS, returns delegations with shorter DNS TTLs to a number of Akamai Low Level Name Servers (LLNS). The Akamai LLNS are typically located in close network proximity to the resolving name server. 3.The final query, to an Akamai LLNS, returns edge server IP addresses based on both the cluster assignment and the low level map described above. These answers have very short TTLs so that changes to the mapping assignments (such as in response to failures or shifts in demand) can be rapidly distributed to end users.

15 Akamai’s Transport protocol The communications between any two Akamai servers can be optimized to overcome the inefficiencies of BGP routing Goals: – Accelerate non-cacheable content – Accelerate apps that check origin server for freshness Techniques: – Path optimizations – Protocol enhancements

16 Path Optimization Build and Internet overlay – Use end-to-end path quality between servers maintained by the mapping system – Move traffic onto the best performing path according to measurements or use multiple paths. 30-50% performance improvements in Asia

17 Akamai Transport Protocol Proprietary (modified TCP) Use pools of persistent connections to avoid 3WHS Play with TCP Window size based on path conditions – E.g. increase initial cwnd when path is know to be good Set aggressive timeouts based on known path information

18 Coral CDN (slides adapted from Mike Freedman )

19 A problem… Feb 3: Google linked banner to “julia fractals” Users clicking directed to Australian University web site …University’s network link overloaded, web server taken down temporarily…

20 The problem strikes again! Feb 4: Slashdot ran the story about Google …Site taken down temporarily…again

21 The response from down under… Feb 4, later…Paul Bourke asks: “They have hundreds (thousands?) of servers worldwide that distribute their traffic load. If even a small percentage of that traffic is directed to a single server … what chance does it have?” → Help the little guy ←

22 Coral’s solution… Implement an open CDN Allow anybody to contribute Works with unmodified clients CDN only fetches once from origin server Origin Server Coral httpprx dnssrv Coral httpprx dnssrv Coral httpprx dnssrv Coral httpprx dnssrv Coral httpprx dnssrv Coral httpprx dnssrv Browser Pool resources to dissipate flash crowds

23 Using CoralCDN Rewrite URLs into “Coralized” URLs www.x.com → www.x.com.nyud.net:8090 – Directs clients to Coral, which absorbs load Who might “Coralize” URLs? – Web server operators Coralize URLs – Coralized URLs posted to portals, mailing lists – Users explicitly Coralize URLs

24 httpprx dnssrv Browser Resolver DNS Redirection Return proxy, preferably one near client Cooperative Web Caching CoralCDN components httpprx www.x.com.nyud.net 216.165.108.10 Fetch data from nearby ? ? Origin Server 

25 Functionality needed DNS: Given network location of resolver, return a proxy near the client put (network info, self) get (resolver info) → {proxies} HTTP: Given URL, find proxy caching object, preferably one nearby put (URL, self) get (URL) → {proxies}

26 Key Idea Use a distributed hash table – but locality is poor – So use multiple DHTs (called clusters)! – Each peer takes part in 3 clusters based on network proximity (<20ms, <60ms, all others) – Insertions are done in all DHTs – Lookups prefer “nearest” DHT A lot more details in the paper.

27 Coral lacks… – Central management – A priori knowledge of network topology Anybody can join system – Any special tools (e.g., BGP feeds) Coral has… – Large # of vantage points to probe topology – Distributed index in which to store network hints – Each Coral node maps nearby networks to self Challenges for DNS Redirection

28 Coral DNS server probes resolver Once local, stay local When serving requests from nearby DNS resolver – Respond with nearby Coral proxies – Respond with nearby Coral DNS servers → Ensures future requests remain local Else, help resolver find local Coral DNS server Coral’s DNS Redirection

29 Return servers within appropriate cluster – e.g., for resolver RTT = 19 ms, return from cluster < 20 ms Use network hints to find nearby servers – i.e., client and server on same subnet Otherwise, take random walk within cluster DNS measurement mechanism Resolver Browser Coral httpprx dnssrv Server probes client (2 RTTs)

30 References Forrester Consulting. eCommerce Web Site Performance Today: An Updated Look At Consumer Reaction To A Poor Online Shopping Experience. Aug. 17, 2009. The Akamai Network: A Platform for High-Performance Internet Applications – Erik Nygren et al. - http://www.akamai.com/dl/technical_publications/net work_overview_osr.pdf Democratizing Content Publication with Coral. Michael J. Freedman, Eric Freudenthal, and David Mazières. In Proc. 1st USENIX/ACM Symposium on Networked Systems Design and Implementation


Download ppt "Content Distribution Networks Costin Raiciu Advanced Topics in Distributed Systems Fall 2012."

Similar presentations


Ads by Google