
1 Caching
- Temporary storage of frequently accessed data (a duplicate of original data stored somewhere else)
- Reduces access time/latency for clients
- Reduces bandwidth usage
- Reduces load on the server

2 Web cache types
- Browser cache: serves a single user
- Shared cache (forward or reverse proxy): same principle, serving multiple users

3 Forward proxy cache
- Cache located closer to the client
- Usually deployed by an ISP
- Decreases bandwidth usage (in the slide's example, on the link from the ISP to the Internet)

4 Reverse proxy cache
- Also known as a gateway proxy or web accelerator
- Cache proxy located closer to the origin web server
- Usually deployed by a web hosting ISP
- Decreases load on the web server
- Several reverse proxy caches deployed together can form a Content Delivery Network

5 How a typical cache works
- Freshness: how long the document stays “fresh”, that is, how long it can be served from the cache without rechecking the origin server
- Validation: once the document is no longer “fresh”, the cache checks with the origin server whether its cached copy still matches the origin document
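
The freshness and validation steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `CacheEntry` type, the `lookup` function, and the idea of passing the origin's current ETag as an argument (instead of making a real network request) are all hypothetical names and shortcuts, not part of any real cache.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    body: str
    stored_at: float  # time the copy was stored, in seconds
    max_age: float    # freshness lifetime, in seconds
    etag: str         # validator remembered from the origin response


def is_fresh(entry: CacheEntry, now: float) -> bool:
    """A copy is fresh while its age is below its freshness lifetime."""
    return now - entry.stored_at < entry.max_age


def lookup(entry: Optional[CacheEntry], now: float, origin_etag: str) -> str:
    """Decide how the cache answers a request at time `now`.

    `origin_etag` stands in for asking the origin server which version
    it currently holds (an assumption made to keep the sketch offline).
    """
    if entry is None:
        return "miss: fetch from origin"
    if is_fresh(entry, now):
        return "hit: serve cached copy"
    if entry.etag == origin_etag:
        entry.stored_at = now  # validated, so the copy counts as fresh again
        return "revalidated: serve cached copy"
    return "stale: fetch new copy from origin"
```

Only the last branch transfers the full document again; a successful validation costs one small round trip to the origin.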

6 HTML tags vs HTTP headers
- HTML meta tags are part of the document. They mostly affect the browser cache (which parses HTML); most proxy caches do not look inside the document.
- HTTP headers are sent before the HTML document and are seen by both browser and proxy caches.

Example response headers:

    HTTP/1.1 200 OK
    Date: Fri, 30 Oct 1998 13:19:41 GMT
    Server: Apache/1.3.3 (Unix)
    Cache-Control: max-age=3600, must-revalidate
    Expires: Fri, 30 Oct 1998 14:19:41 GMT
    Last-Modified: Mon, 29 Jun 1998 02:28:12 GMT
    ETag: "3e86-410-3596fbbc"
    Content-Length: 1040
    Content-Type: text/html

7 HTTP headers (Cache-Control directives)
- max-age=[seconds]: specifies the maximum amount of time that a representation will be considered fresh.
- s-maxage=[seconds]: similar to max-age, except that it only applies to shared (e.g., proxy) caches.
- public: marks authenticated responses as cacheable; normally, if HTTP authentication is required, responses are automatically private.
- private: allows caches that are specific to one user (e.g., in a browser) to store the response; shared caches (e.g., in a proxy) may not.
- no-cache: forces caches to submit the request to the origin server for validation before releasing a cached copy, every time.
- no-store: instructs caches not to keep a copy of the representation under any conditions.
- must-revalidate: tells caches that they must obey any freshness information you give them about a representation; once a copy is stale, it must be revalidated with the origin server before it is reused.
- proxy-revalidate: similar to must-revalidate, except that it only applies to proxy caches.
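
As a rough illustration of how a cache might read these directives, the sketch below parses a Cache-Control value and answers two questions: may this cache store the response, and for how long is it fresh? The function names are hypothetical, and the parser is deliberately simplified (it does not handle quoted arguments that contain commas).

```python
from typing import Dict, Optional, Union

Directives = Dict[str, Union[str, bool]]


def parse_cache_control(value: str) -> Directives:
    """Split a Cache-Control header value into a directive dictionary."""
    directives: Directives = {}
    for part in value.split(","):
        name, sep, arg = part.strip().partition("=")
        if not name:
            continue
        # Directives without an argument (e.g. no-store) are stored as True.
        directives[name.lower()] = arg.strip('"') if sep else True
    return directives


def may_store(directives: Directives, shared: bool) -> bool:
    """no-store forbids storing; private forbids storing in shared caches."""
    if "no-store" in directives:
        return False
    if shared and "private" in directives:
        return False
    return True


def freshness_lifetime(directives: Directives, shared: bool) -> Optional[int]:
    """Freshness lifetime in seconds, or None if the header gives none."""
    if "no-cache" in directives:
        return 0  # always validate with the origin before reuse
    if shared and "s-maxage" in directives:
        return int(directives["s-maxage"])  # overrides max-age for shared caches
    if "max-age" in directives:
        return int(directives["max-age"])
    return None
```

For the header value `max-age=3600, must-revalidate`, `parse_cache_control` yields a `max-age` of `"3600"` and `freshness_lifetime` returns 3600 for both browser and shared caches.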

8 Validators
- Used by caches to check whether the cached document still matches the original document
- If no validator is present and no freshness information is available, the document won’t be cached
- Two common validators: the Last-Modified HTTP header and the ETag HTTP header
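
In HTTP, a cache sends a stored validator back to the origin in a conditional request: an ETag goes in the If-None-Match request header and a Last-Modified date goes in If-Modified-Since. The origin answers 304 Not Modified if the copy is still current, or 200 with a new body otherwise. The sketch below shows that decision on the origin side; `conditional_status` is a hypothetical helper, and it simplifies the real rules (it compares a single ETag exactly and ignores weak validators and ETag lists).

```python
from email.utils import parsedate_to_datetime
from typing import Dict

# Validators of the current document, taken from the example response headers.
ETAG = '"3e86-410-3596fbbc"'
LAST_MODIFIED = "Mon, 29 Jun 1998 02:28:12 GMT"


def conditional_status(request_headers: Dict[str, str],
                       etag: str, last_modified: str) -> int:
    """Return 304 if the client's cached copy is still valid, else 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # ETag comparison takes precedence over the date comparison.
        return 304 if if_none_match == etag else 200
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        changed = (parsedate_to_datetime(last_modified)
                   > parsedate_to_datetime(if_modified_since))
        return 200 if changed else 304
    return 200  # unconditional request: send the full document
```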

9 Proxy server software examples
- Squid (Unix/Linux and Windows)
- Varnish (web accelerator)
- Apache (proxy module and cache module)
- NGINX (HTTP reverse proxy and email proxy)

10 Interception caching
- Avoids having to configure each client to point to the cache proxy
- Can be accomplished using an inline cache, a layer 4 switch, WCCP, or policy-based routing

11 Content Delivery Networks
- A network of computers that deliver content on the web
- Content is pushed out/delivered “closer” to the clients
- Designed to improve Internet performance (i.e., decrease latency for clients and decrease bandwidth use)
- Consists of an origin server and surrogates (edge servers)
- Uses caching and server load balancing techniques
- ESI (Edge Side Includes): an open standard markup language that augments HTML to help with dynamic delivery and assembly of Web documents

12 Content distribution networks
- Challenge: how to stream content (selected from millions of videos) to hundreds of thousands of simultaneous users?
- Option 1: a single, large “mega-server”
  - single point of failure
  - point of network congestion
  - long path to distant clients
  - multiple copies of a video sent over the outgoing link
- Quite simply, this solution doesn’t scale

13 Content distribution networks
- Challenge: how to stream content (selected from millions of videos) to hundreds of thousands of simultaneous users?
- Option 2: store/serve multiple copies of videos at multiple geographically distributed sites (CDN)
  - Enter deep: push CDN servers deep into many access networks, close to users; used by Akamai (1700 locations)
  - Bring home: a smaller number (tens) of larger clusters in POPs near (but not within) access networks; used by Limelight

14 Content Delivery Networks

15 CDN: “simple” content access scenario
Bob (client) requests the video http://netcinema.com/6Y7B23V; the video is stored in the CDN at http://KingCDN.com/NetC6y&B23V.
1. Bob gets the URL for the video, http://netcinema.com/6Y7B23V, from the netcinema.com web page.
2. Resolve http://netcinema.com/6Y7B23V via Bob’s local DNS.
3. netcinema’s DNS returns the URL http://KingCDN.com/NetC6y&B23V.
4-5. Resolve http://KingCDN.com/NetC6y&B23V via KingCDN’s authoritative DNS, which returns the IP address of the KingCDN server with the video.
6. Request the video from the KingCDN server; it is streamed via HTTP.
Diagram labels: netcinema.com, KingCDN.com, netcinema’s authoritative DNS, KingCDN authoritative DNS.
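
The redirection in the resolution steps above can be imitated with a small lookup table. This is a toy model, not a real DNS client: the hostnames `video.netcinema.com` and `a1105.kingcdn.com`, the record table, and the address 192.0.2.10 (from a range reserved for documentation) are all made up for illustration. It assumes the content provider's DNS hands out a CNAME-style alias into the CDN's domain, and the CDN's DNS then returns an address.

```python
from typing import Dict, List, Tuple

# Hypothetical records: the content provider aliases its video host to a
# name in the CDN's domain; the CDN's DNS maps that name to a server address.
RECORDS: Dict[str, Tuple[str, str]] = {
    "video.netcinema.com": ("CNAME", "a1105.kingcdn.com"),
    "a1105.kingcdn.com": ("A", "192.0.2.10"),
}


def resolve(name: str, records: Dict[str, Tuple[str, str]],
            max_hops: int = 8) -> Tuple[str, List[str]]:
    """Follow aliases until an address record is found.

    Returns the address and the chain of names that were looked up.
    Raises KeyError for an unknown name.
    """
    chain = [name]
    for _ in range(max_hops):
        record_type, value = records[name.lower()]
        if record_type == "A":
            return value, chain
        name = value  # alias: continue with the CDN's name
        chain.append(name)
    raise RuntimeError("too many redirections")
```

The client only ever asks for the content provider's name; the handoff to the CDN happens inside name resolution.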

16 CDN cluster selection strategy
- Challenge: how does the CDN DNS select a “good” CDN node to stream to the client?
  - pick the CDN node geographically closest to the client
  - pick the CDN node with the shortest delay (or minimum number of hops) to the client (CDN nodes periodically ping access ISPs and report results to the CDN DNS)
  - IP anycast
- Alternative: let the client decide by giving it a list of several CDN servers
  - the client pings the servers and picks the “best”
  - the Netflix approach
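
Two of the selection strategies above, geographic closeness and shortest measured delay, can be compared with a short sketch. Everything here is an assumption made for illustration: the node names, the coordinates, the delay figures, and the use of great-circle distance as the meaning of "geographically closest".

```python
import math
from typing import Dict, Optional, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees


def great_circle_km(a: LatLon, b: LatLon) -> float:
    """Haversine distance between two points on a sphere of radius 6371 km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def geographically_closest(client: LatLon, nodes: Dict[str, LatLon]) -> str:
    """Pick the node with the smallest great-circle distance to the client."""
    return min(nodes, key=lambda name: great_circle_km(client, nodes[name]))


def lowest_delay(delays_ms: Dict[str, Optional[float]]) -> str:
    """Pick the node with the smallest measured delay; None means unreachable."""
    reachable = {name: d for name, d in delays_ms.items() if d is not None}
    if not reachable:
        raise ValueError("no reachable CDN node")
    return min(reachable, key=lambda name: reachable[name])
```

The two strategies can disagree: a nearby node on a congested path may show a higher delay than a more distant one, which is one reason to measure delay or let the client probe the servers itself.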

17 Case study: Netflix
- 30% of downstream US traffic in 2011
- Owns very little infrastructure and uses 3rd-party services:
  - its own registration and payment servers
  - Amazon (3rd-party) cloud services:
    - Netflix uploads the studio master to the Amazon cloud
    - multiple versions of the movie (different encodings) are created in the cloud
    - the versions are uploaded from the cloud to the CDNs
    - the cloud hosts Netflix web pages for user browsing
  - three 3rd-party CDNs host/stream Netflix content: Akamai, Limelight, Level-3

18 Case study: Netflix
1. Bob manages his Netflix account.
2. Bob browses Netflix videos.
3. A manifest file is returned for the requested video.
4. DASH streaming.
Also shown: copies of multiple versions of the video are uploaded to the CDNs.
Diagram labels: Netflix registration and accounting servers, Amazon cloud, Akamai CDN, Limelight CDN, Level-3 CDN.
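
With DASH, the manifest lists the available versions (encodings) of a video, and the client chooses which one to request. A common rule of thumb, sketched below, is to take the highest bitrate that the measured throughput can sustain. The manifest layout, the bitrates, and the example.com URLs are hypothetical; this is not Netflix's actual selection logic.

```python
from typing import Dict, List, Union

Version = Dict[str, Union[int, str]]

# Hypothetical manifest: one entry per encoding of the same video.
MANIFEST: List[Version] = [
    {"kbps": 235, "url": "http://cdn.example.com/video/235.mp4"},
    {"kbps": 1050, "url": "http://cdn.example.com/video/1050.mp4"},
    {"kbps": 3000, "url": "http://cdn.example.com/video/3000.mp4"},
]


def choose_version(manifest: List[Version], throughput_kbps: float) -> Version:
    """Pick the highest-bitrate version the measured throughput can sustain.

    Falls back to the lowest bitrate when even that exceeds the throughput.
    """
    affordable = [v for v in manifest if v["kbps"] <= throughput_kbps]
    if affordable:
        return max(affordable, key=lambda v: v["kbps"])
    return min(manifest, key=lambda v: v["kbps"])
```

A client would typically repeat this choice as its throughput estimate changes during playback.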

