Web Cache Consistency. “Requirements of performance, availability, and disconnected operation require us to relax the goal of semantic transparency.”

Presentation transcript:

Web Cache Consistency

“Requirements of performance, availability, and disconnected operation require us to relax the goal of semantic transparency.” - HTTP 1.1 specification

Any caching/replication framework must take steps to ensure that the cache does not deliver old copies of modified objects.

Issues for cache consistency in the Web:
  large number of clients/proxies
  most static objects don’t change very often
  weaker consistency requirements

Stale information might be OK, as long as it is “not too stale”.

Validation vs. Invalidation

Validation
  Proxy periodically polls server for updates to cached objects.
  How often to poll? (“freshness date”)
  Sync vs. async
Invalidation
  Server informs proxy if cached object is updated.

Validation vs. Invalidation: The Tradeoffs

What are the tradeoffs?
  Scale
  Consistency quality
  Performance and poll overhead
  Fast hit vs. slow hit
  Does popularity correlate with update rate?
Validation “works” today!
  Conditional GET with If-Modified-Since
  How to set the TTLs or Expires headers?
Design of a scalable invalidation architecture for the Web is a difficult challenge.
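To make the "performance and poll overhead" tradeoff concrete, here is a rough back-of-the-envelope sketch. It is not from the slides: the function names and the numbers are hypothetical, and it assumes every proxy sees at least one request for the object in every TTL window, so the polling figure is an upper bound.

```python
def polling_cost(horizon_s: int, ttl_s: int, proxies: int) -> int:
    """Validation: each proxy revalidates at most once per TTL window,
    whether or not the object actually changed (upper bound)."""
    return proxies * (horizon_s // ttl_s)


def invalidation_cost(updates: int, proxies: int) -> int:
    """Invalidation: the server sends one message per update to every
    proxy holding a copy (and must remember who holds copies)."""
    return proxies * updates


# Hypothetical workload: one day, 5-minute TTL, 1000 proxies, 2 updates.
day = 24 * 60 * 60
print(polling_cost(day, 300, 1000))   # validation messages
print(invalidation_cost(2, 1000))     # invalidation messages
```

Under these assumptions polling sends far more messages for a rarely updated object, but the server keeps no per-proxy state; invalidation reverses that, which is where the scalability concern comes from.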

Cache Expiration and Validation

HTTP 1.0 cache control
  Origin server may add a “freshness date” (Expires) response header...
  ...or the cache could determine expiration time (TTL) heuristically.
  Proxy must revalidate cache entry if it has expired.
  Last-Modified and If-Modified-Since
Whose clock do we use for absolute expiration times?

[Figure: message flow among Clients, Proxy, and Origin Server. Labels: "GET x, If-Modified-Since m"; "x, Last-Modified m, Expires t"; "304: Not Modified".]
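The expiration rules on this slide can be sketched as a small freshness check. This is an illustration under stated assumptions, not a real cache's code: the helper names are hypothetical, and the heuristic TTL of 10% of the time since Last-Modified is one commonly used rule of thumb rather than anything the slide specifies.

```python
from datetime import datetime, timedelta, timezone

# Assumption: heuristic TTL is 10% of the object's age at fetch time.
HEURISTIC_FRACTION = 0.1


def freshness_lifetime(date, expires=None, last_modified=None):
    """How long the cached copy may be served without revalidation."""
    if expires is not None:
        # Explicit "freshness date" from the origin server.
        return max(expires - date, timedelta(0))
    if last_modified is not None:
        # No Expires header: guess a TTL from how long the object
        # had gone unmodified when it was fetched.
        return max((date - last_modified) * HEURISTIC_FRACTION, timedelta(0))
    return timedelta(0)  # nothing to go on: always revalidate


def must_revalidate(now, date, expires=None, last_modified=None):
    """True once the entry has expired and needs If-Modified-Since."""
    return now - date >= freshness_lifetime(date, expires, last_modified)


fetched = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
modified = fetched - timedelta(days=10)   # heuristic TTL is then 1 day
print(must_revalidate(fetched + timedelta(hours=2), fetched, last_modified=modified))
print(must_revalidate(fetched + timedelta(days=2), fetched, last_modified=modified))
```

Note that `now` comes from the proxy's clock while `date` and `expires` come from the origin server's clock, which is exactly the "whose clock?" problem with absolute expiration times.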

Consistency: Variations on a Theme

Pipeline validations and Piggyback Cache Validations [Krishnamurthy and Wills]
  Opportunistically “prefetch” validations.
  Enough traffic to benefit?
Coarse granularity: volumes
  Cluster objects in volumes to reduce the number of validations when update rates are low.
Delta encoding [Mogul et al 1997]: fine-grained updates
Optimistic deltas: reduce latency of a consistency miss by sending a stale copy from cache, followed by the delta.
  Nice hack for cookied content.
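The idea behind delta encoding can be sketched in a few lines: the server sends only the difference between the cached copy and the current one, and the cache rebuilds the new version. The sketch below uses Python's `difflib` purely for illustration; the delta format and the function names are hypothetical and are not the wire format proposed by Mogul et al.

```python
from difflib import SequenceMatcher


def make_delta(old: str, new: str):
    """Server side: describe `new` as copies from `old` plus inserted text."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # reuse bytes the cache has
        elif tag in ("replace", "insert"):
            ops.append(("insert", new[j1:j2]))  # ship only the new bytes
        # "delete": nothing to send, the old range is simply skipped
    return ops


def apply_delta(old: str, ops) -> str:
    """Cache side: rebuild the current version from the stale copy."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            parts.append(old[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)


stale = "<html><body>Hello, Alice. Price: $10</body></html>"
fresh = "<html><body>Hello, Alice. Price: $12</body></html>"
delta = make_delta(stale, fresh)
assert apply_delta(stale, delta) == fresh
```

With optimistic deltas, the cache would send `stale` to the client immediately and forward `delta` when it arrives, so the client does the `apply_delta` step.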

HTTP 1.1

Specification effort started in W3C, finished in IETF... much later.
A number of research works influenced the specification.
HTTP 1.0 shows the importance of careful specification.
performance
  persistent connections with pipelining
  range requests, incremental update, deltas
caching
  cache control headers
negotiation of content attributes and encodings
  content attributes vs. transport attributes
  transport encodings for transmission through proxies
  Trailer header and trailer headers

Expiration and Validation in HTTP 1.1

HTTP 1.1 cache control allows origin server to:
  use relative instead of absolute expiration times (max-age);
  issue opaque validators (ETag for entity tag) instead of timestamps.
Origin server may specify which of several cached entries to use.

[Figure: message flow among Clients, Proxy, and Origin Server. Labels: "GET x, If-None-Match v"; "x, ETag v, max-age t"; "304: Not Modified, ETag v"; annotations "Age < t" and "Age = 0".]
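The max-age and ETag exchange can be sketched as a toy simulation. Everything here is hypothetical (the `Origin` and `Proxy` classes, the hash-based ETag, integer timestamps); it only illustrates the fast-hit, slow-hit, and miss paths, not a real proxy.

```python
import hashlib


class Origin:
    """Toy origin server for a single object x."""

    def __init__(self, body: str, max_age: int):
        self.body = body
        self.max_age = max_age
        self.full_responses = 0  # counts 200 responses carrying a body

    def get(self, if_none_match=None):
        # Assumption: the opaque validator is a short content hash.
        etag = hashlib.sha1(self.body.encode()).hexdigest()[:8]
        if if_none_match == etag:
            return 304, etag, self.max_age, None   # Not Modified, no body
        self.full_responses += 1
        return 200, etag, self.max_age, self.body


class Proxy:
    """Caches one object; returns (body, age) to the client."""

    def __init__(self, origin: Origin):
        self.origin = origin
        self.entry = None  # (etag, max_age, body, validated_at)

    def get(self, now: int):
        if self.entry is not None:
            etag, max_age, body, validated_at = self.entry
            age = now - validated_at
            if age < max_age:
                return body, age                   # fast hit: Age < t
            status, new_etag, new_max_age, new_body = self.origin.get(etag)
            if status == 304:                      # slow hit: revalidated
                self.entry = (etag, new_max_age, body, now)
                return body, 0
            self.entry = (new_etag, new_max_age, new_body, now)
            return new_body, 0                     # consistency miss
        _, etag, max_age, body = self.origin.get()
        self.entry = (etag, max_age, body, now)    # cold miss: Age = 0
        return body, 0


origin = Origin("v1", max_age=60)
proxy = Proxy(origin)
print(proxy.get(0), proxy.get(30), proxy.get(90))
```

Because the validator is opaque, the proxy never compares clocks with the origin; it only echoes the ETag back in If-None-Match.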

Other 1.1 Cache Control Features

Client may specify that no caching is to occur.
  no-store
Vary headers allow server to specify that certain request headers must also match if the proxy is to deem a cached response valid.
  language, character set, etc.
Server may specify that a response is not cacheable.
  no-store, or private (cacheable by the client's own cache but not by a shared proxy)
Client may explicitly request the proxy to validate the response.
  Pragma: no-cache header since HTTP 1.0; Cache-Control: no-cache in HTTP 1.1
Proxy may/should/must tell client the age of a cached response.
  Age header
Proxy may/should/must tell client that it could not validate a non-fresh cached response with the origin server.
  Warning header
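The Vary rule can be sketched as a matching check on the request headers the server named. The helper names below are hypothetical, and the comparison is a plain string match on header values, which is a simplification of what a real cache does.

```python
def vary_key(vary_header: str, request_headers: dict) -> tuple:
    """Build the secondary cache key from the headers listed in Vary."""
    names = sorted(n.strip().lower() for n in vary_header.split(",") if n.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return tuple((name, lowered.get(name)) for name in names)


def can_reuse(vary_header: str, cached_request: dict, new_request: dict) -> bool:
    """True if the cached response may answer the new request."""
    if vary_header.strip() == "*":
        return False  # "Vary: *" never matches; always go to the origin
    return vary_key(vary_header, cached_request) == vary_key(vary_header, new_request)


cached = {"Accept-Language": "en", "Accept-Charset": "utf-8", "User-Agent": "A"}
same_lang = {"accept-language": "en", "Accept-Charset": "utf-8", "User-Agent": "B"}
other_lang = {"Accept-Language": "fr", "Accept-Charset": "utf-8", "User-Agent": "A"}

print(can_reuse("Accept-Language, Accept-Charset", cached, same_lang))
print(can_reuse("Accept-Language, Accept-Charset", cached, other_lang))
```

Headers not listed in Vary (User-Agent here) are ignored, so the two English-language requests share one cached entry.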

The Role of the Content Developer

Use expiration dates where known
Limit the scope of cookies
If using cookies for personalization, use cache control headers to disable caching on the personalized objects
  What if you forget?
Decompose dynamic pages into cacheable and uncacheable components.
  Templates [Douglis97]
  Edge-side includes (Akamai)
  Base instance [WebExpress]
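As a minimal sketch of the advice above, a content developer might pick response headers along these lines. The function name and the policy are assumptions made for illustration; the directives themselves (`no-store`, `public`, `max-age`) are standard HTTP 1.1 Cache-Control values.

```python
def cache_headers(personalized: bool, max_age_s=None) -> dict:
    """Choose Cache-Control for a response (hypothetical policy)."""
    if personalized:
        # Cookie-driven content: keep it out of every cache.
        # ("private" is a weaker option that still allows the
        # browser's own cache but not shared proxies.)
        return {"Cache-Control": "no-store"}
    if max_age_s is not None:
        # Expiration is known: say so explicitly.
        return {"Cache-Control": f"public, max-age={max_age_s}"}
    return {}  # unknown lifetime: leave it to the cache's heuristics


print(cache_headers(personalized=True))
print(cache_headers(personalized=False, max_age_s=3600))
```

If the developer forgets the first branch, a shared proxy may hand one user's personalized page to another user.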

Cookies

HTTP cookies (RFC2109) have brought us a better Web. (Below, S is a server and C is a client.)
  S optionally includes arbitrary state as a cookie in a response.
  Cookie is opaque to C, but C saves the cookie.
  C sends the saved cookie in future requests to S, and possibly to other servers as well.
  Allows stateful servers for sessions, personalized content, etc.
But: cookies raise privacy and security issues.
  What did S put in that cookie? Can anyone else see it?
  How much space does it take up on my disk that I paid soooo much for?
  Cookies may allow third parties who are friends of S_1, ..., S_N to observe C’s movements among S_1, ..., S_N.
  Unverifiable transactions, e.g., DoubleClick and other ad services.

Unverifiable Transactions

Users may not know that they are interacting with DoubleClick.
  Amazon and MyCFO trust DoubleClick, but client is ignorant.
The user visits pages at many sites that reference DoubleClick.
DoubleClick’s cookie allows it to associate all the requests from a given user.
If the browser sends Referer headers, DoubleClick may gather information about all the sites the user visits that reference DoubleClick.

[Figure: message flow among the Client, mycfo.com, amazon.com, and third parties (doubleclick, akamai, etc.). Labels: "GET x"; "GET y"; "GET ad, Referer mycfo.com"; "ad, cookie c"; "GET ad, cookie c, Referer amazon.com/x"; "ad".]
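The tracking mechanism described on this slide can be sketched as a toy simulation. The `AdServer` and `Browser` classes and the host name `ads.example` are hypothetical; the point is only that one cookie plus the Referer header is enough for the third party to link visits across unrelated sites.

```python
from collections import defaultdict
from itertools import count


class AdServer:
    """Third party referenced by many sites (hypothetical)."""

    def __init__(self):
        self._ids = count(1)
        self.seen = defaultdict(list)  # cookie -> list of Referer values

    def get_ad(self, referer: str, cookie=None) -> str:
        if cookie is None:
            cookie = f"c{next(self._ids)}"  # first contact: assign a cookie
        self.seen[cookie].append(referer)   # log where this user came from
        return cookie                       # returned alongside the ad


class Browser:
    """Saves cookies per server and replays them on later requests."""

    def __init__(self):
        self.cookies = {}

    def visit(self, page_url: str, ad_server: AdServer, ad_host="ads.example"):
        # The page embeds an ad, so the browser fetches it, sending the
        # page URL as Referer and any cookie saved for the ad host.
        cookie = ad_server.get_ad(page_url, self.cookies.get(ad_host))
        self.cookies[ad_host] = cookie


ads = AdServer()
browser = Browser()
browser.visit("mycfo.com", ads)
browser.visit("amazon.com/x", ads)
print(dict(ads.seen))
```

The user only ever typed the two first-party URLs, yet the ad server ends up with both under a single cookie.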

WCDP

Sara Sprenkle led a discussion of WCDP, a protocol for server-driven consistency from IBM. Slides for this portion of the class may be found at:

It is important to understand the context of the server-driven approach, its role in CDNs, the opportunity to use invalidation, and how WCDP addresses the scalability concerns.