Web Caching Schemes For The Internet – cont. By Jia Wang.


1 Web Caching Schemes For The Internet – cont. By Jia Wang

2 Topics Cache resolution/routing Prefetching Cache replacement Cache coherency Other topics

3 Cache Resolution/Routing Most Web caching schemes rely on many Web caches scattered over the Internet Main challenge: how to quickly locate the appropriate cache No necessary relationship between a document's URL and its cache location Unmanageably large cache routing tables

4 Cache Resolution/Routing Out-of-date cache routing information leads to cache misses An ideal cache routing algorithm minimizes the cost of a cache miss

5 Cache Resolution/Routing Common approach – build a caching distribution tree from popular servers toward high-demand sources –Resolution via cache routing tables/hash functions Works well for popular documents

6 Cache Resolution/Routing What about less popular documents?

7 Cache Resolution/Routing Hit rate on web caches < 50%

8 Cache Routing Table Malpani – make a group of caches function as one –A cache is selected arbitrarily –In case of a miss: use IP multicast (why?) –Redirection

9 Cache Routing Table

10 (figure slide – no transcript text)

11 Advantages: –No bottlenecks –No single point of failure

12 Cache Routing Table Disadvantage: –Overhead Solutions?

13 (figure slide – no transcript text)

14 Cache Routing Table Harvest – organize caches in a hierarchy –Internet Cache Protocol (ICP) –In case of a miss: query siblings, then go upward
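The Harvest-style lookup on this slide – local cache first, then siblings, then upward – can be sketched as follows. This is an illustrative model, not ICP itself: real siblings are queried with ICP UDP messages, and the `Cache` class and origin stand-in are hypothetical.

```python
class Cache:
    """A node in a Harvest-style cache hierarchy (illustrative)."""
    def __init__(self, parent=None, siblings=()):
        self.store = {}
        self.parent = parent
        self.siblings = list(siblings)

def lookup(cache, url):
    """Resolve a URL: local store first, then siblings (queried with
    ICP messages in a real deployment), then the parent; the origin
    server is the last resort. Documents fetched from above are
    cached locally on the way back down."""
    if url in cache.store:
        return cache.store[url]
    for sibling in cache.siblings:
        if url in sibling.store:
            return sibling.store[url]
    if cache.parent is not None:
        doc = lookup(cache.parent, url)
    else:
        doc = f"<origin copy of {url}>"   # stand-in for an origin fetch
    cache.store[url] = doc
    return doc
```

A sibling hit avoids both the parent and the origin server, which is why misses only travel "upward" in the hierarchy.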

15 Cache Routing Table Adaptive Web Caching – mesh of caches –Distribution trees are built –Overlapping multicast groups –No root node will be overloaded –For less popular objects: long journey

16 (figure slide – no transcript text)

17 Hashing Function Cache Array Routing Protocol (CARP) –"Query-less" caching by hash function –Based on an "array membership list" and the URL for the exact cache location –Proxy removal: reassign 1/n of the URLs and distribute a new hash function
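The CARP idea can be sketched with a highest-random-weight hash: score each proxy against the URL and pick the maximum. This is a minimal sketch – the real CARP draft specifies its own hash and load factors, and the MD5 scoring and proxy names here are illustrative assumptions.

```python
import hashlib

def carp_route(url, proxies):
    """Pick the proxy whose combined hash with the URL is highest.
    Removing a proxy only remaps the roughly 1/n of URLs that had
    routed to it; all other URLs keep their assignments."""
    def score(proxy):
        # Illustrative scoring; real CARP defines its own hash function.
        digest = hashlib.md5((proxy + url).encode()).hexdigest()
        return int(digest, 16)
    return max(proxies, key=score)

proxies = ["cache-a.example.com", "cache-b.example.com", "cache-c.example.com"]
chosen = carp_route("http://example.com/page.html", proxies)
```

Because every client evaluates the same deterministic function, no query messages between caches are needed – hence "query-less" caching.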

18 Prefetching Caching documents at proxies improves Web performance, but the benefit is limited Maximum cache hit rate < 50%

19 Prefetching One way to increase hit rate: anticipate future requests and preload or prefetch

20 Prefetching Prefetching must be effective (why?) Prefetching can be applied in 3 ways: –Between browser clients and Web Servers –Between proxies and Web Servers –Between browser clients and proxies

21 Between browser clients and Web Servers Cunha – use a collection of Web clients –How to predict a user's future Web accesses from his past Web accesses –Two types of users: net surfer and conservative

22 (figure slide – no transcript text)

23 (figure slide – no transcript text)

24 Between browser clients and Web Servers Conservative – easy to guess which document the user will access next –Prefetching pays off well Net surfer – all documents have an equal probability of being accessed –The price to be paid in extra bandwidth is too high

25 Between proxies and Web Servers Markatos –Web servers push popular documents to Web proxies (Top-10) –Web proxies push popular documents to Web clients –Web servers can anticipate > 40% of clients' requests –Requires cooperation from Web servers

26 (figure slide – no transcript text)

27 (figure slide – no transcript text)

28 Between proxies and Web Servers Performance: –Top-10 manages to prefetch (up to) 60% of future requests –Less than 20% corresponding increase in traffic

29 Between browser clients and proxies Fan – reduce latency by prefetching between caching proxies and browsers –Relies on the proxy to predict which cached documents a user might reference next –Uses idle time between user requests to push documents to the user –Reduces client latency by 23%

30 Prefetching - summary First two approaches – increase WAN traffic Last approach – affects traffic over modems/LANs

31 Cache placement/replacement A good document placement/replacement algorithm can yield a high hit rate Cache placement has not been well studied Cache replacement can be classified into 3 categories:

32 Cache placement/replacement Traditional policies Key-based policies Cost-based policies

33 Cache replacement – traditional policies Least Recently Used – LRU Least Frequently Used – LFU Pitkow/Recker – LRU except if all objects are accessed within the same day
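The LRU policy above can be sketched in a few lines; this is a minimal object-count version (a real Web cache would account for sizes in bytes), using Python's `OrderedDict` to track recency.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU replacement: evict the least recently used object
    when the cache exceeds its capacity (counted in objects here,
    not bytes, for simplicity)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def get(self, url):
        if url not in self.store:
            return None
        self.store.move_to_end(url)         # mark as most recently used
        return self.store[url]

    def put(self, url, doc):
        if url in self.store:
            self.store.move_to_end(url)
        self.store[url] = doc
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the LRU entry
```

LFU would instead keep a hit counter per object and evict the minimum; Pitkow/Recker behaves like LRU but switches policy when every cached object was accessed the same day.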

34 Cache replacement – key-based policies Size – evicts the largest object (why?) LRU-MIN – biased in favor of smaller objects –Evicts the LRU object with size ≥ S; if none exists, retries with threshold S/2, S/4, etc.

35 Cache replacement – key-based policies LRU-Threshold – LRU but objects which have size > Threshold are never cached Lowest Latency First
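The LRU-MIN victim selection from the previous slide can be sketched as below. This is a sketch of the selection step only, under the assumption that `entries` is the cache content in LRU order (oldest first) and S is the size of the incoming object.

```python
def lru_min_evict(entries, incoming_size):
    """Pick a victim LRU-MIN style: prefer evicting the least recently
    used object whose size is at least the threshold S; if none
    qualifies, halve the threshold and try again, so small objects
    are only evicted as a last resort.
    `entries` is a list of (url, size) pairs in LRU order, oldest first."""
    threshold = incoming_size
    while threshold >= 1:
        for url, size in entries:           # scan oldest first
            if size >= threshold:
                return url
        threshold //= 2
    return entries[0][0] if entries else None   # fallback: plain LRU
```

In practice eviction repeats until enough space is freed; the sketch returns one victim per call.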

36 Cache replacement – cost-based policies GreedyDual-Size – associates a cost with each object –Evicts object with lowest cost/size Server-assisted – models the value of caching an object in terms of its fetching cost, size and cache prices –Evicts object with lowest value
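A minimal GreedyDual-Size sketch, assuming capacity is measured in bytes and cost is the fetch cost per object: each object gets value H = L + cost/size, the object with the lowest H is evicted, and the "inflation" clock L rises to the victim's H so idle objects age out.

```python
class GreedyDualSize:
    """GreedyDual-Size sketch: small or costly-to-fetch objects get
    higher H values and survive longer; eviction advances L so
    long-unreferenced objects eventually lose out."""
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.L = 0.0          # inflation value: rises on each eviction
        self.items = {}       # url -> [H, size]

    def access(self, url, size, cost):
        if url in self.items:
            _, size = self.items[url]       # keep the known size
        else:
            # Evict lowest-H objects until the new one fits.
            while self.items and self.used + size > self.capacity:
                victim = min(self.items, key=lambda u: self.items[u][0])
                self.L = self.items[victim][0]   # age the clock forward
                self.used -= self.items[victim][1]
                del self.items[victim]
            self.used += size
        self.items[url] = [self.L + cost / size, size]
```

With cost fixed at 1, this degenerates to evicting by size and recency together, which is one reason GreedyDual-Size performs well across workloads.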

37 Cache coherency Caches provide lower access latency Side effect: stale pages Every Web cache must keep the pages in its cache up to date

38 Cache coherency HTTP commands that assist Web proxies in maintaining cache coherence: HTTP GET Conditional GET: HTTP GET combined with the If-Modified-Since header Pragma: no-cache Last-Modified: date
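The conditional GET exchange can be modeled without a network round trip: the proxy sends If-Modified-Since carrying the cached copy's Last-Modified date, and the server answers 304 Not Modified or 200 with a fresh body. The `revalidate` helper and the example dates are illustrative, not part of any real library.

```python
from email.utils import parsedate_to_datetime

def revalidate(cached_last_modified, server_last_modified):
    """Model a conditional GET: compare the resource's current
    Last-Modified date on the server with the date the proxy sends
    in If-Modified-Since, and return the status the server would use."""
    request_headers = {"If-Modified-Since": cached_last_modified}
    changed = (parsedate_to_datetime(server_last_modified) >
               parsedate_to_datetime(cached_last_modified))
    # 304 lets the proxy keep serving its cached copy; 200 means refetch.
    return (200 if changed else 304), request_headers

status, req = revalidate("Mon, 01 Jan 2024 00:00:00 GMT",
                         "Mon, 01 Jan 2024 00:00:00 GMT")
```

The many 304 responses mentioned on the next slides are exactly this exchange repeated on every access.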

39 Cache coherence mechanisms Current cache coherency schemes provide two types of consistency –Strong cache consistency –Weak cache consistency

40 Strong cache consistency Client validation – polling every time –Cached resources are potentially out-of-date –If-Modified-Since sent with each access to the proxy –Many 304 responses

41 Strong cache consistency Server invalidation –Upon detecting a resource change, send invalidation message –Server must keep track of lists of clients –The lists can become out-of-date

42 Weak cache consistency Piggyback invalidation: three invalidation mechanisms are proposed:

43 Weak cache consistency Piggyback Cache Validation (PCV): on every communication between proxy and server, the proxy piggybacks a list of cached resources for validation

44 Weak cache consistency Piggyback Server Invalidation (PSI): on every communication between server and proxy, the server piggybacks a list of resources that have changed since the last access

45 Weak cache consistency Combination of PSI and PCV: depends on the time since the proxy last requested invalidation –If the time is small: PSI –For longer gaps: PCV
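The hybrid rule above reduces to a threshold decision; this sketch makes that explicit. The 60-second threshold is an illustrative stand-in, not a value from the slides.

```python
def choose_mechanism(seconds_since_last_contact, threshold=60):
    """PSI/PCV hybrid sketch: after a short gap the server's list of
    changed resources (PSI) is small and cheap to piggyback; after a
    long gap that list would be large, so the proxy piggybacks its own
    validation list (PCV) instead. The 60 s threshold is illustrative."""
    return "PSI" if seconds_since_last_contact <= threshold else "PCV"
```

Either way, the coherence traffic rides on messages that were being exchanged anyway, which is what makes these schemes "weak" but cheap.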

46 More topics Load balancing –Hot-spot Dynamic data caching –Active cache

47 Conclusion As Web services become more popular: –More network congestion –More server overloading Web caching is one of the most effective techniques Open problems – proxy placement, cache routing, dynamic data caching, fault tolerance, security, etc.

