Web Cache Replacement Policies: Properties, Limitations and Implications
Fabrício Benevenuto, Fernando Duarte, Virgílio Almeida, Jussara Almeida
Computer Science Department
Federal University of Minas Gerais
Brazil

Summary
Introduction to Web caching
Motivations and goals
Evaluation methodology
–Performance metrics
–Workload description
–Caching system simulator
Experimental results
Conclusions and future work

Web Caching
Dramatic growth of the WWW in terms of content, users, servers and complexity
Web caching is a common strategy used to:
–reduce traffic over the Internet
–increase server scalability
–reduce network latency
Caching is deployed through Web proxies

Web Caching
Web proxies act as intermediaries for the traffic between HTTP clients and servers
Today the Web has a hierarchical topology: Clients → Proxies → Servers

Web Caching
Cache replacement is one of the issues that a proxy must manage:
–Since the cache has finite size, how does a proxy choose which page to remove when the cache is full?
Much research has addressed this question, and several cache replacement policies can be found in the literature
Key questions:
–Is the design of new cache replacement policies needed?
–What properties should new policies exploit to improve a caching system?

Goals
Investigate how much a new caching policy could improve caching system performance
Explore the main causes of periods of poor and high performance in caching systems

Evaluation Methodology
Evaluation of different metrics over time:
–Hit ratio
–Percentage of first-timers
–Maximum improvement
–Entropy
Time intervals of 1, 10 and 100 minutes
Use of real workloads

Performance Metric: Hit Ratio
Hit ratio is the percentage of requests satisfied by the cache
It is the most general metric used to evaluate the effectiveness of a caching policy
Hit ratio is measured over time to detect periods of performance variation
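
A minimal sketch of how hit ratio over time could be measured with a simple LRU cache. The function name, the `(timestamp, object_id)` request format, the capacity counted in objects rather than bytes, and the 600-second default interval are all assumptions made for illustration; this is not the simulator used in the study.

```python
from collections import OrderedDict, defaultdict


def lru_hit_ratio_per_interval(requests, capacity, interval_seconds=600):
    """Simulate an LRU cache and return {interval_index: hit_ratio}.

    requests: iterable of (timestamp_seconds, object_id), ordered by time.
    capacity: maximum number of objects kept in the cache (assumption:
              the cache size is counted in objects, not bytes).
    """
    cache = OrderedDict()          # keys ordered from least to most recently used
    hits = defaultdict(int)
    totals = defaultdict(int)

    for timestamp, obj in requests:
        bucket = int(timestamp // interval_seconds)
        totals[bucket] += 1
        if obj in cache:
            hits[bucket] += 1
            cache.move_to_end(obj)         # mark as most recently used
        else:
            cache[obj] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used object

    return {b: hits[b] / totals[b] for b in sorted(totals)}


if __name__ == "__main__":
    trace = [(0, "a"), (1, "b"), (2, "a"), (600, "c"), (601, "a"), (602, "b")]
    print(lru_hit_ratio_per_interval(trace, capacity=2))
```

Changing `interval_seconds` to 60 or 6000 would correspond to the 1-minute and 100-minute granularities.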

Performance Metric: Percentage of First-Timers
Caching policies cannot satisfy first-timers
–the object requested by a first-timer has never been requested in the past
A first-timer is the first request for an object in the trace
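
The definition above can be sketched as a single pass over the trace that remembers every object seen so far. The function name, the request format and the optional `seen` argument (for objects already observed in an earlier trace) are hypothetical choices for this example.

```python
from collections import defaultdict


def first_timer_percentage_per_interval(requests, interval_seconds=600, seen=None):
    """Return {interval_index: percentage of requests that are first-timers}.

    requests: iterable of (timestamp_seconds, object_id), ordered by time.
    seen: optional set of object ids already requested before this trace
          (assumption: lets an earlier trace count as history).
    """
    seen = set() if seen is None else set(seen)
    first_timers = defaultdict(int)
    totals = defaultdict(int)

    for timestamp, obj in requests:
        bucket = int(timestamp // interval_seconds)
        totals[bucket] += 1
        if obj not in seen:        # first request ever for this object
            seen.add(obj)
            first_timers[bucket] += 1

    return {b: 100.0 * first_timers[b] / totals[b] for b in sorted(totals)}


if __name__ == "__main__":
    trace = [(0, "a"), (1, "b"), (2, "a"), (600, "c"), (601, "a"), (602, "b")]
    print(first_timer_percentage_per_interval(trace))
```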

Performance Metric: Maximum Improvement
We evaluate how much a new caching policy could improve the hit ratio over the simple LRU policy
The maximum improvement MI is defined as:
Maximum improvement over LRU:
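
The sketch below is not the MI definition from this slide. It only illustrates one simple upper bound, under the assumption that an ideal policy with unbounded capacity misses only on first-timers, so its hit ratio is at most one minus the first-timer fraction. The function name and arguments are hypothetical.

```python
def improvement_upper_bound(lru_hit_ratio, first_timer_fraction):
    """Upper bound on the hit-ratio gain over LRU (illustrative assumption).

    Both arguments are fractions in [0, 1] measured over the same interval.
    An ideal policy can hit at most on every request that is not a
    first-timer, so the gain over LRU is at most
    (1 - first_timer_fraction) - lru_hit_ratio.
    """
    for value in (lru_hit_ratio, first_timer_fraction):
        if not 0.0 <= value <= 1.0:
            raise ValueError("ratios must be fractions between 0 and 1")
    ideal_hit_ratio = 1.0 - first_timer_fraction
    return max(0.0, ideal_hit_ratio - lru_hit_ratio)


if __name__ == "__main__":
    # Made-up numbers, not taken from the traces.
    print(improvement_upper_bound(lru_hit_ratio=0.40, first_timer_fraction=0.35))
```

With a finite cache the achievable gain is smaller than this bound, since even an optimal policy must evict objects that are requested again later.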

Performance Metric: Entropy
Entropy measures the concentration of popularity of a request stream
The higher the entropy, the lower the concentration of popularity
Caching policies should keep objects with a high probability of being referenced in the near future
Taking n distinct objects, where object i occurs with probability p_i, the entropy H(X) of a request stream is calculated as:
H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i

Performance Metric: Entropy
Entropy depends on the number of distinct objects
Use of the normalized entropy H_N:
H_N(X) = \frac{H(X)}{\log_2 n}
Investigate the influence of popularity on caching performance
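
A short sketch of how entropy and normalized entropy could be estimated from a request stream, assuming each p_i is the empirical request frequency of object i and that normalization divides by log2 of the number of distinct objects (the standard Shannon form). Function names are illustrative.

```python
import math
from collections import Counter


def entropy(objects):
    """Shannon entropy (in bits) of the empirical popularity distribution."""
    counts = Counter(objects)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def normalized_entropy(objects):
    """Entropy divided by log2(n), where n is the number of distinct objects.

    Returns a value in [0, 1]; 1 means all objects are equally popular.
    Streams with fewer than two distinct objects are given 0.0 by convention
    (an assumption of this sketch, since log2(1) = 0).
    """
    n = len(set(objects))
    if n <= 1:
        return 0.0
    return entropy(objects) / math.log2(n)


if __name__ == "__main__":
    uniform = ["a", "b", "c", "d"]            # no concentration of popularity
    skewed = ["a", "a", "a", "a", "a", "b"]   # popularity concentrated on "a"
    print(normalized_entropy(uniform), normalized_entropy(skewed))
```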

Experiment Setup
Real traces from proxy caches located at two points of the Web topology:
–Closer to clients: Federal University of Minas Gerais (UFMG)
–Closer to servers: National Laboratory for Applied Network Research (NLANR)
Cache size: 10% of the number of distinct objects
Cache replacement policy: simple LRU
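
As a rough illustration of this setup, the sketch below sizes an LRU cache as a fraction of the distinct objects, replays one trace only to warm the cache, and then measures the hit ratio on a second trace. Sizing the cache from the evaluation trace and counting capacity in objects are assumptions of this example, as are all function names.

```python
from collections import OrderedDict


def replay_lru(trace, cache, capacity):
    """Replay object ids through an LRU cache in place; return the hit count."""
    hits = 0
    for obj in trace:
        if obj in cache:
            hits += 1
            cache.move_to_end(obj)         # mark as most recently used
        else:
            cache[obj] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used object
    return hits


def evaluate_with_warmup(warm_trace, eval_trace, size_fraction=0.10):
    """Hit ratio on eval_trace after warming the cache with warm_trace.

    Assumption: capacity = size_fraction * distinct objects in eval_trace,
    with a minimum of one object.
    """
    if not eval_trace:
        return 0.0
    capacity = max(1, int(size_fraction * len(set(eval_trace))))
    cache = OrderedDict()
    replay_lru(warm_trace, cache, capacity)   # hits during warm-up are ignored
    hits = replay_lru(eval_trace, cache, capacity)
    return hits / len(eval_trace)


if __name__ == "__main__":
    print(evaluate_with_warmup(["a"], ["a", "b", "a", "b"], size_fraction=0.5))
```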

Workload Description

Name                 University 1   University 2   NLANR 1      NLANR 2
start date           01-10-2004     01-12-2004     01-18-2005   01-20-2005
# days               2              10             2            11
# requests           1,004,747      3,459,549      1,207,075    3,427,391
distinct objects     299,367        623,164        891,906      2,350,215
normalized entropy   0.8532         0.8268         0.9482       0.9329

Traces used
–Cache warming: University 1, NLANR 1
–Performance evaluation: University 2, NLANR 2
Higher concentration of popularity in the university traces (lower entropy)
Larger fraction of distinct objects in the NLANR traces, which significantly reduces caching performance

Experimental Results: Hit Ratio
Higher hit ratio for the University trace
Strong variation over time
What factors cause the variations in hit ratio?
Figure panels: proxy closer to clients; proxy closer to servers

Experimental Results: Percentage of First-Timers
Smaller percentage of first-timers at the proxy closer to clients
Correlation coefficient between hit ratio and the percentage of first-timers: -0.857 for NLANR and -0.962 for the university
Caching policies cannot satisfy first-timers, which makes them the most important factor behind periods of poor and good performance in the analyzed traces
Figure panels: proxy closer to clients; proxy closer to servers
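
A correlation coefficient like the ones reported above could be computed from two per-interval series as in the sketch below, assuming the Pearson coefficient is the one meant. The sample series are invented for the example and are not data from the traces.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented per-interval values (percent), for illustration only.
    hit_ratio = [42.0, 38.5, 45.1, 30.2, 36.7]
    first_timers = [35.0, 40.2, 31.8, 48.9, 41.5]
    print(round(pearson(hit_ratio, first_timers), 3))
```

A value close to -1 means that intervals with more first-timers tend to have a lower hit ratio.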

Experimental Results: Entropy
Proxy closer to clients: lower entropy → higher concentration of popularity
The LRU policy does not take advantage of all the locality of reference
Correlation coefficient between hit ratio and entropy: -0.787 for NLANR and -0.453 for the university
If we had a caching policy able to filter all the locality (entropy = 1), how much could the hit ratio be improved?
Figure panels: proxy closer to clients; proxy closer to servers

Experimental Results: Maximum Improvement
The hit ratio cannot be significantly improved for the trace closer to clients
The high number of first-timers reduces the hit ratio
Improving caching performance:
–Reorganization of the hierarchy of caches (cache placement)
–Caching system able to deal with first-timers
Figure panels: proxy closer to clients; proxy closer to servers

Conclusions and Future Work
Summary of main findings
–Strong variation of hit ratio over time
–High number of first-timers (higher close to servers): main cause of low hit ratio
–LRU policy is not able to filter the entire locality of a stream: small correlation with hit ratio
–The maximum improvement we could obtain over LRU: less than 5 percent closer to clients, 25 percent on average closer to servers
–Results suggest a reorganization of the cache topology and a caching system able to deal with the higher number of first-timers
Future work
–Cache placement: find the optimal cache organization in order to improve overall system performance
–Auto-adaptive cache system able to minimize periods of poor performance

Questions? Fabricio Benevenuto, Fernando Duarte, Virgilio Almeida, Jussara Almeida {fabricio, fernando, virgilio, jussara}@dcc.ufmg.br
