WATCHMAN: A Data Warehouse Intelligent Cache Manager. Peter Scheuermann, Junho Shim, Radek Vingralek. Presentation by: Akash Jain.


1 WATCHMAN: A Data Warehouse Intelligent Cache Manager
Peter Scheuermann, Junho Shim, Radek Vingralek
Presentation by: Akash Jain

2 Motivation
- Data Warehouse
  - Infrequent updates.
  - For data analysis and Decision Support System (DSS) queries.
  - Query performance is important.
- DSS queries follow a hierarchical "drill-down analysis" pattern.
- Caching at multiple levels.
- Queries at a higher level are likely to occur frequently in a multiuser environment.

3 Overview
- Least Normalized Cost Replacement (LNC-R):
  - Goal: minimize response time (not maximize hit ratio).
  - Uses a profit metric that considers the average rate of reference of the retrieved set, its size, and the cost of the associated query.
- Least Normalized Cost Admission (LNC-A):
  - Goal: decide whether a retrieved set should be admitted to the cache.
  - Uses a profit metric similar to that of LNC-R.
  - No reference frequency information is available for newly retrieved sets.
- LNC-RA: integration of LNC-R with LNC-A.
- Sends hints to the buffer manager to improve its hit ratio.

4 Cache Replacement Algorithm: LNC-R
- Parameters per retrieved set $RS_i$ corresponding to query $Q_i$:
  - $\lambda_i$: average rate of reference to query $Q_i$.
  - $s_i$: size of the set retrieved by query $Q_i$.
  - $c_i$: cost of execution of query $Q_i$.
- Maximize the Cost Savings Ratio (CSR), defined as:
  - $\mathrm{CSR} = \dfrac{\sum_i c_i h_i}{\sum_i c_i r_i}$
  - $h_i$: number of times that references to query $Q_i$ were satisfied from the cache.
  - $r_i$: total number of references to query $Q_i$.
- Performance metric:
  - $\mathrm{profit}(RS_i) = \dfrac{\lambda_i \cdot c_i}{s_i}$
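The two formulas on this slide can be sketched in a few lines of Python. This is only an illustration: the `RetrievedSet` class, its field names, and the `(cost, hits, references)` tuple layout are assumptions made for the example, not part of WATCHMAN.

```python
from dataclasses import dataclass


@dataclass
class RetrievedSet:
    rate: float  # lambda_i: average rate of reference to the query
    size: float  # s_i: size of the retrieved set
    cost: float  # c_i: execution cost of the query


def profit(rs: RetrievedSet) -> float:
    """profit(RS_i) = lambda_i * c_i / s_i."""
    return rs.rate * rs.cost / rs.size


def cost_savings_ratio(stats) -> float:
    """CSR = sum(c_i * h_i) / sum(c_i * r_i).

    stats is an iterable of (cost, hits, references) tuples, one per query.
    """
    stats = list(stats)
    saved = sum(c * h for c, h, r in stats)
    total = sum(c * r for c, h, r in stats)
    return saved / total if total else 0.0
```

Because each hit is weighted by the query cost, a cache that misses one expensive query can have a high hit ratio and still a modest CSR, which is why the slide treats CSR rather than hit ratio as the objective.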

5 Cache Replacement Algorithm: LNC-R (Contd.)
- Algorithm:
  - Assume $RS_i$ with size $s_i$ is to be admitted, and $s_i$ exceeds the available free space.
  - Sort the retrieved sets in the cache in ascending order of profit.
  - Heuristic: size does matter!
- Calculation of $\lambda_i$:
  - Based on a moving average of the last $K$ inter-arrival times of requests to $RS_i$.
  - $\lambda_i = \dfrac{K}{t - t_K}$, where $t$ is the current time and $t_K$ is the time of the $K$-th most recent reference.
- If fewer than $K$ references are available, use the maximal available number.
- Give retrieved sets with fewer references higher priority for replacement.
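A minimal sketch of the rate estimate and victim selection described on this slide follows. It assumes that victims are taken from the front of the sorted list until enough space is freed, and that "fewer references first" is applied as the primary sort key with profit as the secondary key. The function names, the `(ref_times, size, cost)` tuple layout, and the default window `k=3` are hypothetical.

```python
def reference_rate(ref_times, now, k=3):
    """Estimate lambda_i = k' / (now - t_k'), where k' = min(k, references seen).

    ref_times is an ascending list of reference timestamps for one retrieved set.
    """
    recent = ref_times[-k:]      # the last k references, or fewer if not available
    if not recent:
        return 0.0
    elapsed = now - recent[0]    # recent[0] is the k'-th most recent reference
    return len(recent) / elapsed if elapsed > 0 else float("inf")


def select_victims(cache, needed, now, k=3):
    """Pick sets to evict until at least `needed` units of space are freed.

    cache maps a name to (ref_times, size, cost).
    """
    def key(name):
        ref_times, size, cost = cache[name]
        profit = reference_rate(ref_times, now, k) * cost / size
        # Sets with fewer recorded references are considered first,
        # then lower profit within the same reference count.
        return (min(len(ref_times), k), profit)

    victims, freed = [], 0
    for name in sorted(cache, key=key):
        if freed >= needed:
            break
        victims.append(name)
        freed += cache[name][1]
    return victims
```

Dividing by size in the profit is what the "size does matter" heuristic refers to: a large set must save proportionally more cost to stay in the cache.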

6 Cache Admission Algorithm: LNC-A
- Aim: prevent caching of retrieved sets that may cause response time degradation.
- Cache $RS_i$ only if $\mathrm{profit}(RS_i) > \mathrm{profit}(C)$, where $C$ is the set of replacement candidates and $\mathrm{profit}(C) = \sum_{RS_j \in C} \dfrac{\lambda_j \cdot c_j}{s_j}$.
- WATCHMAN retains the reference information and uses a moving average to calculate $\lambda_i$.
- If no previous information is present for $RS_i$, use the estimated profit, defined as $\text{e-profit}(RS_i) = \dfrac{c_i}{s_i}$.
- Replace profit by e-profit in the previous expressions.
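The admission test can be sketched as below. The sketch follows the formula as transcribed on this slide (a sum of per-set profits over the candidates) and applies e-profit to both sides of the comparison when the new set has no reference history. The function name, its signature, and the `(rate, cost, size)` tuple layout are hypothetical.

```python
def should_admit(new_cost, new_size, candidates, new_rate=None):
    """Decide whether a newly retrieved set should be cached.

    candidates is a list of (rate, cost, size) tuples for the cached sets
    that would have to be evicted to make room.
    new_rate is None when no reference information exists for the new set.
    """
    if new_rate is not None:
        # Reference history is available: compare profits.
        new_value = new_rate * new_cost / new_size
        candidate_value = sum(r * c / s for r, c, s in candidates)
    else:
        # No history: fall back to e-profit = c / s on both sides.
        new_value = new_cost / new_size
        candidate_value = sum(c / s for r, c, s in candidates)
    return new_value > candidate_value
```

For example, with candidates `[(1.0, 10.0, 5.0), (0.5, 8.0, 4.0)]`, a new set with cost 20 and size 10 is admitted when its known rate is 2.0, but rejected when it has no history, since its e-profit of 2.0 does not exceed the candidates' combined e-profit of 4.0.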

7 LNC* And LNC-RA
- Given:
  - $\{RS_1, RS_2, \ldots, RS_n\}$ = set of retrieved sets of all queries.
  - $r_1, r_2, \ldots, r_n$ = retrieved set reference string.
  - $\mathrm{Prob}(r_i = RS_k) = p_k$ for all $i > 1$.
- Obtain:
  - $I^* = \{\, i : RS_i \text{ is in the cache} \,\}$, where $I^* \subseteq N = \{1, 2, \ldots, n\}$, such that $\min \sum_{i \in N - I^*} p_i \cdot c_i$ subject to the constraint $\sum_{i \in I^*} s_i \le S$, where $S$ is the cache size.
  - This problem is NP-complete!
- Constrained model:
  - $\sum_{i \in I^*} s_i = S$.

8 LNC* And LNC-RA (Contd.)
- LNC* algorithm:
  - Sort $\{RS_1, RS_2, \ldots, RS_n\}$ in descending order of $p_i \cdot c_i / s_i$.
  - Assign to $I^*$ from the start of the list until the cache is full.
- LNC-RA approximates LNC*:
  - $p_i = \lambda_i / \lambda$, where $\lambda = \sum_{i \in N} \lambda_i$.
  - Using a sample of the last $K$ references, as $K \to \infty$, LNC-RA converges to $I^*$ for sufficiently long reference strings.
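The greedy LNC* selection and the rate-to-probability normalization can be sketched as follows. The sketch reads "until the cache is full" as stopping at the first set that no longer fits, which is an assumption. The function names and the `(p, c, s)` tuple layout are hypothetical.

```python
def normalize_rates(rates):
    """p_i = lambda_i / lambda, where lambda is the sum of all rates."""
    total = sum(rates)
    return [r / total for r in rates] if total else [0.0 for _ in rates]


def lnc_star(sets, capacity):
    """Greedy LNC* selection.

    sets maps an index i to (p_i, c_i, s_i); capacity is the cache size S.
    Returns the chosen indices in descending order of p_i * c_i / s_i.
    """
    order = sorted(
        sets,
        key=lambda i: sets[i][0] * sets[i][1] / sets[i][2],
        reverse=True,
    )
    chosen, used = [], 0
    for i in order:
        size = sets[i][2]
        if used + size > capacity:
            break  # assumption: stop at the first set that does not fit
        chosen.append(i)
        used += size
    return chosen
```

With `sets = {1: (0.5, 10, 5), 2: (0.3, 20, 5), 3: (0.2, 5, 10)}` and a capacity of 10, the ranking values are 1.0, 1.2, and 0.1, so sets 2 and 1 fill the cache exactly and set 3 is left out.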

9 Retained Reference Information Problem
- Form of starvation:
  - A new retrieved set in the cache is the first candidate for eviction.
  - Reason: we are not storing the reference information.
- Proposal (another paper):
  - Retain reference information only for a certain period after the last reference.
  - Problems:
    - Does not consider size and cost.
    - Does not take the cache size into account.
- WATCHMAN's proposal:
  - Evict the reference information of a set whenever the profit associated with the set is smaller than the least profit among all cached sets.
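WATCHMAN's rule for discarding retained reference information can be sketched as a simple filter. The function name and the dictionary layout are hypothetical, and keeping everything when the cache is empty is an assumption the slide does not address.

```python
def prune_retained_info(retained, cached_profits):
    """Drop retained reference information for low-profit sets.

    retained maps the name of a set that is not in the cache to its profit.
    cached_profits holds the profits of the sets currently in the cache.
    """
    cached_profits = list(cached_profits)
    if not cached_profits:
        # Assumption: with an empty cache there is no threshold, so keep everything.
        return dict(retained)
    threshold = min(cached_profits)
    # Keep information only for sets whose profit is not below the least cached profit.
    return {name: p for name, p in retained.items() if p >= threshold}
```

Unlike a fixed retention period, this threshold moves with the cache contents, so it reflects size, cost, and cache size through the profit values.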

10 Conclusion
- Similar ideas can be applied to Web caching.
- Not such a novel idea; the algorithms are known in theory.
- Hints can improve the performance of the buffer manager.
- LNC-RA improves CSR by a factor of 3 when compared with LRU.
- LNC-A improves CSR by 19% on average.

