Proxy-Server Architectures for OLAP Panos Kalnis, Dimitris Papadias THE HONG KONG UNIVERSITY OF SCIENCE AND TECHNOLOGY.


1 Proxy-Server Architectures for OLAP Panos Kalnis, Dimitris Papadias THE HONG KONG UNIVERSITY OF SCIENCE AND TECHNOLOGY

2 The Problem
• Data warehouses: Large repositories of historical, summarized information
• Distributed: Centralized or decentralized. Static structure!
• WWW: New opportunities to access warehouses. Example: Stock market data
• Professional brokers: Access the warehouse directly through special-purpose OLAP software
• Individual investors around the world: Use web browsers. Slow network? Server overloading? Caching?
[Figure: OLAP clients in Singapore, Hong Kong, Tokyo and London connected through the Internet to the Stock Market Warehouse]

3 OLAP Cache Servers (OCS)
• Similar to WWW Proxy-Servers
• Geographically spanned and connected through an arbitrary network
• They cache results from OLAP queries
• Can derive new results from the cached data
• Clients connect to an OCS. If the OCS cannot answer, the query is redirected to a neighbor OCS or to the warehouse
• Result: Lower network cost, better scalability, lower response time
[Figure: OLAP clients in Singapore, Hong Kong, Tokyo and London connected through OCSs and the Internet to the Stock Market Warehouse]

4 OCS vs. WWW Proxy-Servers
• OCS has computational capabilities.
• The cache admission and replacement policies are optimized for OLAP operations.
• OCS can update its contents incrementally, instead of invalidating the cached data

5 Background
• Data Cube Lattice: Interdependencies among views
    SELECT P_id, T_id, SUM(Sales)
    FROM data
    GROUP BY P_id, T_id
• Client-Server OLAP Caching
• Watchman: Semantic caching
• Dynamat: Stores fragments
• Caching chunks
• OCSs may use any of these methods
• The prototype caches entire views
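The lattice dependency above can be illustrated with a small Python sketch. It assumes that a view is identified by its list of group-by attributes and that the aggregate is SUM, as in the query on the slide. The function names `can_derive` and `roll_up` and the sample rows are illustrative assumptions, not part of the presentation.

```python
from collections import defaultdict

def can_derive(target_dims, cached_dims):
    """A view is derivable from a cached view that groups by a superset of its attributes."""
    return set(target_dims) <= set(cached_dims)

def roll_up(rows, cached_dims, target_dims):
    """Aggregate a cached SUM view down to a coarser group-by (illustrative helper)."""
    if not can_derive(target_dims, cached_dims):
        raise ValueError("target view is not derivable from the cached view")
    # Positions of the target attributes inside each cached row key
    idx = [cached_dims.index(d) for d in target_dims]
    totals = defaultdict(int)
    for *key, sales in rows:
        totals[tuple(key[i] for i in idx)] += sales
    return dict(totals)

# Hypothetical cached result of the GROUP BY P_id, T_id query
cached = [("p1", "t1", 10), ("p1", "t2", 5), ("p2", "t1", 7)]
by_product = roll_up(cached, ["P_id", "T_id"], ["P_id"])
```

Here the coarser view grouped by `P_id` alone is answered from the cached (`P_id`, `T_id`) view without contacting the warehouse, which is the kind of derivation an OCS with computational capabilities could perform.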

6 System Architecture
• Centralized: Query optimization and cache control in a central site (intranet)
• Semi-centralized: Only query optimization in central site. Each OCS controls its local cache
• Autonomous: All decisions are taken locally (internet)
• Multiple levels of caching
• Cooperation among OCSs
• Physical organization and fragmentation may differ in each OCS

7 Query Optimizer
• A client sends a query q
Autonomous policy:
i. OCS has the exact answer
ii. OCS cannot answer q
iii. OCS can derive q
Cost = Read + Transfer
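The cost formula above can be sketched in Python as follows. The slide only states Cost = Read + Transfer; the proxies used here (read cost as the number of rows scanned, transfer cost as result rows times a per-row link cost) and all names and numbers are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    site: str         # where the query would be answered
    view_rows: int    # rows that must be read at that site (assumed read-cost proxy)
    link_cost: float  # assumed cost per result row shipped to the client's OCS (0 if local)

def plan_cost(c, result_rows):
    """Cost = Read + Transfer, under the assumed proxies above."""
    read = c.view_rows
    transfer = c.link_cost * result_rows
    return read + transfer

def choose_site(candidates, result_rows):
    """Pick the candidate with the lowest total cost."""
    return min(candidates, key=lambda c: plan_cost(c, result_rows))

# Hypothetical alternatives for one query q with a 1,000-row result
candidates = [
    Candidate("local OCS (derive q from a larger cached view)", 50_000, 0.0),
    Candidate("neighbor OCS (exact answer)", 1_000, 20.0),
    Candidate("warehouse", 1_000, 80.0),
]
best = choose_site(candidates, result_rows=1_000)
```

With these made-up numbers, deriving q locally avoids any transfer but reads a large view, so the neighbor holding the exact answer is cheaper overall.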

8 Query Optimizer (cont.)
• Autonomous: Scalable, easy to implement, high availability.
• Large, unstructured, dynamic environments
• BUT may produce inefficient plans
• Centralized (and semi-centralized):
• A central site has global information for all OCSs.
• Creates the execution and routing plan for all queries
• Low availability, low scalability
• Suitable for intranets

9 Caching Policy: Autonomous
• Lower Benefit First (LBF): Considers interdependencies, but:
• Cost() difficult to calculate; if v cannot be answered locally we assume that it is answered by the warehouse
• The complexity of LBF grows quadratically with the number of materialized views
• We evict a set from the cache if the combined benefit < benefit(u). Select the victim set: Similar idea to [HRU96]
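The admission rule above (evict a victim set only if its combined benefit is below benefit(u)) can be sketched as follows. This is a simplified illustration, not the LBF algorithm itself: it assumes benefit values are already known and fixed, and it picks victims greedily in increasing order of benefit, whereas LBF accounts for interdependencies among views. The function name and the data layout are hypothetical.

```python
def select_victims(cache, new_size, new_benefit, capacity):
    """Return the views to evict so that a new view u fits, or None to reject u.

    cache maps a view name to (size, benefit). Assumption: benefits are
    precomputed and do not change as other views are evicted.
    """
    used = sum(size for size, _ in cache.values())
    free = capacity - used
    victims, combined = [], 0.0
    # Consider the lowest-benefit views first
    for name, (size, benefit) in sorted(cache.items(), key=lambda kv: kv[1][1]):
        if free >= new_size:
            break
        victims.append(name)
        combined += benefit
        free += size
    if free < new_size:
        return None  # u does not fit even with an empty cache
    if victims and combined >= new_benefit:
        return None  # the victim set is worth at least as much as u
    return victims

# Hypothetical full cache of capacity 100
cache = {"A": (40, 5.0), "B": (30, 2.0), "C": (30, 9.0)}
admit = select_victims(cache, new_size=50, new_benefit=8.0, capacity=100)
reject = select_victims(cache, new_size=50, new_benefit=6.0, capacity=100)
```

In the first call the victims B and A have a combined benefit of 7, below the new view's benefit of 8, so the new view is admitted; in the second call the same victim set outweighs the new view and it is rejected.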

10 Caching Policy: Centralized
• All the decisions are taken at the central site
• Centralized policy uses Smaller Penalty First (SPF)
• Experiments show that the difference between SPF and LBF is not significant
• In general: A bad decision of the caching algorithm does not affect the performance significantly, BUT a bad decision of the optimizer has significant impact

11 Updates
• Changes are propagated periodically to the warehouse. It computes deltas for its materialized views
• No down time for the OCSs
• OCS updates its cache on-demand: Invalidate vs. incrementally update
• Deltas are treated as normal data
• Deltas are evicted at the end of the update period
• Non-updated results are also evicted
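The incremental alternative to invalidation can be sketched in Python. The sketch assumes a cached SUM view keyed by its group-by values and a delta that carries, for each key, the change in the sum over the update period; the function name and sample data are illustrative, not taken from the presentation.

```python
def apply_delta(cached_view, delta):
    """Merge a delta into a cached SUM view and return the updated view.

    Both arguments map a group-by key to a summed measure. Keys that appear
    only in the delta are new groups (assumption: SUM is the aggregate).
    """
    updated = dict(cached_view)
    for key, change in delta.items():
        updated[key] = updated.get(key, 0) + change
    return updated

# Hypothetical cached view and delta for one update period
cached_view = {("p1", "t1"): 10, ("p2", "t1"): 7}
delta = {("p1", "t1"): 3, ("p3", "t2"): 4}
fresh = apply_delta(cached_view, delta)
```

Because the delta has the same shape as a view, it can be cached, transferred and derived like any other result, which matches the statement that deltas are treated as normal data.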

12 Experimental Setup
• APB and TPC-H
• C max = max Cache as a percentage of the entire cube
• 1500 queries at each OCS
[Figures: OCS configuration; Client-Side-Cache; Worst case; DCSR vs. C max]

13 Effect of Network Cost
• 3 OCSs: we vary the speed of the links to the DW
• In slow networks, OCSs utilize the contents of their neighbors
• In fast networks, many queries reach the warehouse, because the computation cost is lower
[Figures: DCSR vs. C max; Warehouse Hit Ratio vs. C max]

14 Autonomous vs. Semi-centralized
• Centralized
• Semi-Centralized
• High tightness or many OCSs
• Autonomous
• Semi-Centralized
[Figures: DCSR vs. # of OCSs; DCSR vs. tightness (100 OCSs)]

15 Conclusions
• OCS: Architecture for caching OLAP results
• Beneficial for ad-hoc, geographically spanned and possibly mobile users, who sporadically need to access a warehouse
• Complementary to both client-side-cache systems and distributed OLAP approaches
• Future work: Prototype on top of a DBMS, support of multiple DWs, finer granularity of cached data, special queries.

