Cache Tables: Paving the way for an Adaptive Database Cache Mehmet Altınel, Christof Bornhövd, C. Mohan, Hamid Pirahesh, Berthold Reinwald (IBM Almaden.

Presentation transcript:

Cache Tables: Paving the way for an Adaptive Database Cache
Mehmet Altınel, Christof Bornhövd, C. Mohan, Hamid Pirahesh, Berthold Reinwald (IBM Almaden Research Center)
Sailesh Krishnamurthy (Computer Science Division, UC Berkeley)
Presented by: Umar Farooq Minhas, October 04, 2006

2 Motivation
  - Issues
    - Response time
    - Scalability
  - Widespread use of Transactional Web Applications (TWAs) in enterprise applications
  - Broad range of components, e.g. network load balancers, HTTP servers, application servers, ..., databases
  - Solutions
    - Caching of static HTML pages
    - Multiple levels of caches

3 Motivation (contd.)
  - Drawbacks of static caching
    - TWAs tend to be more and more dynamic
    - High volumes of data
    - Highly personalized content
  - Run business logic in remote application servers close to end users
    - Reduced response time
    - Reduced load on in-house systems
    - Benefits are limited by how often the remote server needs to access the backend DB
  - Proposed solution: DBCache
    - Allows DB caching at mid-tier nodes, remote data centers and edge servers

4 DBCache: Overview
  - Built using a full-fledged DBMS, DB2
    - Reduced development effort
    - Allows caching of related DB objects: triggers, constraints, indices, stored procedures, ...
    - Makes use of existing distributed query execution
  - Provides cache transparency
  - Supports both full-table and partial-table caching
  - On-demand caching
    - Adapts to dynamically changing loads
    - Exploits typical characteristics of TWA queries

5 DBCache: Contributions
  - Database cache model
  - Introduces a new DB object, the 'Cache Table'
  - Dynamic/static caching support
  - Novel query re-write scheme
  - Cache load and maintenance mechanisms

6 Outline
  - Motivation
  - DBCache: Overview
  - Cache Tables
  - Dynamic Cache Model
  - Query Compilation
  - Cache Table Population and Maintenance
  - Performance Evaluation
  - Conclusions & Future Work
  - Discussion

7 Cache Tables
  - A Cache Table is a database object by which an end user can specify that a table (the cache table) in a database (the cache database) is a cache of a table (the backend table) in another database (the backend database)
  - [Diagram: Cache Table in the Cache DB, Backend Table in the Backend DB]
  - Two types of cache tables are supported:
    - Declarative/static cache tables
    - Dynamic cache tables

8 Declarative/Static Cache Tables
  - Use declarative cache tables when the table contents are static and known up front
  - Similar to materialized views
  - The entire table is cached if no predicate is defined
  - Exploits the existing materialized view support in DB2

9 Dynamic Cache Tables
  - Populated on demand
  - Provide adaptability
  - Can choose to cache only "hot" items

10 DBCache Schema Setup
  - The cache schema is an exact mirror of the backend DB schema
  - Each backend DB table is represented by either a Cache Table or a Nickname (caching disabled)
  - Requires no change to existing queries
  - Allows caching of other relevant logical and physical objects

12 Dynamic Cache Model: Key concepts
  - Cache Keys
    - Defined on a cache table column
    - Can be non-unique
    - Must be 'domain-complete'
      - Unique/primary key columns are complete by definition
      - Guarantees the correctness of equality predicates
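The slide does not spell out what 'domain-complete' means. A minimal sketch, assuming the usual reading (for every value of the column that appears in the cache table, all backend rows with that value are also cached) and assuming the cache rows are a subset of the backend rows. The function name and the dict-based row layout are illustrative only and are not part of DBCache.

```python
from collections import Counter


def is_domain_complete(cache_rows, backend_rows, column):
    """Check the assumed domain-completeness property for one column.

    For each value v found in `column` of the cache, the cache must hold
    as many rows with that value as the backend does. Rows are plain dicts.
    """
    cached = Counter(row[column] for row in cache_rows)
    backend = Counter(row[column] for row in backend_rows)
    return all(backend[value] == count for value, count in cached.items())


# Hypothetical data: two customers in "SJ", one in "NY".
backend = [
    {"id": 1, "city": "SJ"},
    {"id": 2, "city": "SJ"},
    {"id": 3, "city": "NY"},
]
complete_cache = backend[:2]   # both "SJ" rows are cached
partial_cache = backend[:1]    # only one of the two "SJ" rows is cached

city_ok = is_domain_complete(complete_cache, backend, "city")
city_broken = is_domain_complete(partial_cache, backend, "city")
id_ok = is_domain_complete(partial_cache, backend, "id")
```

With the partial cache, an equality predicate such as city = 'SJ' would return one row instead of two, which is why the column must be complete before it can serve as a cache key. The unique column `id` passes the check for any subset of rows, matching the slide's remark that unique columns are complete by definition.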

13 Dynamic Cache Model: Key concepts (contd.)
  - Referential Cache Constraints (RCCs)
    - Defined between any columns of two cache tables
    - Create a cache-parent/cache-child relationship
    - Guarantee the correctness of equi-join predicates
    - Somewhat similar to referential integrity constraints
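A minimal sketch of what an RCC is taken to guarantee, assuming this reading: for every value present in the cache-parent column, all backend rows of the cache-child table with a matching value are present in the cache-child table. Table names, column names and the helper `rcc_holds` are hypothetical.

```python
def rcc_holds(parent_cache, parent_col, child_cache, child_backend, child_col):
    """Return True if the assumed RCC property holds for the given tables.

    Every value in parent_cache[parent_col] must have all of its matching
    child rows (as found in the backend child table) present in the cache.
    """
    parent_values = {row[parent_col] for row in parent_cache}
    for value in parent_values:
        needed = sum(1 for row in child_backend if row[child_col] == value)
        cached = sum(1 for row in child_cache if row[child_col] == value)
        if cached != needed:
            return False
    return True


# Hypothetical customer (parent) and orders (child) tables.
customer_cache = [{"id": 1}]
orders_backend = [
    {"order_id": 10, "cust_id": 1},
    {"order_id": 11, "cust_id": 1},
    {"order_id": 12, "cust_id": 2},
]
orders_full = orders_backend[:2]     # both orders of customer 1 are cached
orders_partial = orders_backend[:1]  # one order of customer 1 is missing

join_safe = rcc_holds(customer_cache, "id", orders_full, orders_backend, "cust_id")
join_unsafe = rcc_holds(customer_cache, "id", orders_partial, orders_backend, "cust_id")
```

When the check passes, the equi-join customer.id = orders.cust_id can be answered from the cache for every cached customer without missing any order.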

14 Dynamic Cache Model: Key concepts (contd.)
  - Cache Groups
    - A set of related cache tables whose content is (directly or transitively) populated by the values of one or more cache keys of a single cache table, called the root table
    - Tables reachable from the root table through RCCs are called member tables
    - Advantages
      - Application context is recognized more easily
      - Helps avoid conflicting cache constraints

15 Dynamic Cache Model: Key concepts (contd.)
  - Cache Groups (contd.)
    - Represented by a directed graph called the cache group graph: nodes denote cache tables and edges denote RCCs
    - The direction of an RCC edge is from the cache-parent to the cache-child
    - Bi-directional edges are possible
    - Two or more groups can overlap
      - Captured in connectivity graphs
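The cache group graph described above can be sketched as a plain adjacency map, with member tables found by a breadth-first traversal from the root table. This is an illustration only; the table names and the `member_tables` helper are hypothetical, and a bi-directional RCC is modelled as two directed edges.

```python
from collections import deque


def member_tables(rccs, root):
    """Return the tables reachable from `root` via RCC edges.

    `rccs` is a list of (cache_parent, cache_child) pairs, one per
    directed edge of the cache group graph.
    """
    graph = {}
    for parent, child in rccs:
        graph.setdefault(parent, set()).add(child)

    seen = {root}
    queue = deque([root])
    while queue:
        table = queue.popleft()
        for child in graph.get(table, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen - {root}


# Hypothetical cache group rooted at "customer".
rccs = [
    ("customer", "orders"),
    ("orders", "lineitem"),
    ("lineitem", "orders"),   # together with the edge above: bi-directional
    ("part", "supplier"),     # belongs to a different cache group
]
members = member_tables(rccs, "customer")
```

Here `members` contains "orders" and "lineitem", while "part" and "supplier" are not reachable from "customer" and so fall outside this group.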

16 Dynamic Cache Model: Issues with Cache Constraints
  - Cache constraints can cause unexpected cache loads, a phenomenon called the recursive cache load problem
  - A cache group is called safe if it avoids this problem
  - How can group safety be ensured?

17 Dynamic Cache Model: Rules for cache group safety
  - Rule 1: A cache group graph must not include any heterogeneous cycles.
  - Rule 2: A cache table must not have more than one non-unique domain-complete column.
  - A new cache constraint is created only if it does not violate Rule 1 or Rule 2.
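A rough sketch of how the two rules could be checked before a new constraint is admitted. The slide does not define "heterogeneous cycle"; the code assumes a cycle is heterogeneous when some table on it is entered through one column and left through a different one, and this assumption should be checked against the paper. The data layout and function names are hypothetical, and cycles are found by brute-force enumeration, which is adequate only for small graphs.

```python
def rule_2_violations(columns):
    """Tables with more than one non-unique domain-complete column.

    `columns` maps table -> list of (name, is_unique, is_domain_complete).
    """
    return [
        table
        for table, cols in columns.items()
        if sum(1 for _, unique, complete in cols if complete and not unique) > 1
    ]


def _is_heterogeneous(cycle):
    # Each edge is (parent_table, parent_col, child_table, child_col).
    # A table is entered via edge[3] and left via the next edge's parent_col.
    for i, edge in enumerate(cycle):
        following = cycle[(i + 1) % len(cycle)]
        if edge[3] != following[1]:
            return True
    return False


def has_heterogeneous_cycle(rccs):
    """Rule 1 check under the assumed definition of a heterogeneous cycle."""
    out_edges = {}
    for edge in rccs:
        out_edges.setdefault(edge[0], []).append(edge)

    def search(start, path, visited):
        current = path[-1][2] if path else start
        for edge in out_edges.get(current, []):
            target = edge[2]
            if target == start:
                if _is_heterogeneous(path + [edge]):
                    return True
            elif target not in visited:
                if search(start, path + [edge], visited | {target}):
                    return True
        return False

    return any(search(table, [], {table}) for table in out_edges)


# Hypothetical constraints: the first cycle uses one column per table,
# the second enters B via "y" but leaves it via "z".
homogeneous = [
    ("orders", "cust_id", "customer", "id"),
    ("customer", "id", "orders", "cust_id"),
]
heterogeneous = [("A", "x", "B", "y"), ("B", "z", "A", "x")]
columns = {
    "orders": [("order_id", True, True), ("cust_id", False, True)],
    "item": [("color", False, True), ("size", False, True)],
}
```

Under these assumptions, `homogeneous` passes Rule 1, `heterogeneous` fails it, and the table "item" fails Rule 2 because it has two non-unique domain-complete columns.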

19 Query Compilation
  - Declarative Cache Tables
    - The existing materialized view matching mechanism in DB2 is exploited
    - Name switching
  - Dynamic Cache Tables
    - Generate two plans: a local plan and a remote plan
    - Choose between them at run time through a switch operator, which uses the probe query to decide which leg to execute
    - Janus (two-headed) plan: named after Janus, the Roman god of gates, doors, doorways, beginnings and endings (month of January?)

20 Query Compilation: Constructing a Janus Plan
  - Start from the initial query plan
  - 1. Remote query plan: replace Cache Table names with Nicknames
  - 2. Probe query: generate it by checking all equality predicates that can potentially participate in the probe query condition; if none is found, ABORT (the remote query plan gets executed)
  - Local query plan (built from the cloned input query graph): replace Nicknames with eligible Cache Table names from step [...]
  - Insert a switch operator on top of the remote, local and probe query plans
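The run-time behaviour of a Janus plan can be sketched as a tiny simulation: a probe checks whether the cache holds the requested cache key value, and a switch then runs exactly one of the two legs. This is not DB2 code. Tables are lists of dicts, the function names are invented, and the local leg is only correct under the assumption that the probed column is domain-complete.

```python
def probe(cache_table, column, value):
    """Probe query: does the cache hold any row for this cache key value?"""
    return any(row[column] == value for row in cache_table)


def janus_execute(cache_table, backend_table, column, value):
    """Switch operator: evaluate the probe, then run exactly one leg.

    Returns the name of the chosen leg together with the result rows.
    The "remote" leg stands in for access through a Nickname.
    """
    if probe(cache_table, column, value):
        leg, source = "local", cache_table
    else:
        leg, source = "remote", backend_table
    rows = [row for row in source if row[column] == value]
    return leg, rows


# Hypothetical tables: the cache holds every "SJ" row and no "NY" row.
backend = [
    {"id": 1, "city": "SJ"},
    {"id": 2, "city": "SJ"},
    {"id": 3, "city": "NY"},
]
cache = backend[:2]

hit = janus_execute(cache, backend, "city", "SJ")
miss = janus_execute(cache, backend, "city", "NY")
```

The query text is the same in both cases, which is the point of the design: the decision between cache and backend is deferred from compile time to execution time.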

22 Cache Table Population & Maintenance
  - Declarative Cache Tables
    - Rely on the DPropR utility, IBM's asynchronous data replication tool
  - Dynamic Cache Tables
    - On-demand loading
      - Cache key values that fail the probe query are used to extract data
      - Extracted data is populated asynchronously by a cache daemon
    - Cache invalidation
      - Invalidation messages are generated and sent to the cache daemon
      - The cache daemon generates and executes deletes against the cache DB
      - Updated rows get loaded with new requests
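A minimal sketch of the dynamic cache table life cycle listed above: misses and invalidations are queued as messages, and a daemon later applies them to the cache. The class and method names are hypothetical, and the asynchronous daemon is simulated by an explicit `process_pending` call so the example stays deterministic.

```python
from collections import deque


class CacheDaemon:
    """Toy stand-in for the cache daemon; not the DBCache implementation."""

    def __init__(self, backend_rows, key_column):
        self.backend_rows = backend_rows   # list of dicts, the backend table
        self.key_column = key_column       # column the cache key is defined on
        self.cache = {}                    # cache key value -> cached rows
        self.pending = deque()             # queued ("load" | "invalidate", value)

    def report_miss(self, value):
        # A probe query failed for this cache key value.
        self.pending.append(("load", value))

    def report_update(self, value):
        # The backend changed rows with this cache key value.
        self.pending.append(("invalidate", value))

    def process_pending(self):
        # In a real system this would run asynchronously.
        while self.pending:
            kind, value = self.pending.popleft()
            if kind == "load":
                # Load ALL rows for the value, keeping the key domain-complete.
                self.cache[value] = [
                    dict(row) for row in self.backend_rows
                    if row[self.key_column] == value
                ]
            else:
                # Invalidation is a delete; data returns on the next miss.
                self.cache.pop(value, None)


backend = [{"id": 1, "city": "SJ"}, {"id": 2, "city": "SJ"}]
daemon = CacheDaemon(backend, "city")

daemon.report_miss("SJ")
loaded_before = "SJ" in daemon.cache       # the load has not run yet
daemon.process_pending()
loaded_after = len(daemon.cache["SJ"])     # both "SJ" rows are now cached

backend[0]["id"] = 99                      # backend update
daemon.report_update("SJ")
daemon.process_pending()
present_after_invalidation = "SJ" in daemon.cache
```

Deleting rather than refreshing on invalidation mirrors the slide: the updated rows come back only when a later request misses and triggers a new load.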

24 Performance Evaluation
  - Focus: evaluate the overhead of Janus plans for dynamic tables
    - Overhead of the probe query and the switch operator
    - Overhead of on-demand loading
  - Experimental settings

25 Performance Evaluation: Cache Hit Case
  - Janus plan vs. pure local queries
  - The difference gives the overhead of the probe query and the switch operator
  - Cache table loaded with all the data from the backend table

26 Performance Evaluation: Cache Miss Case
  - Janus plan vs. pure remote queries
  - The difference gives the overhead
  - Cache table initially empty

28 Conclusions & Future Work
  - Significant contributions: provides a new framework for implementing DB caching for TWAs, which
    - Integrates seamlessly with current applications
    - Supports static and dynamic cache tables
    - Adapts to the changing workloads of TWAs
    - Re-uses the functionality of a full-fledged DBMS, i.e. DB2
  - What next?
    - Provide an efficient, scalable, zero-admin DBCache
    - Develop new tools to ease deployment
    - Improve adaptability and maintenance

29 Comparison
  - vs. amco05:
    - Relies on an asynchronous data propagation utility
    - Not completely transparent
    - May not work for heterogeneous DBMSs
    - Allows stale data
  - vs. gula04:
    - Cache constraints against C&C constraints
    - Doesn't provide any guarantees of freshness/consistency
    - Relatively more transparent
    - Maintenance-centric vs. query-centric
    - Both deployed as mid-tier level caches
    - Both use a full-fledged DBMS
    - Both use materialized views
    - Both use two-headed query plans

30 Discussion
  - Is it really that good?
  - Using a full-fledged DBMS at each middle-tier node: drawbacks?
  - How is data freshness specified/guaranteed?
  - Is it adaptable? Weakly? Strongly?
  - When can cache constraints become a bottleneck?
  - Size of dynamic cache tables?
    - Cache replacement policies/cleansing mechanisms?
  - Caching of other physical & logical DB objects?
    - Updates to those objects in the backend DB?
  - Message traffic between the cache daemon & the backend DB?
    - Very frequent updates in the backend DB
  - Local updates?
  - Flaws in the performance evaluation?