
Active-Active Zero Downtime and Unlimited Scalability with DB2 pureScale
Daniel Ling, Senior Solution Specialist, Information Management Software, IBM HK



Slide 2: Active-Active Zero Downtime and Unlimited Scalability with DB2 pureScale
Daniel Ling, Senior Solution Specialist, Information Management Software, IBM HK

Slide 3: Everything Oracle offers for the database, DB2 also has, and it works better

  Oracle                                    DB2
  Editions
  Oracle Enterprise Edition                 DB2 Enterprise Edition
  Oracle Standard Edition                   DB2 Workgroup Edition
  Oracle Standard Edition One               DB2 Express Edition
  Oracle TimesTen In-Memory Database        DB2 solidDB In-Memory Database
  Oracle XML DB                             DB2 pureXML
  Oracle Lite                               DB2 Everyplace

  Resilience and HA
  Oracle RAC (active-active shared disk)    DB2 pureScale (active-active shared disk)
  Oracle Data Guard (read on standby)       DB2 HADR (read on standby)
  Oracle Streams                            DB2 Q Replication
  ...                                       ...

  Tools
  Oracle Range Partitioning                 DB2 Table Partitioning
  No offering                               DB2 Database Partitioning
  Oracle Advanced Compression               DB2 Row Compression
  Oracle Label Security                     DB2 Label Security
  Oracle Database Vault and Audit Vault     DB2 Guardium database security monitoring
  Oracle Enterprise Manager                 DB2 Performance Optimization Feature
  Oracle Automatic Memory Management        DB2 Self-Tuning Memory Manager
  Oracle Automatic Storage Management       DB2 Automatic Storage
  Oracle Data Masking                       DB2 Optim Data Privacy
  Oracle Real Application Testing           DB2 Optim Test Data Management
  No offering                               DB2 Optim Data Archiving

Slide 4: Database High Availability Options

Slide 5: Server-Based Failover (i.e., most OS clustering)
Transactions run against a single active server. DB2 is integrated with the Tivoli System Automation cluster manager (included in both DB2 Enterprise and DB2 Workgroup at no charge), which provides:
- Node failure detection
- Disk takeover
- IP takeover
- Restart of DB2 on the surviving server

Slide 6: DB2 High Availability Disaster Recovery (HADR): database log shipping
DB2 HADR enables a highly available database standby, with failover in minutes or less:
- A redundant copy of the database protects against site or storage failure
- Supports rolling upgrades
- Failover in under 15 seconds; in a real SAP workload with 600 SAP users, the database was available again in 11 seconds
- 100% performance after a primary failure
- Included in DB2 Enterprise and DB2 Workgroup at no charge
HADR keeps the primary and standby servers in sync over a network connection, and Automatic Client Reroute lets client applications resume on the standby automatically.
Synchronization modes: Microsoft SQL Server offers two (synchronous and asynchronous); DB2 offers three, including a near-synchronous mode that minimizes the delay in performance while still assuring integrity.
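The Automatic Client Reroute behavior described above can be sketched conceptually in a few lines. This is not the actual DB2 driver logic; the host names, the `connect_fn` callback, and the retry count are illustrative assumptions only. The point is the shape of the mechanism: the client remembers an alternate server and, when the primary stops accepting connections, transparently retries against the standby.

```python
# Conceptual sketch of HADR Automatic Client Reroute (hypothetical names,
# not a real DB2 API): try the primary, then fall back to the standby.

def connect_with_reroute(primary, alternate, connect_fn, retries=2):
    """Try the primary first; on repeated failure, reroute to the alternate."""
    for server in (primary, alternate):
        for _ in range(retries):
            try:
                return server, connect_fn(server)
            except ConnectionError:
                continue  # retry the same server before rerouting
    raise ConnectionError("both primary and standby unreachable")

def fake_connect(server):
    # Simulate a failed-over cluster: only the standby accepts connections.
    if server == "primary.example.com":
        raise ConnectionError("primary down")
    return "session"

server, _ = connect_with_reroute("primary.example.com",
                                 "standby.example.com", fake_connect)
print(server)  # standby.example.com
```

In the real product the reroute is handled inside the DB2 client layer, so the application simply sees its connection resume on the standby.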

Slide 7: Critical IT Applications Need Reliability and Scalability
Customers need a highly scalable, flexible solution for the growth of their information, with the ability to easily grow existing applications.
- Downtime is not acceptable
  - Any outage means lost revenue and permanent customer loss
  - Today's distributed systems need reliability
- Local databases are becoming global
  - Successful global businesses must deal with exploding data and server needs
  - Competitive IT organizations need to handle rapid change

Slide 8: Introducing DB2 pureScale (active-active shared disk)
- Unlimited capacity: buy only what you need, and add capacity as your needs grow
- Application transparency: avoid the risk and cost of application changes
- Continuous availability: deliver uninterrupted access to your data with consistent performance

Slide 9: DB2 pureScale Architecture
- A cluster of DB2 members running on Power servers
- Leverages the global lock and memory manager technology from z/OS
- Automatic workload balancing
- Shared data, connected over an InfiniBand network with DB2 Cluster Services
- Integrated cluster manager
Now available on AIX/InfiniBand, Intel/InfiniBand, and Intel/Ethernet; AIX/Ethernet is targeted for 2Q 2011.

Slide 10: The Key to Scalability and High Availability
Efficient centralized locking and caching:
- As the cluster grows, DB2 maintains one place to go for locking information and shared pages: the group lock manager and group buffer pool in the PowerHA pureScale server (the CF)
- Optimized for very high speed access: DB2 pureScale members use Remote Direct Memory Access (RDMA) to communicate with the PowerHA pureScale server, with no IP socket calls, no interrupts, and no context switching
Results:
- Near-linear scalability to large numbers of servers
- Constant awareness of what each member is doing
  - If one member fails, there is no need to block I/O from the other members
  - Recovery runs at memory speeds
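The centralized-locking idea above can be illustrated with a toy sketch. This is a deliberately simplified model, not the CF protocol: the class and member names are invented, and real CF access goes over RDMA rather than method calls. It shows the property the slide emphasizes: because lock state lives in one shared component, a member failure only strands that member's own locks, and survivors keep acquiring locks throughout.

```python
# Toy sketch of a centralized global lock manager (illustrative names,
# loosely modelled on the pureScale CF concept).

class GlobalLockManager:
    def __init__(self):
        self._locks = {}  # resource id -> owning member

    def acquire(self, resource, member):
        """Grant the lock if it is free or already held by this member."""
        owner = self._locks.get(resource)
        if owner is None or owner == member:
            self._locks[resource] = member
            return True
        return False  # held by another member

    def release_member(self, member):
        # On member failure, only that member's in-flight locks are held
        # until recovery completes; here we simply free them.
        self._locks = {r: m for r, m in self._locks.items() if m != member}

lm = GlobalLockManager()
assert lm.acquire("page:42", "member1")
assert not lm.acquire("page:42", "member2")  # member1 still holds it
lm.release_member("member1")                 # member1 fails; recovery frees its locks
assert lm.acquire("page:42", "member2")      # survivors proceed immediately
```

Because there is a single authority for lock state, no cluster-wide lock rediscovery is needed after a failure, which is the contrast drawn with Oracle RAC on slide 14.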

Slide 11: Recover Instantaneously from Node Failure
Application servers and DB2 clients recover instantaneously from node failure, using RDMA (Remote Direct Memory Access):
- Protects from infrastructure-related outages
- Redistributes the workload to surviving nodes immediately
- Completely redundant architecture
- Recovers in-flight transactions on the failing node in as little as 15 seconds, including detection of the problem

Slide 12: Minimize the Impact of Planned Outages
Three steps: identify the member, do the maintenance, and bring the node back online. This keeps your system up during:
- OS fixes
- Hardware updates
- Administration

Slide 13: Online Recovery
- The DB2 pureScale design point is to maximize availability during failure recovery processing
- When a database member fails, only in-flight data remains locked until member recovery completes (in-flight = data being updated on the failed member at the time it failed)
- Time to row availability: under 20 seconds
[Chart: percentage of data available vs. time in seconds. At the moment of database member failure, only the rows with in-flight updates are locked during recovery; all other data remains 100% available.]

Slide 14: Compare to Traditional Technology: Data Availability
Oracle RAC (shared disk), on node failure:
1. The node fails, taking its in-flight transactions and its portion of the global lock state with it
2. Lock remaster/rebuild: lock requests are frozen until the lock state is rediscovered from the log of the failed node
3. Another machine performs recovery, and more random disk I/Os are needed
DB2 pureScale, on member failure:
1. The member fails; only its in-flight transactions are affected
2. The global lock state lives in the CF, so survivors can get locks throughout recovery
3. Recovery is performed on another host, and the CF services most page requests from memory (hot pages)
This example assumes about 5% of the database data was being updated on the failed node at the time of the failure.
[Chart: percentage of data available vs. time in seconds. Oracle RAC shows a partial or full freeze while the global lock state is remastered and rebuilt; pureScale only temporarily locks the data that was being updated on the failed member.]

Slide 15: Automatic Workload Balancing and Routing
Run-time load information is used to automatically balance load across the cluster (as in the z Sysplex):
- Load information for all members is kept on each member
- It is piggy-backed to clients regularly
- It is used to route the next connection, or optionally the next transaction, to the least-loaded member
- Routing occurs automatically and is transparent to the application
Failover: the load of a failed member is evenly distributed to the surviving members automatically. Once the failed member is back online, fallback does the reverse. Affinity-based routing, with failover and fallback, is also possible.
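The routing decision described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the member names and load figures are invented, and the real system piggy-backs load data to clients inside the driver rather than passing a dictionary around.

```python
# Minimal sketch of least-loaded connection routing (illustrative data).

def route(member_loads):
    """Pick the member with the smallest reported load."""
    return min(member_loads, key=member_loads.get)

loads = {"member1": 0.72, "member2": 0.31, "member3": 0.55}
assert route(loads) == "member2"  # least-loaded member gets the connection

# Failover: drop the failed member; its share flows to the survivors,
# because new work keeps going to whichever survivor is least loaded.
loads.pop("member2")
assert route(loads) == "member3"
```

Transaction-level routing works the same way, just evaluated per transaction instead of per connection.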

Slide 16: Reduce System Overhead by Minimizing Inter-Node Communication
DB2 pureScale's central locking and memory manager minimizes communication traffic, so DB2 pureScale grows efficiently as servers are added. Other database software requires CPU-intensive communication between all servers in a cluster, and wastes more and more CPU as the cluster grows.

Slide 17: The Result
- 2, 4, and 8 members: over 95% scalability
- 16 members: over 95% scalability
- 32 members: over 95% scalability
- 64 members: 95% scalability
- 88 members: 90% scalability
- 112 members: 89% scalability
- 128 members: 84% scalability
Validation testing includes capabilities to be available in future releases.

Slide 18: IT Needs to Adapt in Hours, Not Months
Businesses need to be able to grow their infrastructure without adding risk.
- Application changes are expensive
  - Changes to handle more workload volume can be costly and risky
  - Developers rarely design with scaling in mind
  - Adding capacity should be stress free
- Handling change is a competitive advantage
- Dynamic capacity is not the exception
  - Over-provisioning to handle critical business spikes is inefficient
  - IT must respond to changing capacity demand in days, not months

Slide 19: DB2 Now Has Built-In Oracle Compatibility
Moving from the Oracle DBMS to DB2 9.7:
- Oracle concurrency control: no change
- Oracle SQL: no change
- Oracle PL/SQL: no change
- Oracle packages: no change
- Oracle built-in packages: no change
- Oracle JDBC: no change
- Oracle SQL*Plus scripts: no change
Changes are the exception, not the rule. This is why we call it enablement, not a port.
(PL/SQL = Procedural Language/Structured Query Language)

Slide 20: Concurrency Prior to DB2 9.7
Enabling an Oracle application on DB2 used to require significant effort to re-order table access to avoid deadlocks.

DB2 before 9.7 (cursor stability):
  blocks    Reader   Writer
  Reader    No       Yes
  Writer    Yes      Yes

Oracle default / DB2 9.7 default (statement-level snapshot):
  blocks    Reader   Writer
  Reader    No       No *
  Writer    No       Yes

* In its default isolation level, DB2 9.7 keeps no rows locked while scanning.

Slide 21: SQL Procedural Language (SQL PL) Enhancements and the PL/SQL Compiler
- Advancements in DB2 SQL PL: new SQL, stored procedures, and triggers
- A PL/SQL compiler is now built into DB2
PL/SQL (Procedural Language/Structured Query Language) is Oracle Corporation's procedural extension language for SQL and the Oracle relational database. PL/SQL's general syntax resembles that of Ada (which was itself extended from Pascal).

Slide 22: DB2 Now Allows Both Shared-Disk and Shared-Nothing Scale-Out Designs
- Shared-nothing: the DB2 Database Partitioning Feature balances each node with dedicated CPU, memory, and storage. Best for data warehousing.
- Shared-disk: the DB2 pureScale Feature balances CPU nodes over shared disk and memory. Best for transaction processing.

Slide 23: Partitioned Database Model
- The database is divided into multiple partitions
- Partitions are spread across data modules
- Each partition has dedicated resources: CPU, memory, and disk
- Parallel processing occurs on all partitions and is coordinated by the DBMS
- A single system image is presented to the user and application
[Diagram: SQL arrives over the corporate network; a 10 Gb Ethernet foundation connects the data modules; large tables are partitioned across the modules while small tables stay local, with parallel processing across data modules.]

Slide 24: Parallel Query Processing
A client connects and issues:
    select sum(x) from table_a, table_b where a = b
A coordinator agent gets statistics from the catalog and optimizes the query, then fans it out to the partitions (Part1 through PartN). Each partition reads its local slice of table_a and table_b, performs the join, and computes a partial sum (in the example: 10, 12, 13, and 11). The coordinator combines the partial sums and returns the final result, 46.
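The coordinator/partition split above can be sketched as a small program. This is a conceptual model, not DB2 internals; the data values are simply the slide's illustrative partial sums (10, 12, 13, 11, combining to 46), and each inner list stands in for one partition's local rows after the join.

```python
# Sketch of parallel partial aggregation with a coordinator combining
# per-partition results (illustrative data from the slide's example).
from concurrent.futures import ThreadPoolExecutor

def partition_sum(rows):
    # Runs on one partition, against only its local slice of the data.
    return sum(rows)

def coordinator(partitions):
    # Fan the work out to all partitions, then combine the partial sums.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partition_sum, partitions))
    return partials, sum(partials)

partitions = [[10], [12], [13], [11]]  # each inner list = one partition's rows
partials, total = coordinator(partitions)
print(partials, total)  # [10, 12, 13, 11] 46
```

The key property is that each partition only ever touches its own data; the coordinator's work is limited to combining N small partial results.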

Slide 25: Automatic Data Distribution
    CREATE TABLE sales (trans_id ..., col2 ..., col3 ..., ...)
      DISTRIBUTE BY HASH (trans_id)
On insert or load, each row's distribution key (trans_id) is hashed, and the row is automatically routed to Partition 1, 2, or 3.
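The hash routing can be sketched as follows. Note the simplification: DB2 actually hashes the distribution key into a distribution map whose entries are assigned to partitions, whereas this sketch uses a plain modulo over an assumed three partitions. CRC32 is used only because it is stable across runs; it is not DB2's hash function.

```python
# Sketch of hash-based data distribution (simplified stand-in for DB2's
# distribution map; partition count and key values are illustrative).
import zlib

N_PARTITIONS = 3

def partition_for(trans_id):
    # A stable hash of the key, mapped onto a partition number, so the
    # same key always lands on the same partition.
    return zlib.crc32(str(trans_id).encode()) % N_PARTITIONS

placement = {}
for tid in ["T%04d" % i for i in range(9)]:
    placement.setdefault(partition_for(tid), []).append(tid)

# Deterministic placement: re-hashing a key gives the same partition.
assert partition_for("T0001") == partition_for("T0001")
assert all(0 <= p < N_PARTITIONS for p in placement)
```

Because placement is a pure function of the key, both the load utility and the query coordinator can compute, without any lookup traffic, which partition owns a given row.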

Slide 26: Hash Partitioning, "Divide and Conquer"
With IBM DB2's Database Partitioning Feature (DPF), the query may still read most of the data, but now this work can be attacked on all nodes (P1, P2, P3) in parallel.

Slide 27: Range (Table) Partitioning Reduces I/O
    SELECT NAME, TOTAL_SPEND, LOYALTY_TIER from CUSTOMERS where REGION= and MONTH='Mar'
With the table range-partitioned by month (Jan in P1, Feb in P2, Mar in P3), only the partition holding 'Mar' needs to be scanned.
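The partition-pruning step can be sketched directly. The partition layout below follows the slide (Jan, Feb, and Mar in P1, P2, and P3); the function names are illustrative, not planner internals.

```python
# Sketch of range-partition pruning: a predicate on the partitioning
# column lets the planner skip every partition outside the range.

PARTITIONS = {"P1": "Jan", "P2": "Feb", "P3": "Mar"}  # layout from the slide

def prune(partitions, month_predicate):
    """Return only the partitions whose range can satisfy the predicate."""
    return [p for p, month in partitions.items() if month == month_predicate]

to_scan = prune(PARTITIONS, "Mar")
print(to_scan)  # ['P3'] -- two of three partitions are never read
```

The I/O saving scales with the number of partitions eliminated: a query on one month out of twelve monthly partitions reads roughly one twelfth of the table.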

Slide 28: Multidimensional Clustering Reduces I/O
Multidimensional clustering (MDC) is an I/O problem solver:
- Within a partition (e.g. P1), data is further clustered by multiple attributes, such as month (Jan, Feb, Mar)
- Even less I/O is then done to retrieve the records of interest
- Less I/O per query leads to more concurrency
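The clustering idea can be sketched as grouping rows into blocks keyed by the combination of clustering attributes. The attribute names and values here are invented for illustration; real MDC tables organize storage into extents per dimension-value combination, which this dictionary only approximates.

```python
# Sketch of multidimensional clustering: rows sharing the same values of
# the clustering attributes are stored together in one block, so a query
# on those attributes touches only the matching block.
from collections import defaultdict

blocks = defaultdict(list)  # (month, region) -> rows stored together

def insert(row):
    blocks[(row["month"], row["region"])].append(row)

for r in [{"month": "Mar", "region": "East", "id": 1},
          {"month": "Mar", "region": "West", "id": 2},
          {"month": "Feb", "region": "East", "id": 3}]:
    insert(r)

# A query on month='Mar' AND region='East' reads exactly one block:
hits = blocks[("Mar", "East")]
print([r["id"] for r in hits])  # [1]
```

Combined with range partitioning (slide 27), this is two levels of I/O elimination: pruning skips whole partitions, and clustering narrows the scan within the surviving partition.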
