
Active-Active Zero Downtime and Unlimited Scalability with DB2 pureScale


1

2 Active-Active Zero Downtime and Unlimited Scalability with DB2 pureScale
Daniel Ling, Senior Solution Specialist, Information Management Software, IBM HK

3 Everything Oracle offers on the database side, DB2 has as well, and it works better
Editions
Oracle Enterprise Edition -> DB2 Enterprise Edition
Oracle Standard Edition -> DB2 Workgroup Edition
Oracle Standard Edition One -> DB2 Express Edition
Oracle TimesTen In-Memory Database -> IBM solidDB in-memory database
Oracle XML DB -> DB2 pureXML
Oracle Lite -> DB2 Everyplace
Resilience and HA
Oracle RAC (active-active shared disk) -> DB2 pureScale
Oracle Data Guard (read on standby) -> DB2 HADR
Oracle Streams -> DB2 Q Replication
... -> ...
Tools
Oracle Range Partitioning -> DB2 Table Partitioning
No Oracle offering -> DB2 Database Partitioning
Oracle Advanced Compression -> DB2 Row Compression
Oracle Label Security -> DB2 Label Security
Oracle Database Vault and Audit Vault -> DB2 with Guardium database security monitoring
Oracle Enterprise Manager -> DB2 Performance Optimization Feature
Oracle Automatic Memory Management -> DB2 Self-Tuning Memory Manager
Oracle Automatic Storage Management -> DB2 Automatic Storage
Oracle Data Masking -> DB2 Optim Data Privacy
Oracle Real Application Testing -> DB2 Optim Test Data Management, DB2 Optim Data Archiving

4 Database High Availability Options

5 Server-Based Failover (i.e., most OS clustering)
Integrated with the Tivoli System Automation cluster manager (included in both DB2 Enterprise and DB2 Workgroup at no charge): node failure detection, disk takeover, IP takeover, restart of DB2. The first solution is server-based failover. In this solution you have one copy of the database connected to two servers. Only one server is active; a cluster manager performs node failure detection and, in the event of a failover, starts DB2 on the standby machine and allows transactions to continue. The sequence of events is: transactions run against the active server; if the server fails, the cluster manager detects it and runs the takeover scripts; first it takes over the storage, then it performs an IP address takeover, then it restarts DB2 (which performs crash recovery); the database is then available to process transactions again. The time required for this takeover typically varies between one and five minutes. It is a very good solution for cost-conscious customers looking for solid high availability without much management labor and with little added cost.
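The takeover sequence above can be sketched as a small monitoring loop. This is purely illustrative: the helper functions are hypothetical placeholders for what the cluster manager's takeover scripts actually do, not a real TSA API.

```python
import time

# Hypothetical placeholders for the cluster manager's takeover scripts.
def heartbeat_ok(node):
    """Node failure detection (placeholder)."""
    ...

def take_over_storage():
    """Disk takeover (placeholder)."""
    ...

def take_over_ip():
    """IP address takeover (placeholder)."""
    ...

def restart_db2():
    """Restart DB2; crash recovery runs as part of this step (placeholder)."""
    ...

def monitor_and_failover(active_node, poll_seconds=5):
    """Run on the standby server: detect failure, then run the takeover steps in order."""
    while heartbeat_ok(active_node):
        time.sleep(poll_seconds)
    take_over_storage()   # 1. acquire the shared disks
    take_over_ip()        # 2. move the service IP address
    restart_db2()         # 3. restart DB2, which performs crash recovery
    # the database is now available to process transactions again
```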

6 DB2 HADR (High Availability Disaster Recovery) – database log-shipping HA
Redundant copy of the database to protect against site or storage failure. Support for rolling upgrades. Failover in under 15 seconds; in a real SAP workload with 600 SAP users, the database was available again in 11 seconds. 100% performance after primary failure. Included in DB2 Enterprise and DB2 Workgroup at no charge. Automatic Client Reroute: the client application automatically resumes on the standby. DB2 and MS SQL Server log shipping both offer synchronous and asynchronous modes; DB2 HADR offers three modes (synchronous, near-synchronous, asynchronous), minimizing the performance delay while still assuring integrity over the network connection. HADR provides extremely high availability, with failover times measured in seconds. It also has advantages over clusters such as RAC: HADR supports rolling upgrades, can protect against storage failure, and delivers 100% performance if one node fails (none of these are available with RAC). The primary server starts on the left of the diagram; if it fails, the primary role moves to the machine on the right automatically. When the left-side machine is repaired, you can start HADR on that server and it will automatically resynchronize itself and become the standby server. HADR keeps the two servers in sync.
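Automatic Client Reroute means the driver retries against the standby when the primary disappears; the application only needs to be prepared to replay the failed transaction. A minimal sketch of that client-side pattern, assuming a hypothetical connect() helper rather than any specific driver API:

```python
# Illustrative only: connect() is a hypothetical placeholder for whatever
# driver you use with automatic client reroute configured; the host names
# and ports are invented.

PRIMARY = ("dbserv1", 50000)
STANDBY = ("dbserv2", 50000)

def connect(host, port):
    """Hypothetical driver call that returns a connection object."""
    raise NotImplementedError

def run_transaction(conn, work):
    work(conn)
    conn.commit()

def run_with_reroute(work, retries=3):
    """Try the primary first; on a connection error, fall over to the standby."""
    servers = [PRIMARY, STANDBY]
    for _attempt in range(retries):
        for host, port in servers:
            try:
                conn = connect(host, port)
                run_transaction(conn, work)   # failed work is simply replayed
                return True
            except ConnectionError:
                continue                      # try the other server, then retry
    return False
```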

7 Critical IT Applications Need Reliability and Scalability
Local databases are becoming global: successful global businesses must deal with exploding data and server needs, and competitive IT organizations need to handle rapid change. Customers need a highly scalable, flexible solution for the growth of their information, with the ability to easily grow existing applications. Downtime is not acceptable: any outage means lost revenue and permanent customer loss, and today's distributed systems need reliability. In today's very difficult business environment, IT organizations are under pressure to handle increasingly complex and difficult workloads. Adding to the challenge, lines of business expect IT to keep the systems always up with reliable performance, even under huge swings in workload volume. The IT landscape has changed in the past 10 years, with IT systems and customers being global rather than local. This has created a number of problems, as peak times can now come in waves from around the globe and batch windows have shortened. This volume increase has driven an explosion in data volumes and server horsepower for even simple applications. Leading IT organizations are able to handle this change and react as quickly as the business requires. More back-office systems are also becoming front office as they are tied into online web portals and call centers. Any outage of these systems now means lost revenue, and potentially losing customers forever. The business no longer tolerates any downtime, because they know they are losing revenue for every second IT is down. This increased focus on uptime has led to more demands on IT to make sure their databases never go down. 7

8 Introducing DB2 pureScale (active-active shared disk)
Unlimited Capacity: buy only what you need, add capacity as your needs grow. Application Transparency: avoid the risk and cost of application changes. Continuous Availability: deliver uninterrupted access to your data with consistent performance.

9 DB2 pureScale Architecture
Automatic workload balancing. Cluster of DB2 nodes (members) running on Power servers. Leverages the global lock and memory manager technology from z/OS. Based on the industry-leading System z data-sharing architecture, DB2 pureScale integrates IBM technologies to keep your critical systems available all the time. It includes automatic workload balancing to ensure that no node in the system is overloaded; DB2 routes transactions or connections to the least heavily used server. DB2 pureScale is built on the most reliable UNIX system available, Power Systems; other platforms will be available in the future. The technology for globally sharing locks and memory is based on technology from z/OS. Tivoli System Automation has been integrated deeply into DB2 pureScale: it is installed and configured as part of the DB2 installation process, and DBAs and system administrators never even know it's there. The networking infrastructure leverages InfiniBand, and all additional clustering software is included as part of the DB2 pureScale installation. The core of the system is a shared-disk architecture. (Diagram: integrated cluster manager; InfiniBand network and DB2 Cluster Services connecting the members to shared data; available now on AIX over InfiniBand, with Intel over InfiniBand/Ethernet and AIX over Ethernet targeted for 2Q 2011.)

10 The Key to Scalability and High Availability
Efficient centralized locking and caching: as the cluster grows, DB2 maintains one place to go for locking information and shared pages, optimized for very high speed access. DB2 pureScale uses Remote Direct Memory Access (RDMA) to communicate with the PowerHA pureScale server: no IP socket calls, no interrupts, no context switching. Results: near-linear scalability to large numbers of servers; constant awareness of what each member is doing; if one member fails, no need to block I/O from other members; recovery runs at memory speeds. Now, for the first time, the capabilities found in the coupling facility are available on a non-mainframe platform. By providing centralized locking and centralized caching, DB2 pureScale delivers a scale-out active-active solution that is second to none. DB2 pureScale is designed for modern hardware capabilities, including InfiniBand and RDMA, which are discussed in more detail in the next section. Centralized locking and caching are provided by the PowerHA pureScale cluster acceleration facility (henceforth referred to as the CF). By placing hot data in a centralized group buffer pool, and by accessing that data without costly IP socket calls, the scalability of the solution is greatly improved. As well, by centralizing lock information, the CF is aware at all times which data pages are in the process of being updated by any member in the cluster. As such, if any member fails, the CF holds all the page locks from the failed member to speed recovery processing. There is no need to lock out other nodes from accessing the shared disk during recovery processing, as is the case with Oracle (more on that later). (Diagram: members connected to the CF, which hosts the group buffer pool and the PowerHA pureScale group lock manager.)
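The availability argument rests on the CF knowing, at all times, which pages each member has in flight. The toy Python model below shows why a member failure only leaves that member's in-flight pages locked; it is a conceptual sketch only (the real CF is RDMA-accessed shared memory, not a Python dictionary).

```python
class CentralLockManager:
    """Toy model of the CF's global lock manager: it records which member
    holds an update lock on which page, so a member failure leaves only
    that member's in-flight pages locked."""

    def __init__(self):
        self.page_locks = {}            # page_id -> member_id holding an update lock

    def acquire_update_lock(self, member, page_id):
        if page_id in self.page_locks and self.page_locks[page_id] != member:
            return False                # another member is updating this page
        self.page_locks[page_id] = member
        return True

    def release(self, member, page_id):
        if self.page_locks.get(page_id) == member:
            del self.page_locks[page_id]

    def member_failed(self, member):
        # Only the failed member's in-flight pages stay locked; every other
        # page in the database remains available to the surviving members.
        return {p for p, m in self.page_locks.items() if m == member}

cf = CentralLockManager()
cf.acquire_update_lock("member0", "page17")
cf.acquire_update_lock("member1", "page42")
print(cf.member_failed("member1"))      # only {'page42'} remains locked
```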

11 Recover Instantaneously From Node Failure
- Using RDMA (Remote Direct Memory Access). Protect from infrastructure-related outages. Redistribute workload to surviving nodes immediately. Completely redundant architecture. Recover in-flight transactions on the failing node in as little as 15 seconds, including detection of the problem. (Diagram: application servers and DB2 clients connected to the cluster.) DB2 pureScale enables you to recover instantaneously from unplanned failures. In situations where one or more members fail due to hardware or software problems, the workload balancer automatically recognizes which members are available and sends new transactions across the surviving members without interruption to your application. At the same time, DB2 pureScale does a fast crash recovery on the failing node, normally in 15 seconds including detection time. When crash recovery starts, locks on data that was being read on the failing node are released immediately, and locks on data that was in flight (updated, deleted, inserted) are released within an average of 15 seconds of recovery. DB2 pureScale can recover this quickly because the centralized lock management lets it crash-recover from the member's logs fast, maximizing data availability. Your application continues to operate with no interruption, as the failed transactions are re-executed on an available member. Keep in mind that no special coding is needed to run your application against a DB2 pureScale cluster; from an application point of view it is like connecting to any other DB2 database, and your application handles a transaction failure in the cluster the same way it handles a transaction failure on a single EE node. This makes rolling out DB2 pureScale in your environment very easy.

12 Keep Your System Up: Minimize the Impact of Planned Outages
During OS fixes, hardware updates and administration: identify the member, do the maintenance, bring the node back online. DB2 pureScale gives you zero downtime for planned outages, whether you are planning hardware or OS maintenance or any type of administration. It enables you to do rolling maintenance across your cluster: you can drain a given member from the cluster to do the maintenance on it. Draining stops any new transactions from coming to the member and allows the existing transactions on the member to complete. Once the drain and all running transactions complete, you can do the maintenance needed on the member for as long as needed. Once you are done, you can bring the member back into the cluster, and the workload balancer starts distributing new transactions to it. All of this is transparent to the application, with no interruption to your environment, and with a single command to drain and re-add the member to the cluster. Once the work is done on that member, you can execute rolling maintenance on other members as needed.

13 Online Recovery. The DB2 pureScale design point is to maximize availability during failure recovery processing. When a database member fails, only in-flight data remains locked until member recovery completes (in-flight = data being updated on the failed member at the time it failed). Time to row availability: under 20 seconds. (Chart: percentage of data available over time, in seconds; at the moment of database member failure, availability dips only for the in-flight updates, which stay locked during recovery.) The second key feature of DB2 pureScale is the high availability it provides. Again, the secret to its success is centralized locking and caching. When one member fails, all other members in the cluster can continue to process transactions. The only data that is unavailable is the actual set of pages that were being updated in flight when the member failed. And if those pages are hot, they will be in CF memory, which means the recovery of pages needed by other members will be very fast.

14 Compare to Traditional Technology – Data Availability During Member Failure
DB2 pureScale: 1) a member fails; 2) member recovery is performed on another host, and only in-flight transactions stay locked, since the CF already holds the global lock state and survivors can get locks throughout; 2a) the CF services most page requests from memory. Oracle RAC shared disk: 1) a node fails; 2) lock remaster/rebuild: lock requests are frozen until the lock state is rediscovered from the log of the failed node; 3) another machine performs recovery; 3a) more random disk I/Os are needed.
How does the availability of data during a member failure with DB2 pureScale compare to a failure in a traditional software shared-disk implementation? The first thing to notice is that there are fewer steps. In DB2 pureScale: 1) the member fails; 2) the member is started on a guest host ("restart light") and member recovery is performed to unlock the in-flight data. This does require that the member's transaction log be read and the in-flight changes be backed out of the affected pages, but those pages will most likely be found in the group buffer pool (2a), so no I/O is required to read them. The messages required to move the pages around are much faster, typically 30 ms or less, due to the use of InfiniBand and uDAPL. In a traditional software shared-disk implementation: 1) the node crashes; 2) the locks held by the failed node have to be re-mastered/rebuilt from the log of the failed node, requiring a partial or full freeze of the global lock state; 3) then another node performs the recovery operation to back out in-flight changes for the failed node, which requires more random I/Os to disk to read the changed pages synchronously. The messaging between the nodes to complete the recovery operation is generally much slower, in the 100 ms range. Only after the lock list has been re-mastered and the back-out completed is all the data in the cluster available again.
On the availability graph, with DB2 pureScale most of the data continues to be available to the surviving members while member recovery of the failed member completes; once in-flight changes have been backed out, all the data is available again. In contrast, in a traditional software shared-disk implementation, all or a portion of the data is unavailable while the lock master information is rebuilt, and the data for the failed node remains unavailable until in-flight transactions have been backed out. During a DB2 pureScale member failure much less data is unavailable, and the time to make all the data available to the surviving members is much shorter. pureScale: only data that was being updated on the failed database member is (temporarily) locked. This example assumes about 5% of the database data was being updated on the failed database member at the time of the failure. Oracle RAC shared disk: partial or full freeze for global lock-state re-master/rebuild. 14

15 Automatic Workload Balancing and Routing
Run-time load information is used to automatically balance load across the cluster (as in a System z sysplex). Load information for all members is kept on each member and piggy-backed to clients regularly; it is used to route the next connection, or optionally the next transaction, to the least loaded member. Routing occurs automatically and is transparent to the application. Failover: the load of a failed member is evenly distributed to the surviving members automatically. Once the failed member is back online, fallback does the reverse. Affinity-based routing, with failover and fallback, is also possible. As we said earlier, another important component of the pureScale solution is the ability to balance workload from the clients across the members in the cluster and to automatically reroute client connections to another member in the event of a member failure. To accomplish this, run-time load information is used to automatically balance the workload across the members in the cluster. Load information about all members is kept on each member and updated regularly. As clients run SQL and results are returned from the database, the load profile for all the members is piggy-backed to the clients. This information is then used to determine the routing of the next connection, or optionally the next transaction, to the least loaded member. This routing occurs automatically and is completely transparent to the application. In the event of a member failure, the transaction load of the failed member is evenly distributed over the surviving members. Once the failed member is back online, the clients receive notification that it is available again via the same piggy-backed workload information, and they automatically start routing work to the restarted member until the workload is balanced across all the members. 15
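The routing decision described here is client-side: the members piggy-back their load onto query results, and the client picks the least loaded member for the next connection or transaction. A hedged sketch of that selection logic follows; the member names and load weights are invented for illustration.

```python
# Illustrative sketch of client-side, load-based routing. In the real
# product the member list and load weights are piggy-backed to the client
# by the members themselves.

member_load = {            # member -> most recently reported load weight
    "member0": 0.62,
    "member1": 0.35,
    "member2": 0.48,
}

def update_load(reported):
    """Called whenever a result set arrives carrying fresh load information."""
    member_load.update(reported)

def pick_member(available=None):
    """Route the next connection/transaction to the least loaded member."""
    candidates = available if available is not None else list(member_load)
    return min(candidates, key=lambda m: member_load[m])

print(pick_member())                                   # normally: member1

# Failover: drop the failed member from the candidate set; the remaining
# members absorb its share automatically.
surviving = [m for m in member_load if m != "member1"]
print(pick_member(surviving))                          # now: member2
```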

16 Reduce System Overhead by Minimizing Inter-node Communication
DB2 pureScale's central locking and memory manager minimizes communication traffic. Other database software requires CPU-intensive communication between all servers in a cluster. DB2 pureScale grows efficiently as servers are added; other database software wastes more and more CPU as it grows. 16

17 The Result: Near-Linear Scalability
2, 4 and 8 members: over 95% scalability. 16 members: over 95% scalability. 32 members: over 95% scalability. 64 members: 95% scalability. 88 members: 90% scalability. 112 members: 89% scalability. 128 members: 84% scalability.
To demonstrate the scalability of DB2 pureScale, the lab set up a configuration of 128 members (note that for server-consolidation environments it is possible to put multiple members on one SMP server). A workload was created with a read-to-write ratio of roughly 90:10. To prove the scalability of the architecture, the application has no cluster awareness; in fact the application updates or selects a random row, so every row in the database is touched by all members in the cluster (this was done to show that locality of data is not as essential for scaling as it is with other shared-disk architectures). The results of this 128-member test show near-linear scaling even out to 128 members. Up to 64 members, the scalability (compared to the 1-member result) is still at or above 95%, and at 128 members the scalability is 84%. Note that this is a validation of the architecture and includes some capabilities under development that will not be in the December GA code; validation testing includes capabilities to be available in future releases.

18 IT Needs to Adapt in Hours…Not Months
Handling change is a competitive advantage. Dynamic capacity is not the exception: over-provisioning to handle critical business spikes is inefficient, IT must respond to changing capacity demand in days, not months, and businesses need to be able to grow their infrastructure without adding risk. Application changes are expensive: changes to handle more workload volume can be costly and risky, developers rarely design with scaling in mind, and adding capacity should be stress free.
Constant market change has made it almost impossible for IT to accurately forecast workload volumes years in advance. Many large firms do their capacity planning on a two-year cycle and buy their hardware and software based on 2- or 3-year projections. This has led most firms to over-allocate hardware and software so that they can handle both peak times and growth. Being able to handle changing workload volumes has become a competitive advantage: IT groups that can respond to change requests in weeks win business, and those who can't spend more time in the planning process than they do implementing the change. Those with long, fixed processes for adjusting capacity often struggle with sudden business success because they can't add capacity quickly enough to react to customer demand. Retail or manufacturing firms would love IT to react to their success in days, not the current response of months or even quarters. Many firms use other vendors' clustering technologies to handle capacity growth. This can work for very simple clusters, but as they grow, adding servers for additional scaling can break application behavior: the DBAs and application developers have to look at redistributing their data or workloads to ensure that the system can scale. That approach can work, but it is time consuming, risky and costly, and because developers rarely design their applications with scale in mind, the tuning can sometimes be really tough. Adding capacity shouldn't be a stressful or risky event. With all the turmoil in the market it is incredibly difficult for IT to keep up, and many companies lose customers during improperly handled peak times. Many companies over-provision their systems for peak times to ensure customers are not turned away during the busiest times of the year, but handling the peaks this way leaves expensive systems sitting idle for the rest of the year, which is expensive and wasteful. Application changes to handle these spikes, or to adapt to new clustering technologies, are time consuming and expensive. Adding capacity to a system should be a stress-free event. 18

19 DB2 now has built-in Oracle compatibility
Oracle features that DB2 9.7 runs with no change: Oracle concurrency control, Oracle SQL, Oracle PL/SQL, Oracle packages, Oracle built-in packages, Oracle JDBC extensions, Oracle SQL*Plus scripts.
The goal is to have 90% of the objects work 100% of the time with no change (as opposed to 100% of the objects requiring 10% change). Goal for ISVs: one source. JDBC: Oracle has made extensions to support their proprietary types, e.g. ref cursors or VARRAY. PACKAGE: not to be confused with a DB2 package; an Oracle PACKAGE corresponds to a DB2 MODULE. Changes are the exception, not the rule. This is why we call it enablement and not a port. PL/SQL = Procedural Language/Structured Query Language.

20 Concurrency prior to DB2 v9.7
Oracle default / DB2 9.7 default (statement-level snapshot): readers do not block writers, and writers do not block readers; only writers block writers.
DB2 before 9.7 (cursor stability): readers do not block writers*, but writers block readers.
* In the default isolation level, DB2 keeps no rows locked while scanning.
Enabling an Oracle application on DB2 used to require significant effort to re-order table accesses to avoid deadlocks.
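The behavioural difference can be illustrated with a toy model: under the pre-9.7 default, a reader asking for a row with an uncommitted change must wait for the writer's lock, whereas statement-level snapshot (currently committed) semantics return the last committed value instead. This is a conceptual sketch only, not DB2 internals.

```python
class Row:
    def __init__(self, value):
        self.committed = value      # last committed value
        self.uncommitted = None     # pending change from an open transaction
        self.locked_by = None       # writer currently holding the row lock

def read_cursor_stability(row):
    """Pre-9.7 default: the reader must wait for the writer's row lock
    (modeled here by raising instead of blocking)."""
    if row.locked_by is not None:
        raise RuntimeError("reader must wait for the writer's lock")
    return row.committed

def read_currently_committed(row):
    """Oracle default / DB2 9.7 default: return the last committed value,
    so readers never wait on writers."""
    return row.committed

row = Row(100)
row.locked_by, row.uncommitted = "tx42", 150   # an update is in flight
print(read_currently_committed(row))           # 100, no waiting
```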

21 SQL Procedure Language (SQL PL) enhancements
Advancements in SQL PL: new SQL, stored procedures and triggers; plus a PL/SQL compiler.
Objective: explain how it is possible for DB2 to actually support PL/SQL. First, the main problems are summarized: (1) semantic and (2) syntactic differences between PL/SQL and SQL PL. The first element of the solution is that DB2 introduced native support for most of the Oracle features that used to be painful during a porting exercise. DB2 has added many new features in this release ("Cobra") that close the gap between Oracle and DB2 capabilities; this shows we didn't just embed our MTK and emulate the Oracle features as before. The second and final ingredient solves the syntax problem: even though you could convert your PL/SQL to SQL PL syntax, why bother? With the PL/SQL compiler you can keep a single source and reduce even further the porting cost of your application and its future versions. The next slides explore PL/SQL support in DB2 in more detail, along with the new features that make it possible.
Background: PL/SQL made its first appearance in Oracle Forms v3. A few years later it was included in the Oracle Database server v7 (as database procedures, functions, packages, triggers and anonymous blocks), followed by Oracle Reports v2. In 1988, Oracle RDBMS version 6 came out with support for PL/SQL embedded within Oracle Forms v3 (version 6 could not store PL/SQL in the database proper). In 1992, Oracle version 7 appeared with support for referential integrity, stored procedures and triggers. As of DB2 Version 7.2, a subset of SQL PL is supported in SQL functions and trigger bodies. PL/SQL (Procedural Language/Structured Query Language) is Oracle Corporation's procedural extension language for SQL and the Oracle relational database; its general syntax resembles that of Ada (extended from Pascal).

22 DB2 now allows both shared-disk and shared-nothing scale-out designs
Shared-disk (DB2 pureScale Feature): best for transaction processing; balanced CPU nodes with shared disk and memory. Shared-nothing (DB2 Database Partitioning Feature): best for data warehousing; each node balanced with dedicated CPU, memory and storage.

23 Parallel Processing Across Data Modules
Partitioned Database Model: the database is divided into multiple partitions; partitions are spread across data modules; each partition has dedicated resources (CPU, memory, disk); parallel processing occurs on all partitions and is coordinated by the DBMS; a single system image is presented to the user and application. (Diagram: SQL arrives over the corporate network; small tables and large tables are spread across partitions connected by 10 Gb Ethernet.) 23

24 Parallel Query Processing
Example: connect and run SELECT SUM(x) FROM table_a, table_b WHERE a = b. The coordinator agent gets statistics from the catalog and optimizes the query; each partition reads its slice of A and B, joins, and sums locally (partial sums of 10, 12, 13 and 11 in the example); the coordinator adds the partial sums and returns 46.
This slide shows how the shared-nothing InfoSphere Warehouse system executes a query in parallel. First the user issues a query, which is submitted to the DB2 coordinator partition; the coordinator optimizes the query based on its knowledge of the data partitioning and the database statistics from the catalog. It then assigns work to each of the partitions to execute on its partition of the data. In this way the work is divided up and executed in parallel. Each partition does a local join of the tables, sums the answer for its partition, and then passes the answer back to the coordinator, which assembles the final result and returns it to the user. The query executes much faster this way than with a traditional non-partitioned approach.
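The scatter-gather pattern on this slide (each partition joins and sums its own slice, the coordinator adds the partial sums) can be sketched in a few lines. The data and partition count below are invented purely to reproduce the example's partial sums of 10, 12, 13 and 11.

```python
from multiprocessing import Pool

# Each partition holds only its slice of table_a and table_b (key -> x / key -> flag).
partitions = [
    {"table_a": {1: 4, 2: 6}, "table_b": {1: 1, 2: 1}},   # partial sum 10
    {"table_a": {3: 5, 4: 7}, "table_b": {3: 1, 4: 1}},   # partial sum 12
    {"table_a": {5: 6, 6: 7}, "table_b": {5: 1, 6: 1}},   # partial sum 13
    {"table_a": {7: 5, 8: 6}, "table_b": {7: 1, 8: 1}},   # partial sum 11
]

def local_join_and_sum(part):
    """Work done by one partition: join on the key, then sum x locally."""
    return sum(x for key, x in part["table_a"].items() if key in part["table_b"])

if __name__ == "__main__":
    with Pool(len(partitions)) as pool:            # partitions work in parallel
        partial_sums = pool.map(local_join_and_sum, partitions)
    print(sum(partial_sums))                       # coordinator assembles: 46
```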

25 Automatic Data Distribution
CREATE TABLE sales (trans_id, col2, col3, ...) DISTRIBUTE BY HASH (trans_id): every insert or load is hashed on trans_id and routed to database partition 1, 2 or 3.
Hash partitioning is used to spread the data evenly across partitions, so that data is spread across all partitions for maximum parallel processing. The hash partitioning is done automatically by DB2 every time a row is added to the database, either through an insert or the load utility. DB2 automatically determines which partition the row should be placed in: it takes the value of the partitioning key (a column or set of columns in the table defined as the partitioning key) and runs it through a hashing algorithm, resulting in a value between 0 and 32K, which then corresponds to a data partition. The row is then added to the corresponding partition. Note that the partitioning key is defined in the CREATE TABLE statement, via the DISTRIBUTE BY clause.
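The hashing step the notes describe (hash the distribution key to a value in a fixed-size map and look up the owning partition) can be sketched as below. The map size, hash function and round-robin bucket assignment are illustrative assumptions, not DB2's actual algorithm.

```python
import zlib

NUM_PARTITIONS = 3
MAP_SIZE = 32768                       # the notes mention a value between 0 and 32K

# Partition map: each hash bucket is assigned to a database partition
# (round-robin here purely for illustration).
partition_map = [b % NUM_PARTITIONS for b in range(MAP_SIZE)]

def partition_for(trans_id):
    """Hash the distribution key, then look the bucket up in the partition map."""
    bucket = zlib.crc32(str(trans_id).encode()) % MAP_SIZE
    return partition_map[bucket]

# Every insert/load is routed the same way, so rows spread evenly.
for trans_id in (1001, 1002, 1003, 1004):
    print(trans_id, "-> partition", partition_for(trans_id))
```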

26 Hash Partitioning – “Divide and Conquer”
With IBM DB2's Database Partitioning Feature (DPF), the query may still read most of the data, but now this work can be attacked on all nodes in parallel. 26

27 Range (Table) Partitioning Reduces I/O
SELECT NAME, TOTAL_SPEND, LOYALTY_TIER FROM CUSTOMERS WHERE REGION = … AND MONTH = 'Mar'. (Diagram: data hash-distributed across partitions P1, P2 and P3, with each partition range-partitioned by month: Jan, Feb, Mar.) With table partitioning, all data in the same user-defined range is consolidated in the same data partition, so the database can read just the appropriate partition. This illustration has only three ranges, but real tables have dozens or hundreds, so range partitioning yields a significant saving in I/O for many business-intelligence-style queries. 27
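Partition pruning is the mechanism behind the I/O saving: the database looks at the predicate, decides which ranges can possibly match, and reads only those partitions. A small illustrative sketch follows; the ranges and data are invented.

```python
# Each data partition holds one month's range of the CUSTOMERS table.
partitions = {
    "Jan": [{"NAME": "A", "TOTAL_SPEND": 120, "MONTH": "Jan"}],
    "Feb": [{"NAME": "B", "TOTAL_SPEND": 340, "MONTH": "Feb"}],
    "Mar": [{"NAME": "C", "TOTAL_SPEND": 510, "MONTH": "Mar"}],
}

def query(month):
    """Read only the partitions whose range matches the predicate,
    instead of scanning every partition."""
    pruned = {name: rows for name, rows in partitions.items() if name == month}
    print("partitions scanned:", list(pruned))
    return [row for rows in pruned.values() for row in rows]

print(query("Mar"))     # scans only the Mar partition
```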

28 Multi Dimensional Clustering Reduces I/O
An I/O problem solver: with MDC, data is further clustered by multiple attributes, so even less I/O is done to retrieve the records of interest, and less I/O per query leads to more concurrency. (Diagram: data hash-distributed across P1, P2 and P3, range-partitioned by month, with MDC cells within each range.) With MDC, data is further clustered by additional attributes; now even less I/O is done to retrieve the records of interest. Plus, table partitioning enables easy roll-in and roll-out of data. 28

