Extreme Performance with Oracle Data Warehousing


Extreme Performance with Oracle Data Warehousing
Andreas Katsaris, Data Warehouse and BI Practice Manager, Arisant LLC

Oracle #1 for Data Warehousing

On 6 August 2007, Oracle announced that it had been named the leading data warehouse platform tools vendor, based on 2006 software revenues, by analyst firm IDC. Oracle extended its lead in this market with nearly 33 percent market share and almost $1.87 billion in software revenue for 2006, according to IDC's report "Worldwide Data Warehouse Platform Tools 2006 Vendor Shares." In addition, Oracle's data warehouse platform tools software revenue grew 13.1 percent, faster than the overall category growth rate of 12.5 percent.

IDC classifies the data warehousing tools software market as comprising Data Warehouse Generation tools and Data Warehouse Management tools. IDC also determined Oracle was the leading vendor in the Data Warehouse Management category, with nearly 41 percent (40.9 percent) market share and close to $1.8 billion in revenue for 2006. Its nearest competitor in the Data Warehouse Management tools market trailed by almost half, with 22.8 percent market share and approximately $997.4 million in revenue.

"Once again, Oracle was the market share leader in data warehouse platform tools," said Dan Vesset, Vice President of Business Analytics Research at IDC. "In 2006, Oracle showed good revenue growth across its data warehousing tools. The future of the data warehouse market remains strong as companies look to integrate and analyze structured, and increasingly unstructured, content across the enterprise. Oracle is well positioned to deliver the platform tools and new capabilities such as Spatial Information Management that companies need in order to drive better visibility within their organizations."

IDC's "Worldwide Data Warehouse Platform Tools 2006 Vendor Shares" is accessible at: http://www.oracle.com/corporate/analyst/reports/infrastructure/bi_dw/207851.pdf
Source: IDC, July 2009, "Worldwide Data Warehouse Management Tools 2008 Vendor Shares"

Gartner Magic Quadrant for BI Platforms, 2008 and 2009

Market Position
- Recognized as a leader since 2005
- 10,000+ companies rely on OWB (as of OWB 10.2)
- Optimized for Oracle environments

Oracle Data Warehousing: Complete, Open and Integrated
- Standard components
- Certified configurations
- Comprehensive security
- Higher availability
- Easier to manage
- Lower cost of ownership
(Stack: BI Applications, BI Tools, Data Models, Database, Operating System, Smart Storage)

Distributed Data Marts and Servers: an Expensive Data Warehouse Architecture
(Diagram: separate servers for data mining, online analytics and ETL)

Consolidate onto a Data Warehouse
A single source of truth on low-cost servers and storage: data marts, data mining, online analytics and ETL consolidated onto Oracle Database 11g.

Choice of Data Warehouse Solutions
- Database Machine: complete system, including software, servers, networking and storage
- Custom Solutions: flexibility for the most demanding data warehouse
- Reference Configurations: best-practice configurations for data warehousing

Drastically Simplified Deployments
The Database Machine eliminates the complexity of deploying database systems. Instead of months of configuration, troubleshooting and tuning, the Database Machine is ready on day one:
- Pre-built, tested, standard, supportable configuration
- Runs existing applications unchanged
- Extreme performance out of the box
From months to days.
http://www.oracle.com/technology/products/bi/db/dbmachine/ds_db_machine.pdf

Best Data Warehouse Machine
- Massively parallel, high-volume hardware to quickly process vast amounts of data; Exadata runs data-intensive processing directly in storage
- Most complete analytic capabilities: OLAP, statistics, spatial, data mining, real-time transactional ETL, efficient point queries
- Powerful warehouse-specific optimizations: flexible partitioning, bitmap indexing, join indexing, materialized views, result cache
- Dramatic new warehousing capabilities (new)

Sun Oracle Database Machine Hardware Improvements
Same architecture as the previous Database Machine, with the same number and type of servers, CPUs and disks, but built on newer, faster technologies:
- 80% faster CPUs: Xeon 5500 (Nehalem)
- 100% faster networking: 40 Gb InfiniBand
- 50% faster disk throughput: 6 Gb SAS links
- 200% faster memory: DDR3 DRAM
- 33% more SAS disk capacity: 600 GB SAS disks
- 100% more SATA disk capacity: 2 TB SATA disks
- 125% more memory: 72 GB per DB node
- 100% more Ethernet connectivity: 4 Ethernet links per DB node
Plus flash storage!

Speaker notes (Exadata Simulation): For a given workload, you can now simulate the possible benefits in I/O interconnect throughput that can be obtained from migration to the Exadata architecture. SQL Performance Analyzer, a feature of Oracle Real Application Testing, allows the simulation to be performed on a non-Exadata installation without needing to provision an Exadata system. The SQL Performance Analyzer Exadata simulation feature can be used to identify workloads that are good candidates for Exadata migration, simplifying simulation and testing of workloads without requiring provisioning of Exadata hardware.

Exadata: Database Processing in Storage
Exadata storage servers implement data-intensive processing in storage:
- Row filtering based on "where" predicates
- Column filtering
- Join filtering
- Incremental backup filtering
- Storage indexing
- Scans on encrypted data
- Data mining model scoring
A 10x reduction in data sent to the DB servers is common. No application changes are needed: processing is automatic and transparent, even if a cell or disk fails during a query.

Simple Query Example
Q: What were my sales yesterday?

Select sum(sales) from sales where sales_date = '24-Sept';

- The Oracle Database Grid optimizer chooses partitions and indexes to access
- The Exadata Storage Grid scans compressed blocks in those partitions/indexes and retrieves the sales amounts for Sept 24
- The Oracle Database Grid computes the SUM
Result: 10 TB scanned, only 1 GB returned to the servers.

Query Throughput on Uncompressed Data
Flash storage more than doubles scan throughput: 50 GB/sec.
- Smart: knows when to avoid caching, based on object reusability and size; accepts user directives at the table, index and segment level
- Combined with columnar compression: up to 50 TB of data fits in flash, and queries on compressed data run at up to 500 GB/sec
(Chart: scan throughput of the Sun Oracle Database Machine vs. Hitachi USP V, Teradata 2550 and Netezza TwinFin 12)

Speaker notes: Exadata Smart Flash Cache intelligently caches data from the Oracle Database, replacing slow mechanical I/O operations to disk with very rapid flash memory operations. Oracle uses flash PCIe cards in Exadata V2, not flash disks; by using flash PCIe cards, Oracle's solution doesn't have a slow disk controller limiting flash performance. Exadata storage delivers close to 1 GB/sec of throughput from each flash card and scales that performance linearly across the 4 cards in every Exadata Storage Server. The Oracle flash cache is smart because it knows when to avoid trying to cache data that will never be reused or will not fit in the cache. Oracle also allows the user to provide directives at the database table, index and segment level to ensure that specific application data is kept in flash. Tables can be moved in and out of flash with a simple command, without the need to move the table to different tablespaces, files or LUNs as you would with traditional storage using flash disks. In Exadata V2, Oracle introduced a new generation of compression technology called Exadata Hybrid Columnar Compression that enables much more compression than was ever possible before.

Exadata Hybrid Columnar Compression
Data is stored by column and then compressed; decompression is offloaded to Exadata, eliminating CPU overhead on the database servers.
- Query mode, for data warehousing: optimized for speed; a 10x compression ratio is typical, and scans improve proportionally
- Archival mode, for infrequently accessed data: optimized to reduce space; 15x compression is typical, up to 50x for some data

Speaker notes (Smart Scan of Hybrid Columnar Compressed Tables): Another new feature of Oracle Database 11g Release 2 is Hybrid Columnar Compressed Tables. These tables offer a high degree of compression for data that is bulk loaded and queried. Smart Scan processing of Hybrid Columnar Compressed Tables is provided: column projection and filtering are performed within Exadata, and decompression is likewise offloaded to Exadata, eliminating CPU overhead on the database servers. Given the typical ten-fold compression of Hybrid Columnar Compressed Tables, this effectively increases the I/O rate ten-fold compared to uncompressed data.
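As a rough sketch, hybrid columnar compression is declared at the table (or partition) level with the 11gR2 COMPRESS FOR syntax; the table and column names below are illustrative assumptions, not from this deck:

```sql
-- Hypothetical EHCC DDL sketch (requires Exadata storage);
-- table and column names are made up for illustration.

-- Warehouse / query mode: optimized for scan speed (~10x typical)
CREATE TABLE sales_hist (
  sale_id    NUMBER,
  sale_date  DATE,
  amount     NUMBER
) COMPRESS FOR QUERY HIGH;

-- Archival mode: optimized for space (~15x typical, up to 50x)
CREATE TABLE sales_archive (
  sale_id    NUMBER,
  sale_date  DATE,
  amount     NUMBER
) COMPRESS FOR ARCHIVE HIGH;
```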

Exadata Storage Index: Transparent I/O Elimination with No Overhead
- Exadata storage indexes maintain summary information about table data in memory: the MIN and MAX values of columns, typically one index entry for every MB of disk
- Disk I/O is eliminated when the MIN and MAX can never match the "where" clause of a query
- Completely automatic and transparent
Example: with columns A, B, C, D and values 1, 3, 5, 8 in column B, one storage region has Min B = 1, Max B = 5 and the next has Min B = 3, Max B = 8; for "select * from Table where B < 2", only the first set of rows can match.

Speaker notes: With Oracle Database 11g Release 2, several new powerful Smart Scan and offload capabilities are provided with Exadata storage: storage indexing, Smart Scan offload of the new Hybrid Columnar Compressed Tables, Smart Scan offload of encrypted tablespaces and columns, and offload of data mining model scoring. Storage indexes are a very powerful capability that helps avoid I/O operations. The Exadata Storage Server Software creates and maintains a storage index in Exadata memory, keeping track of the minimum and maximum values of columns for tables stored on that cell. When a query specifies a WHERE clause, but before any I/O is done, the Exadata software examines the storage index to determine whether rows with the specified column value exist in the cell by comparing the value to the maintained minimum and maximum. If the column value is outside that range, scan I/O for that query is avoided. Many SQL operations run dramatically faster because large numbers of I/O operations are automatically replaced by a few in-memory lookups. To minimize operational overhead, storage indexes are created and maintained transparently and automatically by the Exadata Storage Server Software.

Benefits Multiply
- 10 TB of user data requires 10 TB of I/O
- 1 TB after compression
- 100 GB after partition pruning
- 20 GB after storage indexes
- 5 GB after Smart Scan, served from memory or flash
Result: data is 10x smaller, scans are 2000x faster; sub-second response on the Database Machine.

Database Machine Success
The Database Machine is succeeding in all geographies and industries, against every competitor.

Oracle's Data Integration Strategy: Pervasive Data Integration
- Embed data integration within Oracle Database: integrated, optimized and best for Oracle Database; the easiest way to load external information into Oracle Database
- Provide comprehensive data integration: a comprehensive heterogeneous technology foundation with integrated runtime, data management tools and administration
- Best of breed: significant architectural differentiators vs. competitors
- "Hot pluggable": broad support of sources and packaged applications
- Pre-integrated solutions for the Oracle portfolio
Goal: make data integration pervasive, with lower cost and complexity.

ODI Enterprise Edition Bundling: Oracle Data Integrator and Oracle Warehouse Builder
E-LT transformation vs. E-T-L:
- Real-time change data capture from OLTP and application sources
- High-speed batch data movement from legacy sources
- Set-based data transformations in the data warehouse and planning system

OWB 11g Release 2
Extensible heterogeneous connectivity:
- Connect to any JDBC, ODBC or gateway-enabled data store
- Enhanced support for change data capture and real-time extracts
- Advanced ERP/CRM/MDM connectors
- Call any web service as a data provider; publish extracts as web services
High-performance native code for any platform:
- Use extensible code templates to create native code
- Leverage the database engine for best performance
- Use native unloaders/loaders for extremely fast bulk loading
Enhanced developer productivity:
- New declarative user interface based on the Fusion Client Platform
- Rich metadata support for end-to-end data lineage and impact analysis
- Oracle BI EE metadata generation

Speaker notes: The ODI EE license bundle includes the OWB product (licensed ETL components) such as code templates. For Oracle Business Intelligence Enterprise Edition (OBI EE), this feature allows derivation of ready-to-use physical, business model and presentation layer metadata from a data warehouse design; visualization and maintenance of the derived objects from within OWB; and deployment of the derived objects in the form of an RPD file that can be loaded into OBI EE. MDM = Master Data Management.

High-Level Roadmap
- Unified platform that is a superset of OWB and ODI: unified dev team, common pricing, targeted for "CY 2012"
- OWB/ODI investments are fully protected: no forced migrations, a natural upgrade path, no regressions in functionality
- Training and support continue for both product lines
(Timeline: OWB 10gR1, 10gR2, 11gR1, 11gR2; ODI 10gR3, 11gR1, 11gR2)

Oracle Runs the Largest Databases
- Website personalization: 500,000,000 unique users, 200-terabyte data warehouse
- Meteorological research: 220+ terabyte Oracle database, the world's largest database on Linux
How are these companies successful using Oracle?

Star Query Optimization: Data-Warehouse-Specific Access Methods
Q: What was the total number of umbrellas sold in Boston during the month of May 2008?
(Star schema: a Sales fact table joined to the Customers, Times, Products and Channel dimensions)
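A star query of that shape can be sketched as below; the table and column names are assumptions for illustration, and with STAR_TRANSFORMATION_ENABLED = TRUE the optimizer can rewrite such a query to use bitmap indexes on the fact table's dimension keys:

```sql
-- Hypothetical star query; fact/dimension names are illustrative only.
SELECT SUM(s.quantity_sold)
FROM   sales s
       JOIN products  p ON s.product_id = p.product_id
       JOIN customers c ON s.cust_id    = c.cust_id
       JOIN times     t ON s.time_id    = t.time_id
WHERE  p.product_name   = 'Umbrella'
AND    c.city           = 'Boston'
AND    t.calendar_month = '2008-05';
```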

Oracle Partitioning
- Optimized performance: comprehensive partitioning strategies to address business problems
- One consistent way to manage all your data: not just for data warehouses and high-end OLTP; new partitioning features bring partitioning to every application
- Reduced total cost of ownership: place less-used data on lower-cost storage
- Proven functionality, now in its 8th generation: experience comes with age and customer usage

The Concept of Partitioning: Simple yet Powerful
- A large SALES table is difficult to manage
- Partitioned (e.g. by month: Jan, Feb, ...): divide and conquer; easier to manage, improved performance
- Composite partitioned (e.g. by region and month: Europe/USA x Jan/Feb): higher performance, more flexibility to match business needs
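The partitioned and composite-partitioned SALES tables above could be declared roughly as follows; the table name matches the slide, but the columns and partition boundaries are illustrative assumptions:

```sql
-- Hypothetical composite (range-list) partitioning sketch
-- mirroring the slide's region x month example.
CREATE TABLE sales (
  sale_date DATE,
  region    VARCHAR2(10),
  amount    NUMBER
)
PARTITION BY RANGE (sale_date)
SUBPARTITION BY LIST (region)
SUBPARTITION TEMPLATE (
  SUBPARTITION europe VALUES ('EUROPE'),
  SUBPARTITION usa    VALUES ('USA')
)
( PARTITION p_jan VALUES LESS THAN (TO_DATE('01-02-2009','DD-MM-YYYY')),
  PARTITION p_feb VALUES LESS THAN (TO_DATE('01-03-2009','DD-MM-YYYY'))
);
```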

Partition for Performance: Partition Pruning
Q: What was the total sales amount for May 20 and May 21, 2009?

Select sum(sales_amount)
From SALES
Where sales_date between to_date('05/20/2009','MM/DD/YYYY')
  and to_date('05/21/2009','MM/DD/YYYY');

With the SALES table partitioned by day (5/17, 5/18, 5/19, 5/20, 5/21, 5/22, ...), only the 2 relevant partitions are read.

Partition to Manage Data Growth
SALES table (7 years of data, 2003-2009):
- 95% less-active data (older years) on a low-end storage tier, 2-3x cheaper per terabyte
- 5% active data (current) on a high-end storage tier

Partitioning: 11g Enhancements
- Partition Advisor
- New composite combinations: list/range, range/range, list/hash, list/list
- Interval partitioning: automatic creation of range-based partitions
- REF partitioning: partition a detail table based on the master table's key
- Virtual-column-based partitioning: partition based on an expression

Interval Partitioning
- Minimizes periodic partition maintenance: no need to pre-create new partitions; partition segments are allocated as soon as new data arrives, and local indexes are created and maintained as well
- Requires at least one range partition; its range key value determines the range high point
- The partitioning key can only be a single column, of either DATE or NUMBER datatype

CREATE TABLE sales(…)
PARTITION BY RANGE (sales_date)
INTERVAL(NUMTOYMINTERVAL(1, 'MONTH'))
( PARTITION p1 VALUES LESS THAN (TO_DATE('1-2-2006', 'DD-MM-YYYY')) );

ILM: Information Lifecycle Management
Why bother? Compliance, performance, cost and data maintenance. Typically 5% of data is most active, 35% is less active, and 60% is historical.

ILM: Information Lifecycle Management
Implement different tiers of storage. Consider the Oracle ILM Assistant (free!):
- Leverages Oracle Partitioning
- Uses lifecycle definitions
- Calculates storage costs and savings
- Simulates the impact of partitioning on a table
- Advises how to partition a table
- Generates scripts to move data when required

In-Memory Parallel Execution (New)
Challenge: traditionally, parallel execution takes advantage of the I/O capacity of a system, but disk speeds are not keeping up with Moore's law while CPU and memory are.
Solution: In-Memory Parallel Execution harnesses the memory capacity of the entire system; an affinity algorithm places fragments of an object in memory on different RAC nodes.

Speaker notes (1.3.4.1 In-Memory Parallel Execution): Traditionally, parallel execution has enabled organizations to manage and access large amounts of data by taking full advantage of the I/O capacity of the system. In-memory parallel execution harnesses the aggregated memory in a system to enhance query performance by minimizing or even completely eliminating the physical I/O needed for a parallel operation. Oracle automatically decides whether an object being accessed using parallel execution benefits from being cached in the SGA (buffer cache); the decision is based on a well-defined set of heuristics including the size of the object and the frequency with which it is accessed. In an Oracle RAC environment, Oracle maps fragments of the object into each of the buffer caches on the active instances. By creating this mapping, Oracle knows which buffer cache to access to find a specific part or partition of an object to answer a given SQL query. In-memory parallel query thus scales out with the available memory for data caching as the number of nodes in a cluster increases, because large parallel operations can now be satisfied in memory.
Speaker notes (1.3.4.3 The DBMS_PARALLEL_EXECUTE Package): The DBMS_PARALLEL_EXECUTE package provides subprograms to allow a specified INSERT, UPDATE, DELETE, MERGE, or anonymous block statement to be applied in parallel chunks. The statement must have two placeholders that define the start and end limits of a chunk; typically these are values for the rowid or a surrogate unique key in a large table, but when an anonymous block is used, the block can interpret the values arbitrarily. The package has subprograms to define the ranges that cover the specified table, including rule-based division of a table's rowid or key range, and it supports user-defined methods. The SQL statement together with the set of chunk ranges defines a task, and another subprogram starts the task. Each task is processed using a scheduler job and automatically commits when it completes. Progress is logged, and untried, successful, and failed chunks are flagged as such on task completion or interruption; another subprogram allows the task to resume and retry untried and failed chunks. Many scenarios require the bulk transformation of a large number of rows, and an ordinary SQL statement suffers from the all-or-nothing effect. In the common case, where the transformation of one row is independent of that of other rows, it is correct to commit every row that is transformed successfully and to roll back every row where the transformation fails. Some customers have implemented such schemes from scratch, using the Oracle Scheduler and suitable methods to record progress; this package provides a supported solution and adds database-wide manageability through new catalog views for parallel task metadata. The package is especially useful in online application upgrade scenarios, to apply a cross-edition trigger to all the rows in the table on which it is defined.
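The chunked-update pattern described above can be sketched as a PL/SQL block; the task name, table name, chunk size and UPDATE statement below are illustrative assumptions, not from the deck:

```sql
-- Hypothetical DBMS_PARALLEL_EXECUTE sketch: update a large table
-- in independently committed rowid chunks.
BEGIN
  DBMS_PARALLEL_EXECUTE.CREATE_TASK(task_name => 'upd_task');

  -- Divide the table into rowid ranges of roughly 10,000 rows each
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(
    task_name   => 'upd_task',
    table_owner => USER,
    table_name  => 'BIG_SALES',
    by_row      => TRUE,
    chunk_size  => 10000);

  -- The statement must carry the :start_id / :end_id placeholders
  DBMS_PARALLEL_EXECUTE.RUN_TASK(
    task_name      => 'upd_task',
    sql_stmt       => 'UPDATE big_sales SET status = ''DONE''
                       WHERE rowid BETWEEN :start_id AND :end_id',
    language_flag  => DBMS_SQL.NATIVE,
    parallel_level => 4);

  DBMS_PARALLEL_EXECUTE.DROP_TASK('upd_task');
END;
/
```

Each chunk commits on its own, so a failed chunk can be retried with RESUME_TASK without rolling back the rows already transformed.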

In-Memory Parallel Execution: caching decisions
For each SQL statement, Oracle determines the size of the table being accessed:
- Extremely small table: read into the buffer cache on any node
- Good candidate for In-Memory Parallel Execution: fragments of the table are read into each node's buffer cache, and only parallel servers on the same RAC node access each fragment
- Extremely large table: always use direct reads from disk

In-Memory Parallel Execution: 1 TB TPC-H World Record (QphH)
- A single Database Machine has over 400 GB of memory usable for caching
- Database release 11.2 introduces parallel query processing on memory-cached data, harnessing the memory capacity of the entire database cluster for queries; this was the foundation for the world-record 1 TB TPC-H result
- Exadata Hybrid Columnar Compression enables multi-terabyte tables or partitions to be cached in memory
- The in-memory execution algorithms cache partitions in memory on different DB nodes; parallel servers (aka PQ slaves) then execute on the corresponding nodes
- Faster than in-memory specialized startups: memory has 100x more bandwidth than disk
(Prior record: TPC-H 1 TB, 11gR1 on Superdome, 04/29/09; 64 cores, 768 disks of 146 GB at 15K RPM)
Source: Transaction Processing Council, as of 9/14/2009: Oracle on HP BladeSystem c-Class 128P RAC, 1,166,976 QphH@1000GB, $5.42/QphH@1000GB, available 12/1/09. Exasol on PRIMERGY RX300 S4, 1,018,321 QphH@1000GB, $1.18/QphH@1000GB, available 08/01/08. ParAccel on SunFire X4100, 315,842 QphH@1000GB, $4.57/QphH@1000GB, available 10/29/07.

Automatic Degree of Parallelism (Auto DOP)
- The optimizer derives the DOP for a statement based on the resource requirements of all its scan operations
- Applies to all types of statements: query, DML, or DDL
- Explain plan has been enhanced to show the DOP selected
- SQL Tune now uses Auto DOP to recommend parallelism

Speaker notes (1.3.4.2 Minimal Effort Parallel Execution - Auto Degree of Parallelism (DOP) and Queuing): When activated, Oracle determines the optimal degree of parallelism for any given SQL operation based on the size of the objects, the complexity of the statement, and the existing hardware resources. The database compensates for wrong or missing user settings for parallel execution, ensuring more optimal resource consumption and overall system behavior. (See also 1.3.4.3, The DBMS_PARALLEL_EXECUTE Package, above.)
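As a minimal sketch, Auto DOP is switched on with a single initialization parameter; the table in the per-statement example is an illustrative assumption:

```sql
-- Sketch: enabling Auto DOP (11gR2). Setting the policy to AUTO also
-- enables parallel statement queuing and in-memory parallel execution;
-- LIMITED enables only automatic DOP computation.
ALTER SYSTEM SET parallel_degree_policy = AUTO;

-- Per-statement alternative (hypothetical query): let the optimizer
-- compute the DOP for just this statement.
SELECT /*+ PARALLEL(AUTO) */ SUM(amount) FROM sales;
```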

Parallel Statement Queuing
- A SQL statement is parsed, and Oracle automatically determines its DOP
- If enough parallel servers are available, the statement executes immediately
- If not, the statement waits in a FIFO queue
- When the required number of parallel servers becomes available, the first statement in the queue is dequeued and executed
(Diagram: statements requesting DOPs of 8, 16, 32, 64 and 128 flowing through the FIFO queue)
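The point at which queuing begins is governed by an initialization parameter; the value below is illustrative, not a recommendation:

```sql
-- Sketch: with parallel_degree_policy = AUTO, a statement is queued
-- once the number of busy parallel servers would exceed this target
-- (the value 128 here is purely illustrative).
ALTER SYSTEM SET parallel_servers_target = 128;
```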

Summary Management: Materialized Views
- A separate database object that stores pre-calculated and aggregated results
- The database supports sophisticated, transparent query rewrite: queries don't change, they still run against the base tables; join-backs, additional aggregations, etc. are supported
- The database supports incremental fast refresh, based on the query definition
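The three bullets above can be sketched in one example; the table, columns and MV name are illustrative assumptions:

```sql
-- Hypothetical summary MV eligible for query rewrite and incremental
-- fast refresh. Fast refresh requires an MV log on the base table:
CREATE MATERIALIZED VIEW LOG ON sales
  WITH ROWID, SEQUENCE (region_id, amount)
  INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW sales_by_region_mv
  BUILD IMMEDIATE
  REFRESH FAST ON COMMIT
  ENABLE QUERY REWRITE
AS
SELECT region_id,
       SUM(amount)   AS total_amount,
       COUNT(amount) AS cnt_amount,   -- required for fast refresh of SUM
       COUNT(*)      AS cnt
FROM   sales
GROUP  BY region_id;

-- Queries stay unchanged; the optimizer can rewrite this transparently
-- to read the MV instead of the base table:
-- SELECT region_id, SUM(amount) FROM sales GROUP BY region_id;
```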

Enhanced MV Refresh
Fast refresh of a materialized view is now significantly faster due to less time spent on log handling. This means significantly reduced maintenance time, and more fast refreshes become possible: on average, 30-40% better refresh performance in 11g Release 2.
(Chart: refresh time for a conventional-insert aggregate MV; MAV_q5, TPC-D schema, 30 GB)

Summary Management Today
Query options: query the star schema (fact + dimensions) directly, build mviews on the star schema, or build cubes.
(Diagram: SQL queries against a relational star schema with Region, Date, Product and Channel dimensions are rewritten to summaries: sales by region, by date, by product, by channel. Problem: too many combinations to cover with individual summaries.)

Cube Organized Materialized Views
SQL queries are rewritten to summaries stored in an OLAP cube (Region, Date, Product and Channel dimensions), with automatic refresh of the cube.

Cubes and Cube Organized MViews
- Stored in special database areas called Analytic Workspaces (which are stored in BLOBs) and manipulated via the Analytic Workspace Manager tool
- 11g gives the best of both worlds: the rewrite and refresh features of regular MVs, plus the performance benefits of OLAP cubes
- The OLAP cube is exposed as a relational object and accessed via SQL: the CUBE_TABLE function searches the cube using SQL

Cubes and Cube Organized MViews
Query the cube as if it were a relational object:

SQL> explain plan for
  2  select * from table(cube_table('GLOBAL.PRICE_CUBE'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
Plan hash value: 3184667476
--------------------------------------------------------------------------------------
| Id | Operation               | Name       | Rows | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT        |            | 2000 |  195K |    29   (0)| 00:00:01 |
|  1 | CUBE SCAN PARTIAL OUTER | PRICE_CUBE | 2000 |  195K |    29   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Refresh using DBMS_MVIEW.REFRESH.

Customer Case Study: Reporting Application
- Cube-organized MVs replaced table-based MVs
- Time to build aggregate data reduced by 89%
- Longer-running queries reduced from 5 minutes to 12 seconds
- Transparent access to the cube-MV: no changes to the reporting applications

Advanced Compression
Table compression:
- Table scan performance: 2x faster
- Storage savings: 2x smaller
- DML performance: 5% slower

CREATE TABLE SALES_FACT (…) COMPRESS FOR ALL OPERATIONS;

RMAN compression:
- ~40% faster than compressed backups in 10g, with a slightly better compression ratio

RMAN> CONFIGURE COMPRESSION ALGORITHM 'zlib';
RMAN> backup as COMPRESSED BACKUPSET database archivelog all;

Data Pump compression:

expdp hr FULL=y DUMPFILE=dpump_dir:full.dmp COMPRESSION=ALL


SQL Query Result Cache
- Caches query results or PL/SQL function results
- DML/DDL against dependent database objects invalidates the cache
- Candidate queries: access many, many rows; return few rows (a small result set); are executed many times

SQL Query Result Cache
result_cache_mode init.ora parameter:
- AUTO: the optimizer uses repetitive executions to determine whether a query will be cached
- MANUAL: queries must use the /*+ RESULT_CACHE */ hint
- FORCE: all results are stored in the cache
result_cache_max_size init.ora parameter:
- The default depends on other memory settings (0.25% of memory_target, 0.5% of sga_target, or 1% of shared_pool_size)
- 0 disables the result cache
- Never more than 75% of the shared pool (a built-in restriction)
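Putting the two parameters together, a MANUAL-mode session might look like this sketch; the query and table are illustrative assumptions:

```sql
-- Sketch: result cache in MANUAL mode; only hinted queries are cached.
ALTER SESSION SET result_cache_mode = MANUAL;

-- First execution computes and caches the small aggregate result;
-- identical re-executions are served from the cache until DML on
-- SALES invalidates the cached result.
SELECT /*+ RESULT_CACHE */ region_id, SUM(amount)
FROM   sales
GROUP  BY region_id;
```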

Q&A
Thank you for attending. If you have follow-up questions, I will be here for the rest of the day, or I can be contacted by email: andreas.katsaris@arisant.com
(Speaker notes: dba_tab_modifications, EXPORT/IMPORT_TABLE_STATS)