Turbocharge SQL Performance with Oracle Database 12c Philip Moore Senior Data Architect and Developer.



WHO IS 84.51°
We believe in making people’s lives easier by putting the customer at the center of everything we do.
© 84.51° 2015 | Public

84.51° HELPS COMPANIES PUT THE CUSTOMER AT THE CENTER OF EVERY DECISION
• With insights from the data, we personalize the experience, making your business more relevant to the consumers who matter most.
• Second, we work with you to change the organization, embedding the principle of shopper first.
• In so doing, we grow measurable brand value for our clients.

PLACING THE CUSTOMER AT THE CENTER WORKS
[Chart with a percentage axis, annotated “JV Partnership began in …”; the year and the plotted values were lost in extraction.]

WHO AM I?
Philip Moore
Exadata Data Warehouse Architect / Lead Oracle Developer, 84.51°
• 15+ years of Oracle data warehousing (since Oracle 8i)
• 5+ years with Oracle Exadata (v2, x2, x3)
• Oracle Certified Professional
  o PL/SQL Developer
  o 10g DBA
• Oracle SQL Certified Expert
• Experience with: SQL Server, Teradata, Netezza, Greenplum, PostgreSQL

Before we start on 12c…
Let’s talk about 84.51°’s Oracle Data Warehouse

WE USE ORACLE EXADATA
• Oracle Exadata runs the heart of our Data Warehouse. It has enabled us to deliver things we could not do before.
• We use an Active-Active Disaster Recovery configuration, with a dual ETL/ELT strategy that loads data into both our Primary site and our D/R and Products site.
[Diagram labels: Primary Site; D/R and Products Site; ETL (feeding both sites); Analyst ad-hoc queries (no SLA); Loyalty Campaign Mailer; Products: The Shop; Pre-canned queries with SLA]

DATA WAREHOUSE POINT OF SALE DATA MODEL
Logically, we use a traditional Kimball star schema. Example client model:
• Fact Table: 200+ billion rows
• Customers: 300+ million rows
• Products: 8+ million rows
• Stores: 15,000+ rows
• Dates: 4,700+ rows
• Terminals: 500+ rows

PHYSICAL MODEL STRATEGIES FOR EXADATA
Physically, we use the following strategies:
• Proper data types for our data
  o NUMBER for Numbers
  o DATE for Dates
  o VARCHAR2 for Characters
• NOT NULL and CHECK constraints to validate data and help the CBO
For Dimensions
• Oracle Advanced Compression (COMPRESS FOR OLTP)
• Primary Keys are enforced, in RELY mode
• Oracle DIMENSION objects
For Facts
• Exadata Hybrid Columnar Compression (HCC): QUERY HIGH
• We will put the latest 2 years (“Hot” data) into the In-Memory Column Store
• Partition by RANGE on our date field, 1 week per partition
• “Soft” Referential Integrity for Foreign Key relationships

Now Let’s Talk Oracle 12c
New Features that will speed up the Data Warehouse

12c – New Data Warehousing Features
These new features will accelerate your Data Warehouse SQL queries:
• Oracle In-Memory Column Store
• In-Memory Aggregation (Vector Group By)
• Attribute Clustering
• Adaptive Optimization
• Approximate Aggregation

ADAPTIVE OPTIMIZATION
The Cost-Based Optimizer just got a whole lot smarter!
• Query plans are no longer set in stone!
• Before 12c, the Optimizer’s decision was locked: no matter how bad its cardinality estimates turned out to be for each phase of execution, it continued down the ill-advised path.
• The Optimizer can now change path in mid-query execution for:
  o Join Paths: this helps recover from, for example, stale stats that cause nested loop joins with large driving tables.
  o Parallel Query Distribution Methods (Broadcast / Hash): this helps your query recover from high skew, preventing an unbalanced …
• This feature can take queries that run for hours in 11gR2 and run them in minutes or less!
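
The join-path case above can be pictured with a small Python sketch. It is a toy model under stated assumptions, not Oracle's implementation: the function name `adaptive_join`, the `threshold` parameter, and the dictionary that stands in for an index are all invented for illustration. The idea shown is only that the join method is chosen after observing how many driving rows actually arrive, rather than from an estimate made beforehand.

```python
from itertools import islice


def adaptive_join(driving_rows, lookup_index, inner_rows, key, threshold=3):
    """Toy adaptive join: pick the join method after seeing real row counts.

    driving_rows: iterable of dicts from the driving (outer) table
    lookup_index: dict mapping key value -> list of inner rows (stands in for an index)
    inner_rows:   full list of inner rows (scanned once if a hash join is chosen)
    threshold:    hypothetical row count above which nested loops stop paying off
    """
    rows = iter(driving_rows)
    # Buffer threshold + 1 rows so we can tell whether the threshold was exceeded.
    buffered = list(islice(rows, threshold + 1))

    if len(buffered) <= threshold:
        # Few driving rows: probe the index once per driving row (nested loops).
        method = "nested_loops"
        result = [
            {**outer, **inner}
            for outer in buffered
            for inner in lookup_index.get(outer[key], [])
        ]
    else:
        # Many driving rows: hash the driving side, then scan the inner table once.
        method = "hash_join"
        hash_table = {}
        for outer in buffered:
            hash_table.setdefault(outer[key], []).append(outer)
        for outer in rows:  # drain the rows that were not buffered
            hash_table.setdefault(outer[key], []).append(outer)
        result = [
            {**outer, **inner}
            for inner in inner_rows
            for outer in hash_table.get(inner[key], [])
        ]
    return method, result
```

With one driving row the sketch stays on nested loops; with more rows than the threshold it switches to the hash join without restarting the query, which is the behavior the slide describes at a high level.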

ORACLE IN-MEMORY COLUMN STORE
Revolutionizing Analytical Query Performance
• Data is organized in a columnar, compressed format
• Ideal for Data Warehousing (Decision Support Systems) with analytical workloads
• Oracle In-Memory excels at:
  o Predicate Filtering (due to SIMD instructions and IMCU “pruning”)
  o Bloom Filters for large joins
  o Full Table Scans
  o Additive Aggregation (hopefully non-additive as well soon)
• You will typically see a 10-50x performance improvement for non-Exadata, and about 5-10x with Exadata. Note: with Exadata, In-Memory doesn’t shine until you have a highly concurrent workload…
• With Engineered Systems, you can choose how the data is spread across RAC Nodes:
  o Distribution (default): best for fact tables
  o Duplication: consider this for Dimension tables
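
To make the IMCU “pruning” point concrete, here is a minimal Python sketch of min/max chunk skipping on a single column. The class `ColumnChunk` and the functions `build_chunks` and `count_equal` are hypothetical names for this example; the sketch only assumes that each in-memory unit keeps a min/max summary of its values and says nothing about SIMD or compression.

```python
from dataclasses import dataclass


@dataclass
class ColumnChunk:
    """Hypothetical stand-in for one in-memory unit holding values of one column."""
    values: list

    def __post_init__(self):
        # Min/max summary stored with the chunk so whole chunks can be skipped.
        self.min_value = min(self.values)
        self.max_value = max(self.values)


def build_chunks(column, chunk_size):
    """Split a column into fixed-size chunks, each with its own min/max summary."""
    return [
        ColumnChunk(column[i:i + chunk_size])
        for i in range(0, len(column), chunk_size)
    ]


def count_equal(chunks, target):
    """Count rows equal to target; return (matches, chunks_actually_scanned)."""
    matches = 0
    scanned = 0
    for chunk in chunks:
        if target < chunk.min_value or target > chunk.max_value:
            continue  # pruned: the summary proves no row here can match
        scanned += 1
        matches += sum(1 for v in chunk.values if v == target)
    return matches, scanned
```

On a column whose values arrive roughly in order, an equality filter touches only the chunks whose range covers the target, which is why data ordering matters so much for this kind of pruning.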

IN-MEMORY AGGREGATION
A Smarter way to Aggregate
• Also known as “I.M.A.” or “Vector Group By”
• The methodology was taken from Oracle OLAP and fitted into In-Memory, making it usable from SQL for the first time
• The data is aggregated as it is scanned, reducing TEMP usage to zero in most cases and allowing for blazing fast aggregation of additive measures
• The decision is Cost-Based, and by default only kicks in if your fact table has 10 million or more rows
• You can force Vector Group By in your star aggregation query by using the “/*+ vector_transform */” hint
• We have seen a 16x performance improvement with Vector Group By compared to the “traditional” Hash Group By aggregation method
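
The “aggregated as it is scanned” idea can be sketched in Python as a two-phase pass. This is an illustration only, assuming one dimension and one additive measure; the function `vector_group_by` and its parameter names are made up for the example and do not reflect Oracle internals.

```python
def vector_group_by(fact_rows, dim_rows, dim_key, group_attr, measure):
    """Toy single-dimension 'vector group by' over lists of dicts."""
    # Phase 1: scan the dimension and give each grouping value a dense integer id.
    group_ids = {}   # grouping value -> dense id
    key_vector = {}  # dimension key  -> dense id
    for d in dim_rows:
        gid = group_ids.setdefault(d[group_attr], len(group_ids))
        key_vector[d[dim_key]] = gid

    # Phase 2: scan the fact once, adding each measure into a preallocated array.
    totals = [0] * len(group_ids)
    for f in fact_rows:
        gid = key_vector.get(f[dim_key])
        if gid is not None:  # fact rows without a matching dimension key are dropped
            totals[gid] += f[measure]

    return {value: totals[gid] for value, gid in group_ids.items()}
```

Because the accumulator is sized by the number of groups rather than the number of fact rows, nothing needs to spill to temporary storage in this toy version, which mirrors the TEMP reduction the slide mentions.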

ATTRIBUTE CLUSTERING
Storing similar “fact” rows together
• For the first time, the way data is stored and sorted in a table (or partition) is declarative
• Any direct-path INSERTs or “ALTER TABLE … MOVE” commands which load or reorganize the table will obey the table-level attribute clustering directives (stored in the data dictionary)
• Storing data that logically belongs together improves performance, compression ratios, and concurrency for users of your system
• You can use “Join Attribute Clustering” to organize your fact data by dimensional attributes. This can greatly speed up star queries.
• You can choose between two options:
  o Linear (basic ORDER BY): allows for better compression, benefitting In-Memory as well as Oracle Exadata Hybrid Columnar Compression
  o Interleaved (a.k.a. “Z-Order Curve Fitting”): ideal for hierarchical aggregation via star queries
• Can be combined with Zone Maps (on Exadata) to facilitate zone pruning and dramatically reduce I/O as a result

ATTRIBUTE CLUSTERING
Example, from the Oracle 12c Data Warehousing Guide documentation
• For Linear-Ordered tables, the data is simply sorted in the order of the column(s) you specify
• For Interleaved-Ordered tables, the data is arranged so that it is contiguous in a multi-dimensional manner, meaning that contiguous regions contain data for similar “Country” and “Category” values
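
The difference between the two orderings can be shown with a short Python sketch. It assumes the two clustering columns have already been mapped to small non-negative integers (for example, a country code and a category code); the function names are hypothetical, and the bit-interleaving shown is the textbook Z-order (Morton) construction, not necessarily the exact function Oracle uses.

```python
def interleave_bits(x, y, bits=16):
    """Return the Z-order (Morton) value for two non-negative integers."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # bits of x go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)   # bits of y go to odd positions
    return z


def linear_order(rows):
    """Linear clustering: plain sort on (country, category)."""
    return sorted(rows)


def interleaved_order(rows):
    """Interleaved clustering: sort on the Z-order value of (country, category)."""
    return sorted(rows, key=lambda r: interleave_bits(r[0], r[1]))
```

In the linear order, rows sharing a category are scattered across every country; in the interleaved order, nearby positions hold rows that are close on both columns, so a filter on either column can skip contiguous regions.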

APPROXIMATE AGGREGATION
Using the brilliant HyperLogLog algorithm
• This technique helps solve the difficult “Count Distinct Problem” (see: [link not preserved])
• HyperLogLog provides about 97%+ accuracy while using far fewer resources than traditional “COUNT (DISTINCT expr)” aggregation
• HyperLogLog is distributable, meaning that its results can be merged between parallel slaves, effectively making Distinct Counts “additive” in nature
• APPROX_COUNT_DISTINCT provides a nearly 10x query performance improvement when used for star queries which perform Distinct Counts with hierarchical aggregation
• The more Distinct Counts you do in your query, the more it shines
• This feature is near and dear to my heart: after learning about HyperLogLog, I requested (via an SR enhancement request in “My Oracle Support”) that Oracle adopt the “APPROX_COUNT_DISTINCT” function based upon Flajolet’s algorithm
• Oracle put the customer first and graciously added this revolutionary new feature
• Many more approximate features could (will) be coming to an Oracle release near you!
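
The “distributable” property is easiest to see in code. The following is a minimal Python sketch of a HyperLogLog sketch with a `merge` operation; the class layout, the choice of SHA-1 as the hash, and the default precision `p=12` are assumptions for illustration, and this is not the code behind APPROX_COUNT_DISTINCT.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog with 2**p registers (p >= 7 for the alpha formula used)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash taken from the first 8 bytes of SHA-1 (an arbitrary choice here).
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                # first p bits pick the register
        rest = h & ((1 << (64 - self.p)) - 1)   # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit within the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        """Combine two sketches built with the same p: register-wise maximum."""
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)  # small-range correction
        return raw
```

Two workers can each build a sketch over their own slice of the data, and merging the sketches gives exactly the same registers as one sketch built over all the data, even when the slices overlap. That is what lets a distinct count be rolled up a hierarchy instead of being recomputed at every level.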

THANK YOU!