INTRODUCING SQL SERVER 2012 COLUMNSTORE INDEXES Exploring and Managing SQL Server 2012 Database Engine Improvements.

Slides:



Advertisements
Similar presentations
Yukon – What is New Rajesh Gala. Yukon – What is new.NET Framework Programming Data Types Exception Handling Batches Databases Database Engine Administration.
Advertisements

SQL Server 2012 Data Warehousing Deep Dive Dejan Sarka, SolidQ
SQL SERVER 2012 XVELOCITY COLUMNSTORE INDEX Conor Cunningham Principal Architect SQL Server Engine.
Big Data Working with Terabytes in SQL Server Andrew Novick
Dos and don’ts of Columnstore indexes The basis of xVelocity in-memory technology What’s it all about The compression methods (RLE / Dictionary encoding)
Project Management Database and SQL Server Katmai New Features Qingsong Yao
Module 6 Implementing Table Structures in SQL Server ®2008 R2.
The Baker’s Dozen Business Intelligence 13 Tips for the SQL Server Columnstore Index Kevin S. Goff Microsoft SQL Server MVP.
Agenda 10 Key SQL 2012 BI Innovations BI Semantic Model Project ‘Apollo’ Vertipaq xVelocity in SQL 2012.
Making Data Warehouse Easy Conor Cunningham – Principal Architect Thomas Kejser – Principal PM.
Architecting a Large-Scale Data Warehouse with SQL Server 2005 Mark Morton Senior Technical Consultant IT Training Solutions DAT313.
Fast Track, Microsoft SQL Server 2008 Parallel Data Warehouse and Traditional Data Warehouse Design BI Best Practices and Tuning for Scaling SQL Server.
Module 9: Managing Schema Objects. Overview Naming guidelines for identifiers in schema object definitions Storage and structure of schema objects Implementing.
Module 8 Improving Performance through Nonclustered Indexes.
CS 345: Topics in Data Warehousing Tuesday, October 19, 2004.
Physical Database Design & Performance. Optimizing for Query Performance For DBs with high retrieval traffic as compared to maintenance traffic, optimizing.
Module 7 Reading SQL Server® 2008 R2 Execution Plans.
Ashwani Roy Understanding Graphical Execution Plans Level 200.
Columnstore Indexes in SQL Server 2012 Conor Cunningham Principal Architect, Microsoft SQL Server Representing Microsoft Development.
Module 5 Planning for SQL Server® 2008 R2 Indexing.
SQL Data Definition Language (DDL) Using Microsoft SQL Server 1SDL Data Definition Language (DDL)
Table Indexing for the.NET Developer Denny Cherry twitter.com/mrdenny.
Unit 6 Data Storage Design. Key Concepts 1. Database overview 2. SQL review 3. Designing fields 4. Denormalization 5. File organization 6. Object-relational.
Denny Cherry twitter.com/mrdenny.
1 Biometric Databases. 2 Overview Problems associated with Biometric databases Some practical solutions Some existing DBMS.
Data Types Lesson 4. Skills Matrix Table A table stores your data. Tables are relational in that they are organized as rows and columns (a matrix). Each.
SQL Server 2005 Implementation and Maintenance Chapter 3: Tables and Views.
Physical Database Design Purpose- translate the logical description of data into the technical specifications for storing and retrieving data Goal - create.
Chapter 4 Logical & Physical Database Design
Chapter 5 Index and Clustering
Session 1 Module 1: Introduction to Data Integrity
2012 © Trivadis BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN Welcome November 2012 Columnstore Indexes.
Relational Databases and MySQL. Relational Databases Relational databases model data by storing rows and columns in tables. The power of the relational.
October 15-18, 2013 Charlotte, NC Accelerating Database Performance Using Compression Joseph D’Antoni, Solutions Architect Anexinet.
Boosting DWH-Performance with SQL Server 2016 ColumnStore Index.
--A Gem of SQL Server 2012, particularly for Data Warehousing-- Present By Steven Wang.
SQLUG.be Case study: Redesign CDR archiving on SQL Server 2012 By Ludo Bernaerts April 16,2012.
Scott Fallen Sales Engineer, SQL Sentry Blog: scottfallen.blogspot.com.
Execution Plans Detail From Zero to Hero İsmail Adar.
Turbocharge your DW Queries with ColumnStore Indexes Susan Price Senior Program Manager DW and Big Data.

Diving into Query Execution Plans ED POLLACK AUTOTASK CORPORATION DATABASE OPTIMIZATION ENGINEER.
What is your Character Data Type? March 5, 2016 John Deardurff Website:
Doing fast! Optimizing Query performance with ColumnStore Indexes in SQL Server 2012 Margarita Naumova | SQL Master Academy.
Data Integrity & Indexes / Session 1/ 1 of 37 Session 1 Module 1: Introduction to Data Integrity Module 2: Introduction to Indexes.
Introduction to columnstore indexes Taras Bobrovytskyi SQL wincor nixdorf.
Columnstore Indexing: From SQL Server 2012 to SQL Server 2014
Module 2: Creating Data Types and Tables
Ouch! Our Data Type Choices Did THAT?
SQL Server 2016 JSON Support FOR Data Warehousing
Blazing-Fast Performance:
Migrating a Disk-based Table to a Memory-optimized one in SQL Server
Proper DataType Usage = Guaranteed Better Performance and Accuracy
What is your Character Data Type?
Introduction to columnstore indexes
20 Questions with Azure SQL Data Warehouse
11/29/2018 © 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks.
Steve Hood SimpleSQLServer.com
TechEd /2/2018 7:32 AM © 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks.
A JSON’s Journey through SQL Server
Microsoft SQL Server 2014 for Oracle DBAs Module 7
The Five Ws of Columnstore Indexes
Data Types Do Matter Start local instance of SQL Start ZoomIt
Adding Lightness Better Performance through Compression
Sunil Agarwal | Principal Program Manager
Four Rules For Columnstore Query Performance
Clustered Columnstore Indexes (SQL Server 2014)
Using Columnstore indexes in Azure DevOps Services. Lessons learned
Using Columnstore indexes in Azure DevOps Services. Lessons learned
SQL Server Columnar Storage
Presentation transcript:

INTRODUCING SQL SERVER 2012 COLUMNSTORE INDEXES Exploring and Managing SQL Server 2012 Database Engine Improvements

IMPROVED DATA WAREHOUSE QUERY PERFORMANCE Columnstore indexes provide an easy way to significantly improve data warehouse and decision support query performance against very large data sets Performance improvements for “typical” data warehouse queries from 10x to 100x Ideal candidates include queries against star schemas that use filtering, aggregations and grouping against very large fact tables 2

WHAT HAPPENS WHEN … You need to execute high performance DW queries against very large data sets? In SQL Server 2008 and SQL Server 2008 R2 o OLAP (SSAS) MDX solution o ROLAP and T-SQL + intermediate summary tables, indexed views and aggregate tables – Inherently inflexible In SQL Server 2012 o You can create a columnstore index on a very large fact table referencing all columns with supporting data types – Utilizing T-SQL and core Database Engine functionality – Minimal query refactoring or intervention o Upon creating the columnstore index, your table becomes “read only” – but you can still use partitioning to switch in and out data OR drop/rebuild indexes periodically 3

HOW ARE THESE PERFORMANCE GAINS ACHIEVED? Two complimentary technologies: Storage o Data is stored in a compressed columnar data format (stored by column) instead of row store format (stored by row). – Columnar storage allows for less data to be accessed when only a sub-set of columns are referenced – Data density/selectivity determines how compression friendly a column is – example “State” / “City” / “Gender” – Translates to improved buffer pool memory usage New “batch mode” execution o Data can then be processed in batches (1,000 row blocks) versus row-by-row o Depending on filtering and other factors, a query may also benefit by “segment elimination” - bypassing million row chunks (segments) of data, further reducing I/O 4

COLUMN VS. ROW STORE Row Store (Heap / B-Tree) ProductIDOrderDateCost Column Store (values compressed) 5 data page 1000 ProductIDOrderDateCost data page 1001 ProductID data page 2001 OrderDate … … … … … … … … data page 2000 data page 2002 Cost

BATCH MODE Allows processing of 1,000 row blocks as an alternative to single row- by-row operations Enables additional algorithms that can reduce CPU overhead significantly Batch mode “segment” is a partition broken into million row chunks with associated statistics used for Storage Engine filtering Batch mode can work to further improve query performance of a columnstore index, but this mode isn’t always chosen: Some operations aren’t enabled for batch mode: o E.g. outer joins to columnstore index table / joining strings / NOT IN / IN / EXISTS / scalar aggregates Row mode might be used if there is SQL Server memory pressure or parallelism is unavailable Confirm batch vs. row mode by looking at the graphical execution plan 6

COLUMNSTORE FORMAT + BATCH MODE VARIATIONS Performance gains can come from a combination of: Columnstore indexing alone + traditional row mode in QP Columnstore indexing + batch mode in QP Columnstore indexing + hybrid of batch and traditional row mode in QP 7

CREATING A COLUMNSTORE INDEX T-SQL SSMS 8

GOOD CANDIDATES FOR COLUMNSTORE INDEXING Table candidates: Very large fact tables (for example – billions of rows) Larger dimension tables (millions of rows) with compression friendly column data If unsure, it is easy to create a columnstore index and test the impact on your query workload Query candidates (against table with a columnstore index): Scan versus seek (columnstore indexes don’t support seek operations) Aggregated results far smaller than table size Joins to smaller dimension tables Filtering on fact / dimension tables – star schema pattern Sub-set of columns (being selective in columns versus returning ALL columns) Single-column joins between columnstore indexed table and other tables 9

DEFINING THE COLUMNSTORE INDEX Index type Columnstore indexes are always non-clustered and non-unique They cannot be created on views, indexed views, sparse columns They cannot act as primary or foreign key constraints Column selection Unlike other index types, there are no “key columns” o Instead you choose the columns that you anticipate will be used in your queries o Up to 1,024 columns – and the ordering in your CREATE INDEX doesn’t matter o No concept of “INCLUDE” o No 900 byte index key size limit Column ordering Use of ASC or DESC sorting not allowed – as ordering is defined via columnstore compression algorithms 10

SUPPORTED DATA TYPES Supported data types Char / nchar / varchar / nvarchar o (max) types, legacy LOB types and FILESTREAM are not supported Decimal/numeric o Precision greater than 18 digits NOT supported Tinyint, smallint, int, bigint Float/real Bit Money, smallmoney Date and time data types o Datetimeoffset with scale > 2 NOT supported 11

MAINTAINING DATA IN A COLUMNSTORE INDEX Once built, the table becomes “read-only” and INSERT/UPDATE/DELETE/MERGE is no longer allowed ALTER INDEX REBUILD / REORGANIZE not allowed Other options are still supported: Partition switches (IN and OUT) Drop columnstore index / make modifications / add columnstore index UNION ALL (but be sure to validate performance) 12

LIMITATIONS Columnstore indexes cannot be used in conjunction with Change Data Capture and Change Tracking Filestream columns (supported columns from same table are supported) Page, row and vardecimal storage compression Replication Sparse columns Data type limitations Binary / varbinary / ntext / text / image / varchar (max) / nvarchar (max) / uniqueidentifier / rowversion / sql_variant / decimal or numeric with precesion > 18 digits / CLR types / hierarchyid / xml / datetimeoffset with scale > 2 You can prevent a query from using the columnstore index using the IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX query hint 13

ACCELERATING DATA WAREHOUSE QUERIES WITH SQL SERVER 2012 COLUMNSTORE INDEXES Demo 14

SUMMARY SQL Server 2012 offers significantly faster query performance for data warehouse and decision support scenarios 10x to 100x performance improvement depending on the schema and query o I/O reduction and memory savings through columnstore compressed storage o CPU reduction with batch versus row processing, further I/O reduction if segmentation elimination occurs Easy to deploy and requires less management than some legacy ROLAP or OLAP methods o No need to create intermediate tables, aggregates, pre-processing and cubes Interoperability with partitioning For the best interactive end-user BI experience, consider Analysis Services, PowerPivot and Crescent 15