SQL Server Parallel Data Warehouse: Supporting Large Scale Analytics José Blakeley, Software Architect Database Systems Group, Microsoft Corporation.


SQL Server PDW Overview (JHU DIR, March 18, 2011)

Workload Types

 Online Transaction Processing (OLTP)
  Balanced read-update ratio (60%-40%)
  Fine-grained inserts and updates
  High transaction throughput, e.g., tens of thousands per second
  Usually very short transactions, e.g., touching 1-3 tables
  Sometimes multi-step, e.g., financial
  Relatively small data sizes, e.g., a few TBs
 Data Warehousing and Business Analysis (DW)
  Read-mostly (90%-10%)
  Few updates in place; high-volume bulk inserts
  Concurrent query throughput, e.g., tens of thousands per hour
  Per-query response time < 2 s
  Snowflake and star schemas are common, e.g., 5-10 tables
  Complex queries (filter, join, group-by, aggregation)
  Very large data sizes, e.g., tens of TB to PB

OLTP serves the day-to-day business; DW serves analysis over historical data.

SQL Server Parallel Data Warehouse

 Shared-nothing, distributed, parallel DBMS
  Built-in data and query partitioning
  Provides a single system view over a cluster of SQL Servers
 Appliance concept
  Software + hardware solution
  Choice of hardware vendors (e.g., HP, Dell, NEC)
 Optimized for DW workloads
  Bulk loads (1.2 – 2.0 TB/hr)
  Sequential scans (700 TB in 3 hr)
 Scales from 10 terabytes to petabytes
  1 rack manages ~40 TB
  1 PB needs ~25 racks
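
The data-partitioning idea behind the shared-nothing design can be sketched in a few lines: every row of a distributed table is assigned to a compute node by hashing its distribution column. This is a minimal illustration only; `node_for` is an invented helper, and PDW's real hash function and node map are internal to the appliance.

```python
NUM_NODES = 10  # e.g., compute nodes in a single rack

def node_for(key, num_nodes=NUM_NODES):
    # Map a distribution-column value to the compute node that owns it.
    return hash(key) % num_nodes

# Distribute customer rows by c_custkey, mirroring the later DDL
# WITH (distribution = hash(c_custkey)).
rows = [{"c_custkey": k, "c_name": "Customer#%d" % k} for k in range(1000)]
nodes = [[] for _ in range(NUM_NODES)]
for row in rows:
    nodes[node_for(row["c_custkey"])].append(row)
```

Because the same key always hashes to the same node, a join on the distribution column of two tables distributed the same way needs no data movement at all.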

Hardware Architecture

[Diagram of a 2-rack appliance: compute nodes plus an active/passive spare compute node, connected by dual InfiniBand and dual Fibre Channel; control nodes; client drivers (ODBC, OLE DB, ADO.NET); ETL load interface; corporate backup solution; data center monitoring.]

Software Architecture

[Diagram: a control node running the PDW engine, SQL Server (DW authentication, DW configuration, DW schema, TempDB), the Data Movement Service, and an IIS admin console; compute nodes each running SQL Server (user data) and the Data Movement Service; a landing zone node; clients connect via query tools, MS BI (AS, RS), 3rd-party tools, DWSQL, Internet Explorer, and data access APIs (OLE DB, ODBC, ADO.NET, JDBC).]

Key Software Functionality

 PDW Engine
  Provides a single system image
  SQL compilation
  Global metadata and appliance configuration
  Global query optimization and plan generation
  Global query execution coordination
  Global transaction coordination
  Authentication and authorization
  Supportability (HW and SW status info via DMVs)
 Data Movement Service
  Data movement across the appliance
  Distributed query execution operators
 Parallel Loader
  Runs from the Landing Zone
  SSIS or command-line tool
 Parallel Database Copy
  High-performance data export
  Enables hub-and-spoke scenarios
 Parallel Backup/Restore
  Backup files stored on backup nodes
  Backup files may be archived to an external device/system

Query Processing

 SQL statement compilation
  Parsing, validation, optimization
 Builds an MPP execution plan
  A sequence of discrete parallel QE "steps"
  Steps involve SQL queries executed by SQL Server at each compute node, as well as data movement steps
 Executes the plan
  Coordinates workflow among steps
  Assembles the result set
  Returns the result set to the client
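
The plan-execution flow above can be modeled as a loop over steps: each SQL step fans out to every compute node, and a final gather step assembles the result set at the control node. This is an illustrative model only; `FakeComputeNode` and `run_plan` are invented names, not PDW internals.

```python
class FakeComputeNode:
    """Stand-in for a compute node holding one slice of a distributed table."""
    def __init__(self, rows):
        self.rows = rows

    def execute(self, predicate):
        # Stand-in for running a SQL step locally: filter this node's rows.
        return [r for r in self.rows if predicate(r)]

def run_plan(nodes, steps):
    partial = None
    for kind, op in steps:
        if kind == "sql":
            # The same step runs on every compute node in parallel.
            partial = [n.execute(op) for n in nodes]
        elif kind == "gather":
            # Final step: the control node assembles the result set.
            partial = op([row for part in partial for row in part])
    return partial

# Four nodes, each holding (key, value) rows; keys striped across nodes.
nodes = [FakeComputeNode([(i, i * 10) for i in range(n, 100, 4)])
         for n in range(4)]
plan = [
    ("sql", lambda r: r[1] > 500),  # local filter on each node
    ("gather", sorted),             # control node sorts the combined rows
]
result = run_plan(nodes, plan)
```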

Example DW Schema

[TPC-H-style schema diagram with table cardinalities: 6,000,048,306 rows; 4,500,000,000 rows; 2,400,000,000 rows; 600,000,000 rows; 450,000,000 rows; 30,000,000 rows; 25 rows; 5 rows.]

SELECT TOP 10 L_ORDERKEY,
       SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT)) AS REVENUE,
       O_ORDERDATE, O_SHIPPRIORITY
FROM CUSTOMER, ORDERS, LINEITEM
WHERE C_MKTSEGMENT = 'BUILDING'
  AND C_CUSTKEY = O_CUSTKEY
  AND L_ORDERKEY = O_ORDERKEY
  AND O_ORDERDATE < '…'
  AND L_SHIPDATE > '…'
GROUP BY L_ORDERKEY, O_ORDERDATE, O_SHIPPRIORITY
ORDER BY REVENUE DESC, O_ORDERDATE

Example – Schema TPC-H

-- Customer table, distributed on c_custkey
CREATE TABLE customer (
    c_custkey    bigint,
    c_name       varchar(25),
    c_address    varchar(40),
    c_nationkey  integer,
    c_phone      char(15),
    c_acctbal    decimal(15,2),
    c_mktsegment char(10),
    c_comment    varchar(117))
WITH (distribution = hash(c_custkey));

-- Orders table, distributed on o_orderkey
CREATE TABLE orders (
    o_orderkey      bigint,
    o_custkey       bigint,
    o_orderstatus   char(1),
    o_totalprice    decimal(15,2),
    o_orderdate     date,
    o_orderpriority char(15),
    o_clerk         char(15),
    o_shippriority  integer,
    o_comment       varchar(79))
WITH (distribution = hash(o_orderkey));

-- LineItem table, distributed on l_orderkey
CREATE TABLE lineitem (
    l_orderkey      bigint,
    l_partkey       bigint,
    l_suppkey       bigint,
    l_linenumber    bigint,
    l_quantity      decimal(15,2),
    l_extendedprice decimal(15,2),
    l_discount      decimal(15,2),
    l_tax           decimal(15,2),
    l_returnflag    char(1),
    l_linestatus    char(1),
    l_shipdate      date,
    l_commitdate    date,
    l_receiptdate   date,
    l_shipinstruct  char(25),
    l_shipmode      char(10),
    l_comment       varchar(44))
WITH (distribution = hash(l_orderkey));

Example – Query

SELECT TOP 10 L_ORDERKEY,
       SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT)) AS REVENUE,
       O_ORDERDATE, O_SHIPPRIORITY
FROM CUSTOMER, ORDERS, LINEITEM
WHERE C_MKTSEGMENT = 'BUILDING'
  AND C_CUSTKEY = O_CUSTKEY
  AND L_ORDERKEY = O_ORDERKEY
  AND O_ORDERDATE < '…'
  AND L_SHIPDATE > '…'
GROUP BY L_ORDERKEY, O_ORDERDATE, O_SHIPPRIORITY
ORDER BY REVENUE DESC, O_ORDERDATE

The ten largest "building" orders shipped since March 5.

Example – Execution Plan

Step 1: Create a temp table at the control node.
CREATE TABLE [tempdb].[dbo].[Q_[TEMP_ID_664]] (
    [l_orderkey]     BIGINT,
    [REVENUE]        DECIMAL(38, 4),
    [o_orderdate]    DATE,
    [o_shippriority] INTEGER);

Step 2: Create temp tables at all compute nodes.
CREATE TABLE [tempdb].[dbo].[Q_[TEMP_ID_665]_[PARTITION_ID]] (
    [l_orderkey]      BIGINT,
    [l_extendedprice] DECIMAL(15, 2),
    [l_discount]      DECIMAL(15, 2),
    [o_orderdate]     DATE,
    [o_shippriority]  INTEGER,
    [o_custkey]       BIGINT,
    [o_orderkey]      BIGINT)
WITH (DISTRIBUTION = HASH([o_custkey]));

Step 3: SHUFFLE_MOVE.
SELECT [l_orderkey], [l_extendedprice], [l_discount], [o_orderdate],
       [o_shippriority], [o_custkey], [o_orderkey]
FROM [dwsys].[dbo].[orders]
JOIN [dwsys].[dbo].[lineitem] ON ([l_orderkey] = [o_orderkey])
WHERE ([o_orderdate] < '…' AND [o_orderdate] >= '…')
INTO Q_[TEMP_ID_665]_[PARTITION_ID] SHUFFLE ON (o_custkey);

Step 4: PARTITION_MOVE.
SELECT [l_orderkey],
       SUM(([l_extendedprice] * (1 - [l_discount]))) AS REVENUE,
       [o_orderdate], [o_shippriority]
FROM [dwsys].[dbo].[customer]
JOIN tempdb.Q_[TEMP_ID_665]_[PARTITION_ID] ON ([c_custkey] = [o_custkey])
WHERE [c_mktsegment] = 'BUILDING'
GROUP BY [l_orderkey], [o_orderdate], [o_shippriority]
INTO Q_[TEMP_ID_664];

Step 5: Drop the temp tables at all compute nodes.
DROP TABLE tempdb.Q_[TEMP_ID_665]_[PARTITION_ID];

Step 6: Return the result to the client.
SELECT TOP 10 [l_orderkey], SUM([REVENUE]) AS REVENUE,
       [o_orderdate], [o_shippriority]
FROM tempdb.Q_[TEMP_ID_664]
GROUP BY [l_orderkey], [o_orderdate], [o_shippriority]
ORDER BY [REVENUE] DESC, [o_orderdate];

Step 7: Drop the temp table at the control node.
DROP TABLE tempdb.Q_[TEMP_ID_664];

Data Movement Operations

 SHUFFLE_MOVE
  Distributed-to-distributed data exchange across the appliance
  Result is a distributed table hashed on some column
 PARTITION_MOVE
  Union of distributed partitions across compute nodes into a table on the control node
 MASTER_MOVE
  Replicates a table from the control node to all compute nodes
 BROADCAST_MOVE
  Distributed-to-replicated data exchange across the appliance
  Unconditional shuffle to all compute nodes
  Combines PARTITION_MOVE and MASTER_MOVE in one step
 TRIM_MOVE
  Distributes a replicated table by trimming each copy
  Since every node holds the same copy of a replicated table, each node keeps only the values that belong to its own distributions
 REPLICATE_MOVE
  Moves a replicated table from 1 to N compute nodes
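
The SHUFFLE_MOVE idea (redistribute intermediate rows on a new hash column so that the next join is co-located) can be sketched as follows. `shuffle_move` is an invented name; this is not the Data Movement Service's actual protocol, just the redistribution logic.

```python
NUM_NODES = 4

def shuffle_move(per_node_rows, key):
    # Every node rehashes its rows on `key` and sends each row to the
    # node that owns the corresponding hash bucket.
    out = [[] for _ in range(NUM_NODES)]
    for rows in per_node_rows:
        for row in rows:
            out[hash(row[key]) % NUM_NODES].append(row)
    return out

# Intermediate orders-lineitem rows, initially distributed on o_orderkey,
# reshuffled on o_custkey so the join with customer needs no further movement.
before = [[{"o_orderkey": k, "o_custkey": k % 7}
           for k in range(n, 100, NUM_NODES)]
          for n in range(NUM_NODES)]
after = shuffle_move(before, "o_custkey")
```

After the shuffle, all rows sharing an o_custkey value sit on the same node, which is exactly what Step 3 of the execution plan above achieves for the customer join in Step 4.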

Customer Experience

 Query speed is generally in the ballpark of the mainstream competition, and sometimes much faster
 Mixed-workload handling is good: concurrent multi-user queries, and loads running alongside queries
 Customers like remote table copy: a mechanism to export entire data marts to SMP SQL Server
 Fast time to solution: power up, create databases, define tables, load, and query

Microsoft Column-Store Technology: VertiPaq and VertiScan

In-memory BI (IMBI). Slides by Amir Netz.

In-Memory BI Technology

 Developed by the SQL Analysis Services (OLAP) team
 Column-based storage and processing
  Only touch the columns needed for the query
 Compression (VertiPaq)
  Columnar data is more compressible than row data
 Fast in-memory processing (VertiScan)
  Filter, grouping, aggregation, sorting
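
The "only touch the columns needed" point is the heart of column-based processing: with each column in its own array, an aggregate over two columns never reads the rest. A small illustration with hypothetical data (not a VertiScan API):

```python
# Each column lives in its own array; rows are reconstructed only by index.
columns = {
    "l_extendedprice": [100.0, 200.0, 300.0, 400.0],
    "l_discount":      [0.10, 0.00, 0.05, 0.20],
    "l_comment":       ["a", "b", "c", "d"],  # never touched below
}

# SUM(l_extendedprice * (1 - l_discount)) reads exactly two columns;
# l_comment stays untouched (and, on disk, unread).
revenue = sum(p * (1 - d)
              for p, d in zip(columns["l_extendedprice"],
                              columns["l_discount"]))
```

In a row store, the same query would drag every column of every row through memory; here the scan cost is proportional only to the columns referenced.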

How VertiPaq Compression Works

 Phase I: Encoding (per column)
  Read raw data and organize it by columns
  Dictionary encoding, value encoding, bit packing
  Converts everything to a uniform representation (integer vectors)
 Phase II: Compression (per segment of 8M rows)
  Run-length encoding (RLE) and hybrid RLE, chosen by compression analysis
  Minimizes storage space
  Size reductions range from 1x – 2x and 2x – 10x for the encodings up to ~100x for RLE; RLE typically covers the repetitive 75%-95% of the data, while the remaining 5%-25% compresses far less (overall 2x – 4x)
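
The two stages can be made concrete: dictionary encoding maps column values to small integer ids, and RLE then collapses runs of repeated ids into (id, length) pairs. This is a toy sketch of the technique, not VertiPaq's actual on-disk formats.

```python
def dictionary_encode(values):
    # Replace each value with a small integer id; the dictionary holds
    # each distinct value once, in order of first appearance.
    dictionary, ids, index = [], [], {}
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        ids.append(index[v])
    return dictionary, ids

def rle(ids):
    # Collapse runs of repeated ids into [id, run_length] pairs.
    runs = []
    for i in ids:
        if runs and runs[-1][0] == i:
            runs[-1][1] += 1
        else:
            runs.append([i, 1])
    return runs

# A repetitive column (e.g., a ship-mode attribute) compresses well:
shipmodes = ["AIR", "AIR", "AIR", "SHIP", "SHIP", "RAIL"]
dictionary, ids = dictionary_encode(shipmodes)
runs = rle(ids)  # [[0, 3], [1, 2], [2, 1]]
```

Sorted or naturally clustered columns produce long runs, which is why RLE can reach the very high ratios quoted above on the repetitive portion of the data.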

Star Join Schema (APPX, ~1 TB)

[Star join schema diagram with table cardinalities: 436,892,631 rows; 13,517 rows; 118 rows; 41 rows; 34 rows.]

SELECT FE0.RegionName, FL0.FiscalYearName, SUM(A.ActualRevenueAmt)
FROM TECSPURSL00 A
JOIN SalesDate L ON A.SalesDateID = L.SalesDateID
JOIN UpperGeography UG ON A.TRCreditedSubsidiaryId = UG.CreditedSubsidiaryID
JOIN Region FE0 ON UG.CreditedRegionID = FE0.RegionID
JOIN FiscalYear FL0 ON L.FiscalYearID = FL0.FiscalYearID
GROUP BY FE0.RegionName, FL0.FiscalYearName

Column-Store on APPX

 Response times < 2 s are common
 Smaller variance in response time, i.e., more predictable query performance

THANKS!