Published by Handoko Kartawijaya. Modified over 6 years ago.
2
A developer's guide to Azure SQL Data Warehouse
James Rowland-Jones (JRJ)
Microsoft Build 2016, 11/23/2018 3:39 AM
© 2016 Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
3
IoT Scenario
Diagram labels: Persist, Export, Query, Stream, Import, Write
4
What is SQL DW? (and when to use it)
5
Analytical workloads
- Store large volumes of data
- Consolidate disparate data into a single location
- Shape, model, transform and aggregate data
- Perform query analysis across large datasets
- Ad-hoc reporting across large data volumes
- All using simple SQL constructs
- "SQL on SQL"
6
Unsuitable workloads
- Operational workloads (OLTP)
- High frequency reads & writes
- Large numbers of singleton selects
- High volumes of single row inserts
- Procedural ETL
- Row by row processing needs
- Incompatible formats (JSON, XML)
7
Logical Overview
Diagram labels: Control, Compute, Storage
8
Distributed queries
Diagram labels: Query, Result, Control, Compute, Storage
9
Fully managed PaaS
10
Geo-redundant
11
Connectivity
- Windows or Linux
- ODBC
- JDBC
- ADO.NET
- PHP
12
Summary
- Scale-out distributed query engine
- De-coupled storage from compute
- Fully managed
- Completely elastic
- Platform as a Service (PaaS)
- Petabyte scale
- Leveraging cloud ecosystem
- Broad range of connectivity options
13
Provisioning SQL DW
14
Demo: Provisioning
15
Summary
Full provisioning experience
- PowerShell
- Azure portal
- REST API
Partial provisioning experience
- T-SQL CREATE DATABASE
- T-SQL scale with ALTER DATABASE
- sys.database_service_objectives shows the current configuration
- sys.dm_operation_status shows the progress of provisioning operations
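The T-SQL route named on this slide can be sketched roughly as follows. This is a minimal Python sketch that only composes the statements as text and never connects to a server. The database name `MySqlDw`, the helper names, and the chosen DWU values are illustrative assumptions, not taken from the slides.

```python
import re

# Hypothetical helpers: they only build T-SQL text and never connect to a server.
_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_DWU = re.compile(r"^DW[0-9]+c?$")


def _check(name: str, dwu: str) -> None:
    """Reject anything that is not a plain identifier or a DWnnn objective."""
    if not _NAME.match(name):
        raise ValueError(f"unexpected database name: {name!r}")
    if not _DWU.match(dwu):
        raise ValueError(f"unexpected service objective: {dwu!r}")


def create_statement(name: str, dwu: str) -> str:
    """CREATE DATABASE for a SQL DW database at the given DWU setting."""
    _check(name, dwu)
    return (f"CREATE DATABASE {name} "
            f"(EDITION = 'datawarehouse', SERVICE_OBJECTIVE = '{dwu}');")


def scale_statement(name: str, dwu: str) -> str:
    """ALTER DATABASE to scale an existing database to a new DWU setting."""
    _check(name, dwu)
    return f"ALTER DATABASE {name} MODIFY (SERVICE_OBJECTIVE = '{dwu}');"


# The catalog view and DMV named on the slide.
CURRENT_CONFIG = "SELECT * FROM sys.database_service_objectives;"
OPERATION_PROGRESS = "SELECT * FROM sys.dm_operation_status;"

print(create_statement("MySqlDw", "DW100"))
print(scale_statement("MySqlDw", "DW400"))
```

Validating the name and objective before formatting keeps the sketch from building arbitrary SQL out of untrusted input.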
16
Designing tables
17
HASH DISTRIBUTION
Diagram: incoming rows pass through HASH( ) and are assigned to one of 60 distributions, numbered 01 to 60.
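The idea in this diagram can be sketched in a few lines: the value of a row's distribution key is hashed, and the hash decides which of the 60 distributions receives the row. This is a minimal sketch. The MD5-based hash is a deterministic stand-in and an assumption, not the hash function the service uses internally, and `distribution_for` and the sample keys are hypothetical.

```python
import hashlib
from collections import Counter

DISTRIBUTIONS = 60  # distribution count shown on the slides


def distribution_for(key) -> int:
    """Map a distribution-key value to a distribution number in 1..60.

    MD5 is only a deterministic stand-in; the service's real hash differs.
    """
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % DISTRIBUTIONS + 1


# The same key always lands in the same distribution, so rows that
# share a key value are stored together.
assert distribution_for("customer-42") == distribution_for("customer-42")

# Many distinct keys spread across all distributions; a key with only
# three distinct values can occupy at most three of them (skew).
even = Counter(distribution_for(k) for k in range(60_000))
skewed = Counter(distribution_for(k % 3) for k in range(60_000))
print(len(even), min(even.values()), max(even.values()))
print(len(skewed))
```

The second counter shows why a low-cardinality column is a poor distribution key: most distributions stay empty while a few hold all the rows.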
18
Demo: Table design
19
Sizing for partitioning

Factors               1 TiB Scenario   50 TiB Scenario
Size of dataset       1 TiB            50 TiB
Distribution Count    60               60
Compression Ratio     5                5
Skew?                 No               No
GB per distribution   3.3              166.7
# Partitions          36               36
GB per partition      0.092            4.62
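The arithmetic behind these figures can be reproduced with a short sketch. It assumes no skew, as the slide does, and it assumes 1 TiB is treated as 1000 GB, because that is the conversion that matches the slide's 3.3 and 166.7 values. The function name and parameters are illustrative.

```python
def gb_per_partition(dataset_tib: float,
                     distributions: int = 60,
                     compression_ratio: float = 5.0,
                     partitions: int = 36,
                     gb_per_tib: float = 1000.0) -> tuple[float, float]:
    """Return (GB per distribution, GB per partition) after compression.

    gb_per_tib=1000 is an assumption: it is the conversion that reproduces
    the slide's 3.3 and 166.7 figures (1024 would give about 3.4 and 170.7).
    """
    per_distribution = dataset_tib * gb_per_tib / distributions / compression_ratio
    return per_distribution, per_distribution / partitions


for size in (1, 50):
    dist_gb, part_gb = gb_per_partition(size)
    print(f"{size} TiB: {dist_gb:.1f} GB per distribution, "
          f"{part_gb:.3f} GB per partition")
```

The per-partition results come out slightly above the slide's 0.092 and 4.62, which suggests the slide truncated those values instead of rounding them.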
20
Guidance
- Keep column definitions strongly typed
- Distribution key is read only
- Aim for a NOT NULL definition for the distribution key
- Ensure columns are defined consistently
- Especially true for the distribution key
- Hash distribution optimises data layout
- JOIN and GROUP BY columns are often the best candidates
- Partition for data management
- Consider columns used in the WHERE clause
- Evaluate query date ranges (month, quarter) as part of the partitioning strategy
21
Loading data
22
Demo: Loading Data
23
Trickle loading guidance
- >= 102,400 rows per distribution = Columnstore
- < 102,400 rows per distribution = Row storage
- Assuming even distribution: 6,144,000+ rows required in Bulk Insert

Rows/Sec                          500           1000        2000
Load threshold exceeded (hours)   < 3.5 hours   < 2 hours   < 1 hour
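The thresholds on this slide follow from simple arithmetic: 102,400 rows in each of 60 distributions gives 6,144,000 rows, and dividing by the load rate gives the time needed to reach that total. A minimal sketch, assuming perfectly even distribution and a steady load rate; the function name is illustrative.

```python
ROWS_PER_DISTRIBUTION = 102_400  # columnstore threshold from the slide
DISTRIBUTIONS = 60


def hours_to_threshold(rows_per_second: float) -> float:
    """Hours of steady loading before every distribution holds 102,400 rows."""
    total_rows = ROWS_PER_DISTRIBUTION * DISTRIBUTIONS  # 6,144,000
    return total_rows / rows_per_second / 3600


for rate in (500, 1000, 2000):
    print(f"{rate} rows/sec: {hours_to_threshold(rate):.2f} hours")
```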
24
Loading compressed text
Guidance
- Evenly split the data into multiple files
- One file per reader

DWU      Readers          Writers
DW100    8                60 (the only Writers value in the transcript)
DW200    16
DW300    24
DW400    32
DW500    40
DW600    48
DW1000   (not captured)
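The "one file per reader" guidance can be sketched as a round-robin split whose chunk count comes from the reader table. This is a minimal sketch working on in-memory lines. `READERS` copies only the values captured from the slide, and `split_evenly` and the sample rows are hypothetical.

```python
# Reader counts per DWU as listed on the slide (DW1000 was not captured).
READERS = {"DW100": 8, "DW200": 16, "DW300": 24,
           "DW400": 32, "DW500": 40, "DW600": 48}


def split_evenly(lines: list[str], dwu: str) -> list[list[str]]:
    """Deal lines round-robin into one chunk per reader for the given DWU."""
    n_files = READERS[dwu]
    chunks: list[list[str]] = [[] for _ in range(n_files)]
    for i, line in enumerate(lines):
        chunks[i % n_files].append(line)
    return chunks


rows = [f"{i},value-{i}" for i in range(1000)]
chunks = split_evenly(rows, "DW100")
print(len(chunks), [len(c) for c in chunks])
```

Each chunk would then be written out and compressed as its own file, so that every reader has one file of roughly equal size to work on.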
25
Other loading methods
- Azure Data Factory
- SSIS
- bcp
- 3rd party data loading tools
26
Resources
- PolyBase loading: http://aka.ms/acom-polybase-load
- Re-visit Build on Channel 9.
- Continue your education at Microsoft Virtual Academy online.