1
Azure SQL DWH: Tips and Tricks for developers
Sergiy Lunyakin
2
Sponsors!
3
About me
I’m Ukrainian
DWH/BI Consultant at ITMagination
Data Platform MVP, MCSE BI, MCSA Cloud Platform
Leader of
Speaker at SQL Conferences
Organizer of SQLSaturday Lwow
Contacts: @slunyakin
4
Agenda
What is Azure SQL DW
Architecture of Azure SQL DW
Limitations
Check compatibility
Handling cross-database query
Handling Identity
Handling ANSI joins in Update/Delete/Merge/SCD
Handling computed columns
Handling Cursor
5
What is Azure SQL DW
Microsoft Azure Platform as a Service
It’s a Massively Parallel Processing (MPP) system
Distributed compute and distributed storage
Scale up and down in several minutes
Pause compute resources
Supports a subset of T-SQL
Join with external data in Azure Blob Storage/Data Lake
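Scaling can be driven from T-SQL. The following is a minimal sketch, not a definitive procedure: the database name MySqlDw and the target level DW400 are assumptions made for illustration (the levels actually available depend on the service generation), and the statements are meant to be run while connected to the master database of the logical server. Pausing is done outside T-SQL, for example from the Azure portal.

```sql
-- Hypothetical database name and service objective; run against master.
ALTER DATABASE MySqlDw
MODIFY (SERVICE_OBJECTIVE = 'DW400');

-- Check the current service objective of each database on the server.
SELECT db.name, ds.edition, ds.service_objective
FROM sys.database_service_objectives AS ds
JOIN sys.databases AS db
    ON ds.database_id = db.database_id;
```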
6
Architecture of Azure SQL DW
[Diagram: distribution databases Dist_DB_1 to Dist_DB_60, shown in groups of 15 (Dist_DB_1 to Dist_DB_15, Dist_DB_16 to Dist_DB_30, …, Dist_DB_46 to Dist_DB_60)]
7
Logical Overview
[Diagram: Control, Compute, Storage]
Source: Microsoft Build 2016. © 2016 Microsoft Corporation. All rights reserved.
8
Distributions
A distribution is a SQL database that stores one or more distributed tables.
Table data is split into 60 buckets spread across the compute nodes.
* Hash-distributed table
* Round-robin distributed table
* Replicated table (a new type of table)
Selecting the right distribution method is key for good performance.
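The three table types can be sketched as CREATE TABLE statements. The table and column names below are hypothetical and chosen only for illustration; the index choices are one reasonable option, not a recommendation from the slides.

```sql
-- Hash-distributed: each row is assigned to one of the 60 distributions
-- by hashing the chosen column (here ProductKey).
CREATE TABLE dbo.FactSales_hash
(
    SalesKey    BIGINT        NOT NULL,
    ProductKey  INT           NOT NULL,
    SalesAmount DECIMAL(19,4) NOT NULL
)
WITH (DISTRIBUTION = HASH(ProductKey), CLUSTERED COLUMNSTORE INDEX);

-- Round-robin: rows are spread evenly across distributions with no
-- distribution column; a common choice for staging tables.
CREATE TABLE dbo.StageSales_rr
(
    SalesKey    BIGINT        NOT NULL,
    ProductKey  INT           NOT NULL,
    SalesAmount DECIMAL(19,4) NOT NULL
)
WITH (DISTRIBUTION = ROUND_ROBIN, HEAP);

-- Replicated: a full copy of the table is made available on each
-- compute node; suited to small dimension tables.
CREATE TABLE dbo.DimProduct_rep
(
    ProductKey  INT           NOT NULL,
    ProductName NVARCHAR(100) NOT NULL
)
WITH (DISTRIBUTION = REPLICATE, CLUSTERED INDEX (ProductKey));
```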
9
Distributed queries
[Diagram: Query and Result pass through Control; Compute; Storage]
Scale-out distributed query engine
10
Distributed Query
Query sent to the Control node: SELECT COUNT_BIG(*) FROM dbo.[FactInternetSales] ;
Each Compute node runs the same statement against its own distributions: SELECT COUNT_BIG(*) FROM dbo.[FactInternetSales] ;
The Control node sums the partial counts into the final result (the slide shows this step as SELECT SUM(*), which is illustrative only and not valid T-SQL).
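To see how such a query is broken into distributed steps, and how the rows are spread over the distributions, something like the following can be used. This is a sketch that assumes the dbo.FactInternetSales table from the slide exists; the output is not reproduced here.

```sql
-- Returns the distributed query plan as XML without executing the query.
EXPLAIN
SELECT COUNT_BIG(*) FROM dbo.[FactInternetSales];

-- Reports rows and space used per distribution for the table,
-- which helps to spot data skew.
DBCC PDW_SHOWSPACEUSED("dbo.FactInternetSales");
```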
11
Limitations
Primary/Foreign Keys
Identity
Computed Columns
Triggers
Cross-database joins
Sequences
Cursors
MERGE
ANSI joins on updates/deletes
More limitations:
12
Check compatibility
Data warehouse migration utility
Free tool
Helps to identify unsupported features
Helps to identify HASH distribution column
Migrate schema
Migrate data (BCP tool)
13
Cross-database query
Azure SQL DW doesn’t support cross-database queries.
Use an ELT approach.
Use separate schemas instead of separate databases.
Use external tables as staging tables.
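A minimal sketch of the external-table staging pattern follows. It assumes that an external data source MyBlobStorage and an external file format MyTextFormat have already been created, and that the schemas stg and sales and the table sales.Orders exist; all of these names, the folder path, and the columns are hypothetical.

```sql
-- External table over files in blob storage (no data is copied yet).
CREATE EXTERNAL TABLE stg.Customer_ext
(
    CustomerId   INT           NOT NULL,
    CustomerName NVARCHAR(100) NULL
)
WITH
(
    LOCATION    = '/customer/',
    DATA_SOURCE = MyBlobStorage,   -- assumed to exist
    FILE_FORMAT = MyTextFormat     -- assumed to exist
);

-- Load the external data into a staging table with CTAS (the "L" of ELT).
CREATE TABLE stg.Customer
WITH (DISTRIBUTION = ROUND_ROBIN, HEAP)
AS
SELECT CustomerId, CustomerName
FROM stg.Customer_ext;

-- What used to be a cross-database join becomes a cross-schema join.
SELECT c.CustomerName, COUNT_BIG(*) AS OrderCount
FROM sales.Orders AS o
JOIN stg.Customer AS c
    ON c.CustomerId = o.CustomerId
GROUP BY c.CustomerName;
```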
14
CTAS
CTAS is a super-charged version of SELECT...INTO
Parallelized
Better for:
Data import
Data copy
Workarounds
CREATE TABLE [dbo].[FactInternetSales_new]
WITH
(
    DISTRIBUTION = ROUND_ROBIN,
    CLUSTERED COLUMNSTORE INDEX
)
AS
SELECT * FROM [dbo].[FactInternetSales];
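Once a copy such as [FactInternetSales_new] has been built with CTAS, it is usually swapped in for the original by renaming. A hedged sketch, assuming the old table is no longer needed after the swap:

```sql
-- Swap the new table in place of the original.
RENAME OBJECT dbo.FactInternetSales     TO FactInternetSales_old;
RENAME OBJECT dbo.FactInternetSales_new TO FactInternetSales;

-- Drop the old copy once the new table has been verified.
DROP TABLE dbo.FactInternetSales_old;
```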
15
Identity
Handle it on the source side
IDENTITY property
Explicit import
Doesn’t support CTAS
Custom identity with ROW_NUMBER
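The custom identity with ROW_NUMBER can be sketched as follows: new surrogate keys continue from the current maximum key in the target table. The tables dbo.DimCustomer and stg.Customer and their columns are hypothetical, and the sketch assumes a single load runs at a time (concurrent loads could generate the same keys).

```sql
-- Assign surrogate keys to rows that are not yet in the dimension.
INSERT INTO dbo.DimCustomer (CustomerKey, CustomerAlternateKey, CustomerName)
SELECT
    -- Current maximum key plus a running number for each new row.
    m.MaxKey + ROW_NUMBER() OVER (ORDER BY s.CustomerAlternateKey) AS CustomerKey,
    s.CustomerAlternateKey,
    s.CustomerName
FROM stg.Customer AS s
CROSS JOIN
(
    SELECT ISNULL(MAX(CustomerKey), 0) AS MaxKey
    FROM dbo.DimCustomer
) AS m
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.DimCustomer AS d
    WHERE d.CustomerAlternateKey = s.CustomerAlternateKey
);
```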
16
ANSI joins in Update/Delete/Merge
UPDATE/DELETE don’t support ANSI joins in the FROM clause
Use CTAS to prepare an interim table with the joins
Use CTAS as a MERGE workaround
Split MERGE into separate operation steps and combine them with UNION ALL
Use an interim table when there are many steps
Use partitioning for big tables; don’t reload the whole table
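Both workarounds can be sketched in T-SQL. All table and column names below (dbo.ProductSalesSummary, dbo.DimProduct, dbo.stg_DimProduct, and so on) are hypothetical, and the distribution and index options are just one plausible choice.

```sql
-- UPDATE workaround, step 1: do the ANSI join in a CTAS interim table.
CREATE TABLE dbo.ctas_ProductSales
WITH (DISTRIBUTION = HASH(ProductKey), HEAP)
AS
SELECT f.ProductKey, SUM(f.SalesAmount) AS TotalSalesAmount
FROM dbo.FactInternetSales AS f
JOIN dbo.DimProduct AS p
    ON p.ProductKey = f.ProductKey
GROUP BY f.ProductKey;

-- Step 2: update from the single interim table, joining in the WHERE clause.
UPDATE dbo.ProductSalesSummary
SET    ProductSalesSummary.TotalSalesAmount = ctas_ProductSales.TotalSalesAmount
FROM   dbo.ctas_ProductSales
WHERE  ctas_ProductSales.ProductKey = ProductSalesSummary.ProductKey;

DROP TABLE dbo.ctas_ProductSales;

-- MERGE workaround: build the merged result with CTAS and UNION ALL.
CREATE TABLE dbo.DimProduct_upsert
WITH (DISTRIBUTION = HASH(ProductKey), CLUSTERED INDEX (ProductKey))
AS
-- New rows and new versions of existing rows come from staging.
SELECT s.ProductKey, s.ProductName, s.Color
FROM dbo.stg_DimProduct AS s
UNION ALL
-- Existing rows that staging does not touch are kept as they are.
SELECT p.ProductKey, p.ProductName, p.Color
FROM dbo.DimProduct AS p
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.stg_DimProduct AS s
    WHERE s.ProductKey = p.ProductKey
);

-- Swap the merged table in place of the original.
RENAME OBJECT dbo.DimProduct        TO DimProduct_old;
RENAME OBJECT dbo.DimProduct_upsert TO DimProduct;
```

For a large partitioned table, the same CTAS could be limited to the affected partition instead of rebuilding the whole table.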
17
Computed columns
Handle it in the source system
Use CTAS during import
Create a view
Use explicit data types and nullability checks in your calculation expressions; otherwise you risk:
Wrong data during migration
Schema errors during partition switch
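A sketch of materializing a computed column with CTAS, with the data type and nullability stated explicitly, plus the view alternative. The column names and the DECIMAL(19,4) type are assumptions for illustration; each statement should be run as its own batch, since CREATE VIEW must be the first statement in a batch.

```sql
-- CTAS: CAST fixes the data type, and ISNULL as the outermost function
-- makes the resulting column NOT NULL.
CREATE TABLE dbo.FactInternetSales_calc
WITH (DISTRIBUTION = HASH(ProductKey), CLUSTERED COLUMNSTORE INDEX)
AS
SELECT
    ProductKey,
    OrderQuantity,
    UnitPrice,
    ISNULL(CAST(OrderQuantity * UnitPrice AS DECIMAL(19,4)), 0) AS LineTotal
FROM dbo.FactInternetSales;

-- View alternative: the expression is evaluated at query time instead.
CREATE VIEW dbo.vFactInternetSales
AS
SELECT
    ProductKey,
    OrderQuantity,
    UnitPrice,
    CAST(OrderQuantity * UnitPrice AS DECIMAL(19,4)) AS LineTotal
FROM dbo.FactInternetSales;
```

Without the explicit CAST and ISNULL, the derived type and nullability of LineTotal may differ from the target table definition, which is the kind of mismatch that leads to the partition switch errors mentioned above.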
18
Cursor
Use WHILE for looping
Prepare a list of elements as a table
Loop through this list using a WHILE loop and a variable
Do some action
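The steps above can be sketched as follows. The action chosen here (updating statistics on every user table) is only an illustrative assumption; any per-row action could take its place.

```sql
-- Step 1: prepare the list of elements as a table with a sequence number.
CREATE TABLE #tbl
WITH (DISTRIBUTION = ROUND_ROBIN)
AS
SELECT
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS Sequence,
    'UPDATE STATISTICS ' + QUOTENAME(s.name) + '.' + QUOTENAME(t.name) AS sql_code
FROM sys.tables AS t
JOIN sys.schemas AS s
    ON s.schema_id = t.schema_id;

-- Step 2: loop through the list with WHILE and a counter variable.
DECLARE @nbr_statements INT = (SELECT COUNT(*) FROM #tbl),
        @i INT = 1;

WHILE @i <= @nbr_statements
BEGIN
    -- Step 3: do some action for the current element.
    DECLARE @sql_code NVARCHAR(4000) =
        (SELECT sql_code FROM #tbl WHERE Sequence = @i);
    EXEC sp_executesql @sql_code;
    SET @i += 1;
END

DROP TABLE #tbl;
```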
19
Summary
MPP PaaS service in the Azure cloud
Stores and processes huge amounts of structured data
Limitations: Identity, ANSI joins, MERGE
CTAS is a super-charged version of SELECT...INTO
CTAS is a good way to build workarounds
Reloading data with CTAS is better than row-by-row operations
20
Sponsors!
21
The end