What is the Azure SQL Datawarehouse?



1 What is the Azure SQL Datawarehouse?
Vitor Fava

2 Agenda
What is a DataWarehouse?; What is the SQL DataWarehouse?; SQL DataWarehouse Architecture; Managing a SQL DataWarehouse;

3 Vitor Fava
MVP Data Platform; SQL Server Database Consultant at Pythian;

4 What is a DataWarehouse?
A large store of data accumulated from a wide range of sources within a company and used to guide management decisions;

5 What is the SQL DataWarehouse?
Azure SQL Data Warehouse is a massively parallel processing (MPP) cloud-based, scale-out, relational database capable of processing massive volumes of data;

6 What is the SQL DataWarehouse?
Combines the SQL Server relational database with Azure cloud scale-out capabilities; Decouples storage from compute; Enables increasing, decreasing, pausing, or resuming compute; Integrates across the Azure platform; Utilizes SQL Server Transact-SQL (T-SQL) and tools; Complies with various legal and business security requirements such as SOC and ISO;

7 Predictable and scalable performance with Data Warehouse Units
Allocation of resources to your SQL Data Warehouse is measured in Data Warehouse Units (DWUs); DWUs are a measure of underlying resources like CPU, memory, IOPS, which are allocated to your SQL Data Warehouse; Increasing the number of DWUs increases resources and performance;

8 Massively parallel processing architecture (MPP)
SQL Data Warehouse is a massively parallel processing (MPP) distributed database system; Behind the scenes, SQL Data Warehouse spreads your data across many shared-nothing storage and processing units; The data is stored in a Premium locally redundant storage layer on top of which dynamically linked Compute nodes execute queries; SQL Data Warehouse takes a "divide and conquer" approach to running loads and complex queries; Requests are received by a Control node, optimized for distribution, and then passed to Compute nodes to do their work in parallel;
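The "divide and conquer" flow on this slide can be illustrated with a small simulation. This is only a sketch in plain Python, not the service's implementation: the function names (control_node_sum, compute_node_sum), the thread pool standing in for Compute nodes, and the simple striping of rows are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_DISTRIBUTIONS = 60  # number of distributions, as stated on the Distributions slide


def compute_node_sum(rows):
    """Work done for one distribution: a local partial aggregate."""
    return sum(rows)


def control_node_sum(values, workers=4):
    """Split the data, run the partial sums in parallel, then combine them."""
    # Stripe rows over the distributions (hypothetical placement, illustration only).
    distributions = [values[i::NUM_DISTRIBUTIONS] for i in range(NUM_DISTRIBUTIONS)]
    # The worker threads stand in for Compute nodes working in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(compute_node_sum, distributions))
    # The Control node combines the partial results into the final answer.
    return sum(partials)


if __name__ == "__main__":
    data = list(range(1, 1001))
    print(control_node_sum(data))  # 500500
```

The point of the sketch is that each partial sum needs only its own slice of the data, so the slices can be processed independently and merged at the end.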

9 Massively parallel processing architecture (MPP)
Grow or shrink storage size independent of compute; Grow or shrink compute power without moving data; Pause compute capacity while leaving data intact, only paying for storage; Resume compute capacity during operational hours;

10 Massively parallel processing architecture (MPP)

11 Distributions
A distribution is the basic unit of storage and processing for parallel queries that run on distributed data; When SQL Data Warehouse runs a query, the work is divided into 60 smaller queries that run in parallel; A data warehouse with maximum compute resources has one distribution per Compute node; A data warehouse with minimum compute resources has all the distributions on one Compute node;
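The relationship between the 60 distributions and the number of Compute nodes is simple arithmetic. The sketch below assumes, for illustration, that the node count always divides 60 evenly; the helper name distributions_per_node is hypothetical.

```python
NUM_DISTRIBUTIONS = 60  # fixed number of distributions, from the slide


def distributions_per_node(compute_nodes):
    """Return how many distributions each Compute node hosts."""
    # Assumption for this sketch: distributions are spread evenly over the nodes.
    if compute_nodes < 1 or NUM_DISTRIBUTIONS % compute_nodes != 0:
        raise ValueError("node count must be a positive divisor of 60")
    return NUM_DISTRIBUTIONS // compute_nodes


if __name__ == "__main__":
    print(distributions_per_node(1))   # 60: minimum compute, one node holds everything
    print(distributions_per_node(6))   # 10
    print(distributions_per_node(60))  # 1: maximum compute, one distribution per node
```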

12 Distributions
Hash-distributed tables; Round-robin distributed tables; Replicated tables;

13 Hash-distributed tables
Each row belongs to one distribution; A deterministic hash algorithm assigns each row to one distribution; The number of table rows per distribution varies as shown by the different sizes of tables;
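A rough simulation of hash distribution follows. The actual hash algorithm used by SQL Data Warehouse is not given in this presentation, so SHA-256 is used here purely as a stand-in, and the names distribution_for and rows_per_distribution are hypothetical.

```python
import hashlib
from collections import Counter

NUM_DISTRIBUTIONS = 60  # from the Distributions slide


def distribution_for(key):
    """Deterministically map a distribution-column value to one distribution."""
    # SHA-256 is a stand-in; the service's real hash function is not shown here.
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_DISTRIBUTIONS


def rows_per_distribution(keys):
    """Count how many rows land in each distribution."""
    return Counter(distribution_for(k) for k in keys)


if __name__ == "__main__":
    # Many distinct keys spread over the distributions, with uneven counts.
    spread = rows_per_distribution(range(10_000))
    print(min(spread.values()), max(spread.values()))
    # A column with a single repeated value puts every row in one distribution.
    skewed = rows_per_distribution(["BR"] * 10_000)
    print(len(skewed))  # 1
```

Because the same key always hashes to the same distribution, a column with few distinct values concentrates rows in a few distributions, which is why row counts per distribution can differ.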

14 Hash-distributed tables

15 Round-Robin distribution tables
A round-robin distributed table spreads its rows evenly across all distributions, but without any further optimization; A distribution is first chosen at random and then buffers of rows are assigned to distributions sequentially; It is quick to load data into a round-robin table, but query performance is often better with hash-distributed tables;
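The round-robin assignment described above can be sketched as follows. The buffer size and the function name round_robin_assign are assumptions for illustration; the presentation does not say how large the row buffers are.

```python
import random

NUM_DISTRIBUTIONS = 60  # from the Distributions slide


def round_robin_assign(rows, buffer_size=2, rng=None):
    """Assign buffers of rows to distributions sequentially from a random start."""
    rng = rng or random.Random()
    # The first distribution is chosen at random.
    start = rng.randrange(NUM_DISTRIBUTIONS)
    placement = {}
    # Each following buffer goes to the next distribution, wrapping around.
    for offset, i in enumerate(range(0, len(rows), buffer_size)):
        dist = (start + offset) % NUM_DISTRIBUTIONS
        placement.setdefault(dist, []).extend(rows[i:i + buffer_size])
    return placement


if __name__ == "__main__":
    placement = round_robin_assign(list(range(600)), buffer_size=5)
    sizes = {len(v) for v in placement.values()}
    print(len(placement), sizes)  # 60 {10}
```

No column value is inspected during placement, which is why loading is quick but rows with the same key can end up in different distributions.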

16 Round-Robin distribution tables

17 Replicated tables
A replicated table provides the fastest query performance for small tables; A table that is replicated caches a full copy of the table on each Compute node;
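One way to picture why a full copy per Compute node helps is a join against a small lookup table: every node can resolve the lookup locally. The sketch below is illustrative only; the table contents and the names replicate and local_join are hypothetical.

```python
def replicate(table, compute_nodes):
    """Give every Compute node its own full copy of a small table."""
    return {node: dict(table) for node in range(compute_nodes)}


def local_join(fact_rows, dimension_copy):
    """Join fact rows to the node's local copy; no other node is consulted."""
    return [(sale_id, dimension_copy[code]) for sale_id, code in fact_rows]


if __name__ == "__main__":
    countries = {"BR": "Brazil", "UK": "United Kingdom"}  # small lookup table
    copies = replicate(countries, compute_nodes=4)
    # Fact rows that happen to live on node 2 are joined using node 2's copy.
    print(local_join([(1, "BR"), (2, "UK")], copies[2]))
    # [(1, 'Brazil'), (2, 'United Kingdom')]
```

The trade-off is storage and refresh cost: each node holds the whole table, which is why this suits small tables.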

18 Replicated tables

19 Built on SQL Server
Uses T-SQL syntax for many operations; It also supports a broad set of traditional SQL constructs, such as stored procedures, user-defined functions, table partitioning, indexes, and collations; Contains various newer SQL Server features, including clustered columnstore indexes, PolyBase integration, and data auditing (complete with threat detection); Certain T-SQL language elements that are less common for data warehousing workloads, or are newer to SQL Server, may not be currently available;

20 Tools to import data
AzCopy; BCP; SSIS; RedGate; PolyBase;

21 Management Tools
Azure Portal; SQL Server Data Tools; PowerShell; SQLCMD;

22 Conclusion
It uses the MPP architecture to process requests quickly; You can scale storage and processing power independently; You can pause the service when it is not needed to lower costs; T-SQL support makes the service easier to adopt;

23 Just like Jimi Hendrix …
We love to get feedback Please complete the session feedback forms

24 SQLBits - It's all about the community...
Please visit Community Corner. This year we are trying to get more people to learn about the SQL Community, so we'd really appreciate it if you stopped by.

25 Obrigado / Thank You !!!!! Vitor Fava

