Redmond Protocols Plugfest 2016 Casey Karst PolyBase in SQL Server 2016.


1 Redmond Protocols Plugfest 2016 Casey Karst PolyBase in SQL Server 2016

2 Big Picture: Provides a scalable T-SQL language extension for combining data from both universes (relational data in SQL Server and non-relational data in Hadoop).
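As a rough illustration of what combining data from both universes can look like, the sketch below joins a regular table with a PolyBase external table in a single T-SQL query. All table and column names (dbo.Customers, dbo.WebClicks_External, CustomerId, Name) are hypothetical and do not come from the presentation, and the external table is assumed to exist already.

```sql
-- Hypothetical names: dbo.Customers is an ordinary SQL Server table,
-- dbo.WebClicks_External is an external table over files in HDFS.
SELECT c.CustomerId,
       c.Name,
       COUNT(*) AS ClickCount
FROM dbo.Customers AS c
JOIN dbo.WebClicks_External AS w
    ON w.CustomerId = c.CustomerId
GROUP BY c.CustomerId, c.Name
ORDER BY ClickCount DESC;
```

From the query author's point of view the external table is referenced like any other table; where the rows physically live is decided by the external table definition.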

3 PolyBase Use Cases

4 PolyBase Across the Enterprise
SQL Product | Load Data (Hadoop, WASB) | Query Data (Hadoop, WASB) | Age Out Data (Hadoop, WASB)
SQL Server 2016 | Y, Y | Y, Y | Y, Y
Analytics Platform System (APS) | Y, Y | Y, Y | Y, Y
Azure SQL DW | N Y N N Y
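The three operations named in the table (load, query, age out) could look roughly like this in T-SQL on SQL Server 2016. This is a sketch only: the table names (dbo.WebClicks_External, dbo.WebClicks_Local, dbo.Orders, dbo.Orders_Archive_External) and the cutoff date are made up for illustration, and the external tables are assumed to be defined already.

```sql
-- Query: read external data in place.
SELECT TOP (10) * FROM dbo.WebClicks_External;

-- Load: copy external data into a new local table.
SELECT *
INTO dbo.WebClicks_Local
FROM dbo.WebClicks_External;

-- Age out: export cold rows from a local table to external storage.
-- Writing to external tables has to be enabled first.
EXEC sp_configure 'allow polybase export', 1;
RECONFIGURE;

INSERT INTO dbo.Orders_Archive_External
SELECT * FROM dbo.Orders WHERE OrderDate < '2010-01-01';
```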

5 The Hadoop Ecosystem

6 Hadoop Evolution. Initially: MapReduce for insights from HDFS-resident data. Recently: SQL-like data warehouse technologies on HDFS, e.g. Hive, Impala, HAWQ, Spark/Shark.

7 All the interest in Big Data: Increased number and variety of data sources that generate large quantities of data. Realization that data is "too valuable" to delete. Dramatic decline in the cost of hardware, especially storage.

8 PolyBase View

9

10

11 Step 1: Set up a Hadoop Cluster. Hortonworks or Cloudera distributions; Hadoop 2.0 or above; Linux or Windows; on premises or in Azure.

12 Or Azure Storage Account. Azure Storage Blob (ASB) exposes an HDFS layer. PolyBase reads and writes from ASB using Hadoop APIs. No compute push-down support for ASB.
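A minimal sketch of how a storage account might be registered as a PolyBase external data source follows. The object names (AzureStorageCredential, AzureBlobSource) are hypothetical, and the angle-bracket placeholders must be replaced with real values. None of this appears on the slide.

```sql
-- A database master key protects the credential secret
-- (skip this if the database already has one).
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

-- For Azure blob storage the SECRET holds the storage account key;
-- the IDENTITY string is not used for authentication.
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
WITH IDENTITY = 'user', SECRET = '<storage account key>';

CREATE EXTERNAL DATA SOURCE AzureBlobSource
WITH (
    TYPE = HADOOP,
    LOCATION = 'wasbs://<container>@<account>.blob.core.windows.net',
    CREDENTIAL = AzureStorageCredential
);
```

No RESOURCE_MANAGER_LOCATION is given here, which matches the slide's point that compute push-down is not available for ASB.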

13 Step 2: Install SQL Server. Select the PolyBase feature. This adds two new PolyBase services: PolyBase Engine and PolyBase Data Movement Service (DMS). Prerequisite: download and install the JRE.
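One way to check afterwards whether the feature was installed is the IsPolyBaseInstalled server property. This check is an addition for illustration and is not part of the slide.

```sql
-- Returns 1 if the PolyBase feature is installed on this instance, 0 otherwise.
SELECT SERVERPROPERTY('IsPolyBaseInstalled') AS IsPolyBaseInstalled;
```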

14 Step 3: Scale-out. 1. Install multiple SQL Server instances with PolyBase. 2. Choose one as Head Node. 3. Configure the remaining instances as Compute Nodes: a. Run sp_polybase_join_group. b. Restart PolyBase DMS. (Diagram labels: Head Node, PolyBase Engine, PolyBase DMS.)
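Step 3a might look like the following when run on each compute node. The machine name HEADNODE01 is hypothetical, and 16450 and MSSQLSERVER are assumed defaults for the DMS control channel port and the instance name; adjust all three for a real deployment.

```sql
-- Run on each compute node, not on the head node.
-- Arguments: head node machine name, head node DMS control channel port,
-- head node SQL Server instance name.
EXEC sp_polybase_join_group N'HEADNODE01', 16450, N'MSSQLSERVER';

-- Afterwards, restart the PolyBase Data Movement Service on this machine
-- (step 3b); that is done through the Windows service manager, not T-SQL.
```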

15 After Step 3: PolyBase Scale-out Group. The head node is the SQL Server instance to which queries are submitted. Compute nodes are used for scale-out query processing for data in HDFS or Azure.
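To see which nodes have joined the group, the head node can be queried through a PolyBase dynamic management view. This is a sketch added for illustration and is not shown in the presentation.

```sql
-- Run on the head node: lists the head node and the compute nodes
-- that currently belong to the scale-out group.
SELECT * FROM sys.dm_exec_compute_nodes;
```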

16 Step 4 - Choose Hadoop flavor. Latest Hadoop distributions supported in SQL16 RTM: Cloudera CDH 5.5 on Linux; Hortonworks HDP 2.3 on Linux and Windows Server. What happens under the covers? The right client jars are loaded to connect to the chosen Hadoop distribution. Different numbers map to various Hadoop flavors, for example: value 4 stands for HDP 2.0 on Windows or ASB, value 5 for HDP 2.0 on Linux, value 6 for CDH 5.1/5.5 on Linux, and value 7 for HDP 2.1/2.2/2.3 on Linux/Windows or ASB.
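The value is set through the 'hadoop connectivity' server configuration option. A minimal sketch, assuming value 7 (HDP 2.1/2.2/2.3 on Linux/Windows or ASB, per the mapping above) is the right one for the target cluster:

```sql
-- Select the Hadoop flavor; 7 is only an example value.
EXEC sp_configure @configname = 'hadoop connectivity', @configvalue = 7;
RECONFIGURE;

-- The change takes effect after SQL Server and the PolyBase services
-- are restarted.
```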

17 After Step 4

18 PolyBase Design

19 Under-the-hood

20 HDFS bridge in DMS. Uses Hadoop RecordReaders/RecordWriters to read/write standard HDFS file types.

21 Under-the-hood

22 Data moves between clusters in parallel. (Diagram labels: SQL16; Hadoop Cluster with Namenode (HDFS) and File System.)

23 Under-the-hood

24 Creating External Tables: once per Hadoop cluster; once per file format; HDFS file path.
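The three annotations plausibly correspond to three T-SQL statements: an external data source (once per Hadoop cluster), an external file format (once per file format), and an external table that points at an HDFS path. The sketch below assumes that mapping. Every object name, column, path, and the namenode port 8020 are illustrative assumptions, not values from the slide.

```sql
-- Once per Hadoop cluster.
CREATE EXTERNAL DATA SOURCE HadoopCluster
WITH (
    TYPE = HADOOP,
    LOCATION = 'hdfs://<namenode host>:8020'
);

-- Once per file format.
CREATE EXTERNAL FILE FORMAT CommaDelimitedText
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',')
);

-- One per data set; LOCATION is the HDFS file or directory path.
CREATE EXTERNAL TABLE dbo.WebClicks_External (
    CustomerId INT           NOT NULL,
    Url        NVARCHAR(400) NOT NULL,
    ClickTime  DATETIME2     NOT NULL
)
WITH (
    LOCATION = '/data/webclicks/',
    DATA_SOURCE = HadoopCluster,
    FILE_FORMAT = CommaDelimitedText
);
```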

25 Creating External Tables (secure Hadoop): once per Hadoop user; once per Hadoop cluster per user; once per file format; HDFS file path.
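For a secured cluster, the extra step is a database scoped credential per Hadoop user, which the data source then references. That is why the data source becomes "once per Hadoop cluster per user". The following is a sketch under that reading; the names HadoopUser1 and SecureHadoopCluster, the port 8020, and all placeholders are assumptions.

```sql
-- A database master key protects the credential secret
-- (skip this if the database already has one).
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

-- Once per Hadoop user.
CREATE DATABASE SCOPED CREDENTIAL HadoopUser1
WITH IDENTITY = '<hadoop user name>', SECRET = '<hadoop password>';

-- Once per Hadoop cluster per user: the data source carries the credential.
CREATE EXTERNAL DATA SOURCE SecureHadoopCluster
WITH (
    TYPE = HADOOP,
    LOCATION = 'hdfs://<namenode host>:8020',
    CREDENTIAL = HadoopUser1
);
```

The file format and external table statements are then written as for an unsecured cluster, with DATA_SOURCE pointing at the credentialed data source.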

26 Under-the-hood

