My Data Wandered Lonely As A Cloud: Azure Data Factory Julie Smith SQL Server MVP Innovative


1 My Data Wandered Lonely As A Cloud: Azure Data Factory Julie Smith SQL Server MVP Innovative Architects @juliechix

2 About Me: Julie Smith, IA Ambassador, SQL Server MVP. Julie.Smith@innovativearchitects.com @JulieChix Datachix.com www.InnovativeArchitects.com

3 INNOVATIVE ARCHITECTS #WorkSomeplaceAwesome

4 My Data Flew amongst the Clouds quickly and without errors!

5 My Data Wandered Lonely As A Cloud

6 Background Here’s a story…

7 Small Business

8 Our plan: NO SS anything; Power BI

9 What is Azure Data Factory? Azure Data Factory is a cloud service that orchestrates, manages, and monitors the integration and transformation of structured and unstructured data from on-premises and cloud sources at scale.

10 Most like… SSIS, DTS. Between other cloud services and On Prem. Sources, Destinations, Transformations.

11 Is it just SSIS in the Cloud?

12 Another kind of MVP: Minimally Viable Product. Big Data Scenario.

13 Where: Portal.azure.com, New > Data + Analytics > Data Factory

14 Three Main Elements: Linked Services, Datasets, Pipeline Activities

15 Linked Services: Data Stores, Data Gateways

16 Data Gateway for On Prem

17 Data Gateways & ADF: ADF supplies the key. Install the Gateway on each On Prem resource (server, laptop, etc). A resource can only store one key for use by ADF, so that usually means there can be only one data factory.

18 Data Gateways & ADF

19 Data Gateways & ADF

20 Data Stores: Contain credentials and connection information for Sources and Destinations. An On Prem Data Store MUST reference a Data Gateway.

21 After you set up the gateway, set up your linked services. A linked service must reference a gateway if it connects to an on prem source or destination.

22 After you set up the gateway, set up your linked services. When you pick SQL Server (on prem), you HAVE to have a gateway.

23 Azure Data Stores Don’t Require Gateway
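
The gateway rule above can be sketched as a small check over linked service definitions. This is only an illustration, not part of the talk: the type names `OnPremisesSqlServer` and `AzureStorage`, the `gatewayName` property, and the gateway name `MyLaptopGateway` are assumptions about how such definitions are typically written, and the connection strings are placeholders.

```python
import json

# Assumed type name(s) for on prem data stores; adjust to your factory.
ON_PREM_TYPES = {"OnPremisesSqlServer"}


def missing_gateway(linked_service: dict) -> bool:
    """Return True when an on prem data store has no gateway reference."""
    props = linked_service["properties"]
    if props["type"] not in ON_PREM_TYPES:
        return False  # Azure data stores do not require a gateway
    return not props.get("typeProperties", {}).get("gatewayName")


# Hypothetical on prem linked service that references a gateway.
on_prem = json.loads("""
{
  "name": "NorthWindStg",
  "properties": {
    "type": "OnPremisesSqlServer",
    "typeProperties": {
      "connectionString": "<placeholder>",
      "gatewayName": "MyLaptopGateway"
    }
  }
}
""")

# Hypothetical Azure data store: no gateway needed.
azure_blob = {
    "name": "BlobStore",
    "properties": {
        "type": "AzureStorage",
        "typeProperties": {"connectionString": "<placeholder>"},
    },
}

for service in (on_prem, azure_blob):
    print(service["name"], "missing gateway:", missing_gateway(service))
```

Running a check like this before deploying would catch an on prem data store that was cloned without its gateway reference.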

24 Types of connections for Data Stores:

25 Datasets reference Data Stores:

26 Once you have Data Stores: Datasets (reference specific schemas in the data store: table, csv definition, etc.), then Pipelines and Activities, such as copy or execute stored procedure.

27 Getting around ADF Interface

28 Author and Deploy

29 Diagram

30 Author and Deploy: You can only copy and paste in a draft, so use the clone option a great deal. Clone, edit, deploy.

31 JSON (JavaScript Object Notation), pronounced Jay-Sahn. http://json.org/

32 JSON is built on two structures. First, a collection of name/value pairs, written with { }. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array. Second, an ordered list of values, written with [ ]. In most languages, this is realized as an array, vector, list, or sequence. http://json.org
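
To make the two structures concrete, here is a minimal Python sketch (not from the talk) using the standard `json` module. The sample values are made up for illustration.

```python
import json

# A { } collection of name/value pairs becomes a Python dict.
obj = json.loads('{"name": "Actor", "interval": 1}')

# A [ ] ordered list of values becomes a Python list.
arr = json.loads('["Day", "Hour", "Minute"]')

# The two structures nest freely: an object holding a list of objects.
nested = json.loads('{"tables": [{"tableName": "Actor"}, {"tableName": "Film"}]}')

print(type(obj).__name__, obj["name"])   # dict Actor
print(type(arr).__name__, arr[0])        # list Day
print(nested["tables"][1]["tableName"])  # Film
```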

33 JSON in ADF, Dataset Example
{
  "name": "OnPremActorSrce",
  "properties": {
    "published": false,
    "type": "SqlServerTable",
    "linkedServiceName": "NorthWindStg",
    "typeProperties": {
      "tableName": "Actor"
    },
    "availability": {
      "frequency": "Day",
      "interval": 1
    },
    "policy": {
      "externalData": {
        "retryInterval": "00:01:00",
        "retryTimeout": "00:10:00",
        "maximumRetry": 3
      }
    }
  }
}
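
As a rough illustration of how the dataset definition above reads once parsed, the following Python sketch loads it and pulls out the availability and retry settings. It assumes the JSON is closed with its matching braces; the helper `parse_hms` is a hypothetical name introduced here, not an ADF function.

```python
import json
from datetime import timedelta

DATASET = """
{
  "name": "OnPremActorSrce",
  "properties": {
    "published": false,
    "type": "SqlServerTable",
    "linkedServiceName": "NorthWindStg",
    "typeProperties": {"tableName": "Actor"},
    "availability": {"frequency": "Day", "interval": 1},
    "policy": {
      "externalData": {
        "retryInterval": "00:01:00",
        "retryTimeout": "00:10:00",
        "maximumRetry": 3
      }
    }
  }
}
"""


def parse_hms(text: str) -> timedelta:
    """Convert an 'HH:MM:SS' string into a timedelta (hypothetical helper)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


dataset = json.loads(DATASET)
props = dataset["properties"]
availability = props["availability"]
retry = props["policy"]["externalData"]

retry_interval = parse_hms(retry["retryInterval"])
retry_timeout = parse_hms(retry["retryTimeout"])

print(props["typeProperties"]["tableName"], availability["frequency"], availability["interval"])
print(retry_interval.total_seconds(), retry_timeout.total_seconds(), retry["maximumRetry"])
```

Read this way, the dataset says: the Actor table, reached through the NorthWindStg linked service, is available once a day, with up to 3 retries spaced one minute apart and a ten minute timeout.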

34 JSON specific to ADF https://msdn.microsoft.com/en-us/library/azure/dn835050.aspx

35 Activities Within the pipeline:

36 Pipelines Activities: Copy

37 Slices: Each unit of data consumed and produced by an activity run is called a data slice. Each slice has a StartTime and an EndTime, and those are accessible to the pipeline activity via ADF System Variables such as WindowStart and WindowEnd: "sqlReaderQuery": "$$Text.Format('select * from MyTable where timestampcolumn >= \\'{0:yyyy-MM-dd HH:mm}\\' AND timestampcolumn < \\'{1:yyyy-MM-dd HH:mm}\\'', WindowStart, WindowEnd)"
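
The effect of the slice window on the query can be imitated outside ADF. The Python sketch below is only an analogy under stated assumptions: it assumes daily slices, and `daily_slices` and `query_for_slice` are hypothetical helpers that mimic what the `Text.Format` expression does with the window start and end. It is not ADF code.

```python
from datetime import datetime, timedelta

# Same query shape as the slide, with Python format specs in place of
# the .NET-style {0:yyyy-MM-dd HH:mm} placeholders.
QUERY_TEMPLATE = (
    "select * from MyTable "
    "where timestampcolumn >= '{start:%Y-%m-%d %H:%M}' "
    "AND timestampcolumn < '{end:%Y-%m-%d %H:%M}'"
)


def daily_slices(first_day: datetime, count: int):
    """Yield (start, end) windows for consecutive one-day slices."""
    for offset in range(count):
        start = first_day + timedelta(days=offset)
        yield start, start + timedelta(days=1)


def query_for_slice(start: datetime, end: datetime) -> str:
    """Fill the slice window into the query, one query per slice."""
    return QUERY_TEMPLATE.format(start=start, end=end)


for start, end in daily_slices(datetime(2015, 7, 1), 3):
    print(query_for_slice(start, end))
```

Each slice gets its own half-open window (start inclusive, end exclusive), so consecutive slices neither overlap nor leave gaps. A query without the window variables would return the same rows for every slice.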

38 Weird Things: One Data Factory, so your diagram gets messy. This goes against the SSIS best practice of one package per destination. Scheduling is clumsy: pipeline and destination have to be in sync in their availability. The pipeline is where the main scheduling occurs. Why is the schedule in the same place as the integration logic? If you don’t use system variables for slices, you wind up with a slice being the same every day.

39 Scripting Reference https://msdn.microsoft.com/en-us/library/azure/dn835050.aspx As of July 16th: BEWARE

40 Visual Studio Extension: Azure SDK 2.7 and above for Visual Studio 2013. You get templates. You can reverse engineer. You can connect to your factory and deploy from VS. Came out July 22, 2015.

41 Visual Studio Extension

42 Customer Case Studies https://azure.microsoft.com/en-us/documentation/articles/data-factory-customer-case-studies/

43 Data Management Gateway Configuration Manager http://www.microsoft.com/en-us/download/details.aspx?id=39717 Instructions on use: https://azure.microsoft.com/en-us/documentation/articles/data-factory-move-data-between-onprem-and-cloud/#using-the-data-gateway-step-by-step-walkthrough For on prem machines: Load the Gateway on the machine. Then go to the Azure Data Factory and create the Linked Service Gateway there. Get the key from the ADF linked service, then copy and paste it into the final step of the Gateway setup on the On Prem machine. The Gateway is for the entire server, the entire machine. The Linked service will use that gateway for other things and must be configured for each service, e.g. SQL databases. Be patient: the refresh rate is slow and can make it seem like it didn’t work when it did.

44 Data Management Gateway Configuration Manager http://www.microsoft.com/en-us/download/details.aspx?id=39717 Instructions on use: https://azure.microsoft.com/en-us/documentation/articles/data-factory-move-data-between-onprem-and-cloud/#using-the-data-gateway-step-by-step-walkthrough For dev purposes, on your own machine, use Express Setup. It will take about 10 minutes, but it works. You’ll have the Data Management Gateway on your laptop, bam.

45 Learning Path https://azure.microsoft.com/en-us/documentation/learning-paths/data-factory/

46 Resources: Wee Hyong Tok’s webcast, Reza Rad’s blog

