1 AZ PASS User Group: Azure Data Factory Overview
Josh Sivey, Solution Partner, Josh.Sivey@Neudesic.com
October 2015

2 Agenda
© Copyright 2015, Neudesic. All rights reserved.
- What is Azure Data Factory (ADF)?
- What are we going to build today?
- Azure Data Factory Artifact Overview
- Demos
- Monitoring and Troubleshooting

3 What is Azure Data Factory?
Azure Data Factory is a cloud-based data integration service that automates moving and transforming data.
- Compose data processing, storage, and movement services to create and manage analytics pipelines.
- Rich, simple end-to-end pipeline monitoring and management.
- Initially focused on Azure and hybrid movement to/from on-premises SQL Server.

4 What are we going to build?
[Diagram. Components: http://azpasscomment.azurewebsites.net/, storage blob, Azure SQL Database, SQL database (on-premises), Data Management Gateway, Azure Data Factory, Power BI Dashboard. Arrow labels: ingest, copy and transform comments, copy, visualize (just for fun).]

5 Azure Data Factory Artifact Overview

6 Linked Services
Linked services define the information needed for ADF to connect to external resources. Linked services are used for two purposes:
- To represent a data store, including:
  - Azure Storage, Azure SQL, Azure SQL Data Warehouse, Azure DocumentDB
  - SQL Server, Oracle, File System, DB2, MySQL, Teradata, PostgreSQL, Sybase
- To represent a compute resource that can host the execution of an Activity. For example, the “HDInsightHiveActivity” executes on an HDInsight Hadoop cluster.
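As a rough illustration of the two purposes above, the sketch below builds linked-service definitions as plain Python dictionaries and serializes them to JSON. The helper `linked_service`, the artifact names, the placeholder values, and the field layout (`name`, `properties`, `type`, `typeProperties`) are assumptions modeled on the general shape of ADF JSON artifacts, not taken from the slides, so check them against the ADF documentation before relying on them.

```python
import json


def linked_service(name: str, kind: str, type_properties: dict) -> dict:
    """Build a linked-service definition (hypothetical helper, assumed layout)."""
    return {
        "name": name,
        "properties": {"type": kind, "typeProperties": dict(type_properties)},
    }


# Purpose 1: a data store. The connection string is a placeholder, not a real secret.
blob_store = linked_service(
    "CommentsBlobStore",
    "AzureStorage",
    {"connectionString": "<storage-connection-string>"},
)

# Purpose 2: a compute resource that can host an activity (assumed property names).
hadoop_cluster = linked_service(
    "CommentsHadoopCluster",
    "HDInsight",
    {"clusterUri": "<cluster-uri>", "linkedServiceName": blob_store["name"]},
)

if __name__ == "__main__":
    # Print both definitions as JSON documents.
    print(json.dumps([blob_store, hadoop_cluster], indent=2))
```

Keeping connection details in one named artifact means datasets and activities can refer to the store or cluster by name instead of repeating credentials.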

7 Datasets
Datasets are named references to the input or output data of an Activity. Datasets identify structures within different data stores, including tables, files, folders, and documents.
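To make the idea of a named reference concrete, here is a minimal sketch that describes one folder in blob storage and one SQL table as datasets. The helper `dataset`, the dataset and linked-service names, the type strings, and the location properties are illustrative assumptions rather than details from the presentation.

```python
import json


def dataset(name: str, kind: str, linked_service_name: str, location: dict) -> dict:
    """Build a dataset definition pointing at a structure inside a data store.

    Hypothetical helper; the field layout is an assumption.
    """
    return {
        "name": name,
        "properties": {
            "type": kind,
            "linkedServiceName": linked_service_name,
            "typeProperties": dict(location),
        },
    }


# A folder of raw comment files in blob storage (assumed names and paths).
raw_comments = dataset(
    "RawComments", "AzureBlob", "CommentsBlobStore", {"folderPath": "comments/raw"}
)

# A table in a SQL database (assumed names).
comment_table = dataset(
    "CommentTable", "AzureSqlTable", "CommentsSqlDatabase", {"tableName": "Comments"}
)

if __name__ == "__main__":
    print(json.dumps([raw_comments, comment_table], indent=2))
```

Each dataset carries only a name, a pointer to its linked service, and a location; the connection details stay in the linked service.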

8 Activities
Activities define the actions to perform on your data. Each activity takes zero or more datasets as inputs and produces one or more datasets as outputs. An activity is a unit of orchestration in Azure Data Factory.

Available Transformation Activities | Compute environment
Hive | HDInsight [Hadoop]
Pig | HDInsight [Hadoop]
MapReduce | HDInsight [Hadoop]
Hadoop Streaming | HDInsight [Hadoop]
Machine Learning Batch Execution | Azure VM
Stored Procedure | Azure SQL
DotNet | HDInsight [Hadoop] or Azure Batch

Available Data Movement Activities
Copy
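The input/output rule stated above (zero or more inputs, at least one output) can be sketched as a small Python model. The `Activity` class and the dataset names below are hypothetical and exist only to illustrate the rule; they are not part of any ADF SDK.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Activity:
    """Illustrative model of an activity; not an ADF API type."""

    name: str
    kind: str
    outputs: Tuple[str, ...]
    inputs: Tuple[str, ...] = ()  # zero inputs is allowed

    def __post_init__(self) -> None:
        # Enforce the rule that every activity produces at least one dataset.
        if not self.outputs:
            raise ValueError(
                f"activity {self.name!r} must produce at least one output dataset"
            )


# A copy activity with one input and one output (assumed dataset names).
copy_comments = Activity(
    name="CopyComments",
    kind="Copy",
    inputs=("RawComments",),
    outputs=("CommentTable",),
)

# An activity with no inputs is still valid as long as it has an output.
seed_table = Activity(name="SeedLookup", kind="StoredProcedure", outputs=("LookupTable",))
```

Constructing an `Activity` with an empty `outputs` tuple raises `ValueError`, which mirrors the "one or more datasets as outputs" constraint.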

9 Pipelines
Pipelines are a logical grouping of Activities. They group activities into a unit that together performs a task. Activities grouped into a single Pipeline can be deployed, scheduled, or deleted as one unit instead of being managed individually.
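A minimal sketch of that grouping, assuming a JSON-like layout: two copy activities are bundled into one pipeline definition that carries a single active period. The helpers `copy_activity`, `pipeline`, and `datasets_used`, every name, the field layout, and the dates are assumptions for illustration only.

```python
import json


def copy_activity(name: str, source: str, sink: str) -> dict:
    """Describe a copy from one dataset to another (hypothetical helper)."""
    return {
        "name": name,
        "type": "Copy",
        "inputs": [{"name": source}],
        "outputs": [{"name": sink}],
    }


def pipeline(name: str, activities: list, start: str, end: str) -> dict:
    """Group activities into one unit with a shared active period (assumed layout)."""
    return {
        "name": name,
        "properties": {"activities": list(activities), "start": start, "end": end},
    }


def datasets_used(pipeline_def: dict) -> set:
    """Collect every dataset name the pipeline's activities read or write."""
    names = set()
    for activity in pipeline_def["properties"]["activities"]:
        for ref in activity["inputs"] + activity["outputs"]:
            names.add(ref["name"])
    return names


comments_pipeline = pipeline(
    "CommentsPipeline",
    [
        copy_activity("BlobToAzureSql", "RawComments", "CommentTable"),
        copy_activity("AzureSqlToOnPrem", "CommentTable", "OnPremCommentTable"),
    ],
    start="2015-10-01T00:00:00Z",  # placeholder dates
    end="2015-10-02T00:00:00Z",
)

if __name__ == "__main__":
    print(json.dumps(comments_pipeline, indent=2))
    print(sorted(datasets_used(comments_pipeline)))
```

Because the schedule lives on the pipeline rather than on each activity, changing or removing the pipeline affects both copies at once.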

10 Data Management Gateway
The Data Management Gateway allows secure access to on-premises data sources.
- No corporate firewall changes (the Gateway uses HTTP-based connections).
- Encrypt credentials for your on-premises data stores with your certificate.
- Parallel data transfer, resilient to network issues with auto-retry logic.
Considerations:
- A single gateway instance is tied to only one Azure Data Factory.
- Only one instance of Data Management Gateway can be installed on a single machine.

11 Develop and Deploy Azure Data Factories
You need an Azure Subscription. ADF artifacts can be developed and deployed in three ways:
- Using Visual Studio
- In the Azure Portal editor
- Via PowerShell
Let’s look at the Azure Portal editor.

12 Demo Time!
[Diagram. Components: http://azpasscomment.azurewebsites.net/, storage blob, Azure SQL Database, SQL database (on-premises), Data Management Gateway, Azure Data Factory, Power BI Dashboard. Arrow labels: ingest, copy and transform comments, copy, visualize (just for fun).]

13 More Demo Time! Custom DotNet Activity
[Diagram. Components: http://azpasscomment.azurewebsites.net/, storage blob, External Sentiment Scoring Web Service, Azure Data Factory, Power BI Dashboard. Arrow labels: ingest, copy and enrich comments, visualize (just for fun).]

14 Monitoring
The Diagram View provides a “single pane of glass” to monitor and manage the Data Factory and its assets.
- Status and State information can be viewed.
- Activity runs and logging information can be viewed.
- Pipelines can be suspended if an issue is found and resumed once the issue is corrected.
- Logging information can be queried using PowerShell cmdlets.

15 Azure Data Factory Closing Thoughts
- Azure Data Factory (like the entire Azure Platform) is consumption based: you pay only for the storage and compute that is used.
- As a developer, I can use ADF and the Azure Platform to quickly create solutions without waiting for servers to be created and software to be licensed and installed.
- Azure services can scale compute up or down based on price/performance needs.
- I was very cost sensitive when creating the demos for this presentation, so I used small compute and database sizes. The entire demo, including prep, cost less than $10.
- Azure solutions can scale up if (for example) we had millions of comments instead of just dozens.

16 Appendix
- Learning path for Azure Data Factory
  https://azure.microsoft.com/en-us/documentation/learning-paths/data-factory/
- Monitor and manage Azure Data Factory pipelines
  https://azure.microsoft.com/en-us/documentation/articles/data-factory-monitor-manage-pipelines/
- Azure Data Market Text Analytics service
  https://datamarket.azure.com/dataset/amla/text-analytics
  https://datamarket.azure.com/dataset/explore/amla/text-analytics

