Architecture of modern data warehouse

Presentation transcript:

Architecture of modern data warehouse. Eugene Polonichko, Data Platform MVP, PASS Chapter Leader

Organizers: Natalia Pogorelova, Andriy Pogorelov, Paul Stetsenko

Sponsors

About me Eugene Polonichko has over 7 years of experience with SQL Server. He has mainly focused on BI projects (SSAS, SSIS, Power BI, Cognos, Informatica PowerCenter, Pentaho, Tableau). Eugene is a passionate speaker and SQL community volunteer who presents regularly at PASS SQL Saturday events and local user groups around Ukraine and Europe. He is a PASS Chapter Leader and a Microsoft Data Platform MVP. https://www.linkedin.com/in/eugenepolonichko/ https://twitter.com/EvgenPolonichko

Agenda
Modern Data Warehouse: Traditional approach, Modern Data Warehouse, Ten characteristics
Microsoft architecture: Microsoft Modern Data Warehouse, Azure Data Factory, Cosmos DB, Storage, Azure Databricks, Azure DWH

Concept of modern data warehouse: Traditional approach. Data Intake, then Data Transformation & Storage, then Data Consumption & Presentation.

Modern data warehouse

Modern Data Warehouse: Ingest & Prep, Store, Model & Serve, Visualize
Data sources: Logs (unstructured), Media (unstructured), Files (unstructured), Business/custom apps (structured)
Ingest & Prep: Azure Data Factory (code-free data ingestion from 85+ data integration connectors); Azure Databricks (prep only)
Store: Azure Data Lake Storage (high-performance data lake available in all 54 Azure regions)
Model & Serve: Azure SQL Data Warehouse (up to 14x faster and costs 94% less than other cloud providers)
Visualize: Power BI (Leader in the Magic Quadrant for Business Intelligence and Analytics Platforms*)
At the foundation, customers can build a data lake with Azure Data Lake Storage to store all their data, whatever its type. To ingest data, they can use the more than 85 data integration connectors in Azure Data Factory, which gives them code-free ETL/ELT with any data from any source. Whether the data is in on-premises data sources, other Azure services, or other cloud services, customers can author, monitor, and manage their big data pipelines in a visual environment that is easy to use.
Once customers have ingested that data, they can use Azure Databricks to shape the data formats and prepare the data in a notebook, which makes internal collaboration on data more streamlined and efficient.
With the data stored, ingested, and prepared, customers can load it into Azure SQL Data Warehouse. They then have their data in an industry-leading data warehouse that is up to 14x faster and costs 94% less than other cloud providers, and that can handle petabyte-scale analytics workloads with industry-leading query performance and security.
Finally, the combination of SQL Data Warehouse and Power BI lets customers build visualizations on massive amounts of data and make data insights available to everyone across their organization.
12/2/2019 5:23 AM © Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Microsoft Modern Data Warehouse
Easily ingest live streaming data for an application using an Apache Kafka cluster in Azure HDInsight.
Bring together all your structured data in Azure Blob Storage using Azure Data Factory.
Take advantage of Azure Databricks to clean, transform, and analyze the streaming data, and combine it with structured data from operational databases or data warehouses.
Use scalable machine learning and deep learning techniques to derive deeper insights from this data using Python, R, or Scala, with the built-in notebook experience in Azure Databricks.
Leverage native connectors between Azure Databricks and Azure SQL Data Warehouse to access and move data at scale.
Build analytical dashboards and embedded reports on top of Azure SQL Data Warehouse to share insights within your organization, and use Azure Analysis Services to serve this data to thousands of users.
Power users take advantage of the built-in capabilities of Azure Databricks and Azure HDInsight to perform root cause determination and raw data analysis.
Take the insights from Azure Databricks to Cosmos DB to make them accessible through real-time apps.
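The "clean, transform, and combine with structured data" step can be pictured with a small plain-Python sketch. It is only an illustration of the idea, not Databricks or Spark code; the event fields, the customer table, and the helper names `clean` and `enrich` are all hypothetical.

```python
# Hypothetical reference data, standing in for a table from an operational database.
customers = {
    "c1": {"name": "Contoso", "segment": "retail"},
    "c2": {"name": "Fabrikam", "segment": "manufacturing"},
}

# Hypothetical streaming events, standing in for messages read from a Kafka topic.
events = [
    {"customer_id": "c1", "amount": "19.90"},
    {"customer_id": "c2", "amount": " 5.00 "},
    {"customer_id": "c9", "amount": "oops"},  # malformed amount, unknown customer
]


def clean(event):
    """Return the event with a numeric amount, or None if it cannot be parsed."""
    try:
        return {**event, "amount": float(event["amount"].strip())}
    except ValueError:
        return None


def enrich(event, reference):
    """Attach customer attributes; return None when the customer is unknown."""
    customer = reference.get(event["customer_id"])
    return None if customer is None else {**event, **customer}


cleaned = [clean(e) for e in events]
enriched = [enrich(e, customers) for e in cleaned if e is not None]
result = [e for e in enriched if e is not None]
print(result)
```

In a real deployment the same two steps would typically be expressed as DataFrame operations in a notebook; the sketch only shows the shape of the logic.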

Azure HDInsight Kafka Azure HDInsight is a managed, full-spectrum, open-source analytics service for enterprises. HDInsight is a cloud service that makes it easy, fast, and cost-effective to process massive amounts of data. HDInsight also supports a broad range of scenarios, like extract, transform, and load (ETL); data warehousing; machine learning; and IoT. Apache Kafka is an open-source, distributed streaming platform. It's often used as a message broker, as it provides functionality similar to a publish-subscribe message queue.
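To make the publish-subscribe idea concrete, here is a toy in-memory broker. It is a sketch under simplifying assumptions (one process, no partitions, no persistence) and is not the Kafka client API; the class `InMemoryBroker`, the topic name, and the group names are invented for illustration.

```python
from collections import defaultdict


class InMemoryBroker:
    """Toy stand-in for a publish-subscribe log; not the Kafka API."""

    def __init__(self):
        self._topics = defaultdict(list)   # topic name -> append-only message log
        self._offsets = defaultdict(int)   # (group, topic) -> next index to read

    def publish(self, topic, message):
        """Append a message to the topic's log and return its offset."""
        self._topics[topic].append(message)
        return len(self._topics[topic]) - 1

    def poll(self, group, topic, max_messages=10):
        """Return unread messages for a consumer group and advance its offset."""
        start = self._offsets[(group, topic)]
        batch = self._topics[topic][start:start + max_messages]
        self._offsets[(group, topic)] = start + len(batch)
        return batch


broker = InMemoryBroker()
broker.publish("telemetry", {"device": "d1", "temp": 21.5})
broker.publish("telemetry", {"device": "d2", "temp": 19.0})

# Two independent consumer groups each see the full stream.
dashboard_batch = broker.poll("dashboard", "telemetry")
archive_batch = broker.poll("archive", "telemetry")
print(len(dashboard_batch), len(archive_batch))
```

The point of the sketch is that messages are kept in a log and each consumer group tracks its own position, so several downstream systems can read the same stream without interfering with one another.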

Azure Data Factory Data Factory is a cloud-based data integration service that orchestrates and automates the movement and transformation of data.
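Orchestration here essentially means running activities in dependency order. The sketch below illustrates that idea with Python's standard `graphlib` module (Python 3.9+); it does not use the Data Factory SDK, and the activity names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical activity names; each maps to the set of activities it waits on.
dependencies = {
    "copy_sales_to_lake": set(),
    "copy_customers_to_lake": set(),
    "transform_in_databricks": {"copy_sales_to_lake", "copy_customers_to_lake"},
    "load_warehouse": {"transform_in_databricks"},
}


def run_pipeline(dependencies, run_activity):
    """Run every activity once, only after all of its upstream activities."""
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        run_activity(name)
        order.append(name)
    return order


executed = run_pipeline(dependencies, lambda name: print("running", name))
```

A real pipeline would add scheduling, retries, and monitoring on top of this ordering, which is the part a managed service takes care of.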

Azure Data Factory integration runtime types: Azure, Self-hosted, Azure-SSIS

Storage Azure Blob storage is Microsoft's object storage solution for the cloud. Blob storage is optimized for storing massive amounts of unstructured data. Blob storage offers three types of resources: the storage account, a container in the storage account, and a blob in a container.
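The three resource types nest, and that nesting shows up directly in a blob's address. The sketch below builds the URL for the default public-cloud endpoint (`<account>.blob.core.windows.net`); the `BlobAddress` class and the account, container, and blob names are made up for illustration.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class BlobAddress:
    account: str    # storage account name
    container: str  # container inside the account
    blob: str       # blob name; may contain "/" to mimic folders

    def url(self) -> str:
        """Build the default public-cloud endpoint URL for this blob."""
        return (
            f"https://{self.account}.blob.core.windows.net/"
            f"{self.container}/{quote(self.blob)}"
        )


address = BlobAddress("mystorageaccount", "raw-logs", "2019/12/02/app log.json")
print(address.url())
```

Sovereign clouds and custom domains use different host names, so the host suffix should be treated as an assumption rather than a constant.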

Azure Databricks Azure Databricks is an Apache Spark-based analytics platform optimized for the Microsoft Azure cloud services platform. Designed with the founders of Apache Spark, Databricks is integrated with Azure to provide one-click setup, streamlined workflows, and an interactive workspace that enables collaboration between data scientists, data engineers, and business analysts.

Azure Databricks

Azure Data Warehouse Azure SQL Data Warehouse is a massively parallel processing (MPP) cloud-based, scale-out, relational database capable of processing massive volumes of data. Combines the SQL Server relational database with Azure cloud scale-out capabilities. Decouples storage from compute. Enables increasing, decreasing, pausing, or resuming compute. Integrates across the Azure platform. Utilizes SQL Server Transact-SQL (T-SQL) and tools. Complies with various legal and business security requirements such as SOC and ISO.

Architecture of SQL Data Warehouse: Control node, Compute nodes, Azure storage, Data Movement Service
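One way to picture how the compute nodes share the work: SQL Data Warehouse splits table data into 60 distributions, and those distributions are spread across however many compute nodes are currently provisioned. The sketch below is only an illustration; the hash function and the round-robin placement are assumptions, since the service's internal hashing and placement are not described in this deck.

```python
import hashlib
from collections import Counter

DISTRIBUTIONS = 60  # SQL Data Warehouse spreads table data over 60 distributions


def distribution_for(key: str) -> int:
    """Map a distribution-column value to a distribution (illustrative hash only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % DISTRIBUTIONS


def compute_node_for(distribution: int, compute_nodes: int) -> int:
    """Assign distributions to compute nodes round-robin (an assumed placement)."""
    return distribution % compute_nodes


customer_ids = [f"customer-{i}" for i in range(6000)]
per_node = Counter(
    compute_node_for(distribution_for(cid), compute_nodes=4) for cid in customer_ids
)
print(sorted(per_node.items()))
```

Under this model, scaling compute changes only the number of nodes the distributions are mapped to, while the data itself stays in Azure storage, which is what decoupling storage from compute means in practice.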

Cosmos DB Azure Cosmos DB is a globally distributed, multi-model database service. It lets you replicate your data across any number of Azure regions and scale throughput independently of storage.

Visualization Power BI is a business analytics service by Microsoft. It aims to provide interactive visualizations and business intelligence capabilities with an interface simple enough for end users to create their own reports and dashboards.

Links https://www.eckerson.com/articles/ten-characteristics-of-a-modern-data-architecture https://www.sqlchick.com/entries/2017/1/9/defining-the-components-of-a-modern-data-warehouse-a-glossary

Thank you https://www.linkedin.com/in/eugenepolonichko/ https://msolapblog.wordpress.com/ https://twitter.com/EvgenPolonichko