DATA Storage and analytics with AZURE DATA LAKE


1 DATA Storage and analytics with AZURE DATA LAKE
Glenn Morris, Technical Director | Data Scientist, @glennrm

2 Agenda
Disclaimer, Confusion, The Analytics Pipeline, Architectures & Azure Analytics, Azure Data Lake, Reminder

3 DISCLAIMER I am likely to appear to lie to you or tell you something that will change very soon after I say it!

4 The confusion: how do you pronounce “Azure”?
It takes its name from lapis lazuli. French: azur, as in Côte d’Azur. Spanish: azul.

5 The data analytics pipeline
Data does not arrive nicely formatted and ready to be consumed. Understanding the data pipeline is paramount to understanding any analytics solution. The general process: SOURCE → INGEST → PROCESS → STORAGE → DELIVERY
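The five stages above can be sketched as a chain of small functions. This is a minimal illustration only: the record shape, the device names, and every function name are assumptions made for the example, not part of the talk or of any Azure API.

```python
import json
from collections import defaultdict


def source():
    """SOURCE: yield raw records as they might arrive, partly malformed."""
    yield '{"device": "a", "temp": "21.5"}'
    yield 'not json at all'
    yield '{"device": "b", "temp": "19.0"}'
    yield '{"device": "a", "temp": "22.5"}'


def ingest(raw_records):
    """INGEST: parse what can be parsed and skip the rest."""
    for line in raw_records:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue


def process(records):
    """PROCESS: convert fields to the types the analysis needs."""
    for record in records:
        yield {"device": record["device"], "temp": float(record["temp"])}


def store(records):
    """STORAGE: group readings by device (an in-memory stand-in for a store)."""
    table = defaultdict(list)
    for record in records:
        table[record["device"]].append(record["temp"])
    return dict(table)


def deliver(table):
    """DELIVERY: expose a consumable result, here the mean reading per device."""
    return {device: sum(temps) / len(temps) for device, temps in table.items()}


def run_pipeline():
    """Run all five stages in order."""
    return deliver(store(process(ingest(source()))))
```

Calling `run_pipeline()` returns one averaged reading per device; the malformed line is dropped at the ingest stage rather than reaching storage.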

6 Data lakes: Storage and Data
Storage: infinitely scalable, fault-tolerant storage designed to handle massive volumes of data.
Data: a processing engine that can operate on data at the above scale.
“If you think of a datamart as a store of bottled water - cleansed and packaged and structured for easy consumption - the data lake is a large body of water in a more natural state. The contents of the lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples.” James Dixon, CTO, Pentaho

7 Architectures
LAMBDA
• Pipeline architecture to reduce the complexity of real-time analytics
• Constrains incremental computation to a small portion of the architecture
• Hot (mutable) or cold (immutable) path for data flow
KAPPA
• Developed to simplify the Lambda architecture
• Eliminates the cold path
• Makes all processing happen in near-real-time streaming mode
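The difference between the two architectures can be sketched with a simple per-key event count. This is a toy model under stated assumptions: the event shape (`{"key": ...}`) and all names are hypothetical, and a real deployment would use a batch engine and a stream processor rather than in-memory counters.

```python
from collections import Counter


def batch_view(master_dataset):
    """Lambda cold path: recompute the view from the immutable master dataset."""
    return Counter(event["key"] for event in master_dataset)


class SpeedLayer:
    """Lambda hot path: a mutable view, updated incrementally for recent events."""

    def __init__(self):
        self.view = Counter()

    def update(self, event):
        self.view[event["key"]] += 1


def lambda_query(batch, speed, key):
    """Lambda serving step: merge the cold-path and hot-path views."""
    return batch[key] + speed.view[key]


def kappa_view(log):
    """Kappa: a single streaming code path; reprocessing means replaying the log."""
    view = Counter()
    for event in log:
        view[event["key"]] += 1
    return view
```

In the Lambda sketch the incremental logic lives only in `SpeedLayer`, while the Kappa sketch has no batch function at all: the same loop serves both live processing and reprocessing.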

8 AZURE ANALYTICS

9 Microservices
A microservice is a software building block that does one thing and does it well. It can be provisioned on demand, scaled elastically, provides fault tolerance and failover, and can be de-provisioned when it is no longer needed.
Autonomous and Isolated
• Autonomous: Existing or capable of existing independently; responding, reacting, or developing independently of the whole.
• Isolated: Separate from others, happening in different places and at different times.

10 AZURE DATA FACTORY Data Factory is a cloud-based data integration service that orchestrates and automates the movement and transformation of data. Just like a manufacturing factory that runs equipment to take raw materials and transform them into finished goods, Data Factory orchestrates existing services that collect raw data and transform it into ready-to-use information.

11 AZURE data lake store Azure Data Lake Store is an enterprise-wide hyper-scale repository for big data analytic workloads. Azure Data Lake enables you to capture data of any size, type, and ingestion speed in one single place for operational and exploratory analytics.

12 AZURE data lake Azure Data Lake includes all the capabilities required to make it easy for developers, data scientists, and analysts to store data of any size, shape and speed, and do all types of processing and analytics across platforms and languages. It removes the complexities of ingesting and storing all of your data while making it faster to get up and running with batch, streaming, and interactive analytics. Azure Data Lake works with existing IT investments for identity, management, and security for simplified data management and governance.

13 AZURE data lake analytics
Azure Data Lake Analytics is a new service, built to make big data analytics easy. This service lets you focus on writing, running and managing jobs, rather than operating distributed infrastructure. Instead of deploying, configuring and tuning hardware, you write queries to transform your data and extract valuable insights. 

14 Reminder metres Thank You

15 A Big Thanks to our Sponsors

