Hadoop on Azure 101: What is the Big Deal? Dennis Mulder, Solution Architect, Microsoft Corporation

Presentation transcript:

Hadoop on Azure 101: What is the Big Deal? Dennis Mulder, Solution Architect, Microsoft Corporation

Windows Azure Center of Excellence
Spotlight: Pilots, Assessment, Architecture and Design Guidance, Modern Apps, Global Scale, Design Sessions
Global Services Team: 10 Senior Cloud Architects (US, EMEA, APAC), 8 Pilots, Cloud Apps Champs, Services
Contact: Dennis Mulder, Solution Architect
Engage: Design, Assess, Pilots

Four megatrends will dominate the next decade: Social, Mobility, Big data, Cloud
85 BILLION mobile apps will be downloaded in 2012
91% of organizations expect to spend on mobile devices in […]
[…]/2 of companies expect to use internal social network apps in […]
[…] zettabytes in 2012
>80% of new apps in 2012 will be distributed/deployed on clouds
32% of businesses are likely to invest in BI and analytics in 2012
The strategic focus in the cloud will shift in 2012 from infrastructure to application platforms
In 2012, mobile devices will outship PCs by more than 2:1 and generate more revenue than PCs for the first time
Social networking will follow not just people but also appliances, devices and products
34% of CIOs say technology as a service (cloud) will have the most profound effect on the CIO role in the future
2/3 of mobile apps developed in 2012 will integrate with analytics offerings
49% of CIOs rank BI as the top project priority for 2012

Microsoft is embracing these megatrends: Social, Mobility, Big data, Cloud

Rethinking and evolving business strategies: How will technology megatrends enable you to save money, drive innovation, grow your business, and attract and retain customers? Social, Big data, Mobility, Cloud

Why Big Data?

Figure: data volume against velocity, variety and variability, with storage cost per GB
Volume scale: Gigabytes (10^9), Terabytes (10^12), Petabytes (10^15), Exabytes (10^18)
ERP / CRM: Sales Pipeline, Payables, Payroll, Inventory, Contacts, Deal Tracking
WEB 2.0: Mobile, Advertising, Collaboration, eCommerce, Digital Marketing, Search Marketing, Web Logs, Recommendations
Internet of things: Audio / Video, Log Files, Text/Image, Social Sentiment, Data Market Feeds, eGov Feeds, Weather, Wikis / Blogs, Click Stream, Sensors / RFID / Devices, Spatial & GPS Coordinates

Example Scenarios

Diagram: Logs → ETL → Data Warehouse. Only Some Data reaches the warehouse; the Excess Data does not.

Diagram: Logs → Raw Data “Store it All” Cluster → Data Warehouse

Understanding the Basics: Move the Compute to the Data

Hadoop Distributed Architecture

Server Files Server

RUNTIME Code
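
The "move the compute to the data" idea can be sketched in a few lines of Python. This is only an illustration, not how Hadoop's scheduler is actually implemented: the block names, the replica placement, and the `assign_tasks` helper are all hypothetical. It assumes each block of a file is stored on some nodes and that a task should run on a node that already holds its block.

```python
from collections import defaultdict

# Hypothetical block placement: block id -> nodes that hold a replica.
BLOCK_LOCATIONS = {
    "sales-block-0": ["DataNode1", "DataNode2"],
    "sales-block-1": ["DataNode2", "DataNode3"],
    "sales-block-2": ["DataNode3", "DataNode1"],
}


def assign_tasks(block_locations):
    """Assign each block's task to the least-loaded node that stores the block."""
    load = defaultdict(int)
    assignment = {}
    for block, nodes in sorted(block_locations.items()):
        # Choose only among replica holders, so the code travels to the data
        # and the block itself never crosses the network.
        # min() keeps the first-listed node when loads are tied.
        target = min(nodes, key=lambda node: load[node])
        assignment[block] = target
        load[target] += 1
    return assignment


if __name__ == "__main__":
    for block, node in assign_tasks(BLOCK_LOCATIONS).items():
        print(f"{block}: run map task on {node}")
```

The point of the sketch is the constraint in `min(nodes, ...)`: candidates are restricted to nodes that already store the block, which is the opposite of pulling files to a central server.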

MapReduce – Workflow

Map tasks: Blocks of the Sales file in HDFS are held on DataNode1, DataNode2 and DataNode3. Each Mapper reads (custId, zipCode, amount) records from its block and applies the Group By, producing one output bucket per reduce task.

Reduce tasks: Mapper output is shuffled to the Reducers. Each Reducer sorts its records by key (the example keys are 53705 and 10025) and applies SUM to each group. Done!
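
The map, shuffle, sort and sum steps above can be imitated in a single Python process. This is a minimal sketch, not Hadoop code: the record values are invented for illustration (only the two zip codes appear in the slides), grouping by zipCode is an assumption based on the keys shown in the reduce diagram, and `map_task`, `shuffle`, `reduce_task` and `run_job` are hypothetical names.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical (custId, zipCode, amount) records, split into blocks the way
# a file is split across DataNodes. All amounts are invented.
BLOCKS = [
    [(1, "53705", 10), (2, "10025", 20)],
    [(3, "53705", 30), (4, "10025", 40)],
    [(5, "53705", 50)],
]


def map_task(block):
    """Emit a (zipCode, amount) pair for every record in one block."""
    return [(zip_code, amount) for _cust_id, zip_code, amount in block]


def shuffle(mapped_outputs, num_reducers):
    """Route every pair to a reducer bucket so that all values of a key meet."""
    buckets = [[] for _ in range(num_reducers)]
    for output in mapped_outputs:
        for zip_code, amount in output:
            # Simple deterministic partitioner (an assumption of this sketch).
            buckets[int(zip_code) % num_reducers].append((zip_code, amount))
    return buckets


def reduce_task(bucket):
    """Sort one bucket by key, then sum the amounts of each key."""
    ordered = sorted(bucket, key=itemgetter(0))
    return {
        zip_code: sum(amount for _key, amount in pairs)
        for zip_code, pairs in groupby(ordered, key=itemgetter(0))
    }


def run_job(blocks, num_reducers=2):
    """Run map, shuffle and reduce in sequence and merge the reducer outputs."""
    mapped = [map_task(block) for block in blocks]
    totals = {}
    for bucket in shuffle(mapped, num_reducers):
        totals.update(reduce_task(bucket))
    return totals


if __name__ == "__main__":
    print(run_job(BLOCKS))
```

In a real cluster each `map_task` would run on the node holding its block and each `reduce_task` on a separate reducer; here they run one after another so the data flow stays visible.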


HDInsight

HDInsight architecture diagram
DFS and Compute Cluster: Front end, Name Node, Data Nodes (1 Data Node per Worker Role), HDFS API
Azure Storage (ASV): Front end, Partition Layer, Stream Layer
Azure Blob Storage

HDInsight / Hadoop Eco-System: Distributed Storage (HDFS), Distributed Processing (MapReduce), Query (Hive)
Legend: Red = Core Hadoop; Blue = Data processing; Purple = Microsoft integration points and value adds; Orange = Data Movement; Green = Packages

Hive, Pig, Mahout, Cascading, Scalding, Scoobi, Pegasus…
C#, F# Map/Reduce, LINQ to Hive, .NET management clients
JavaScript Map/Reduce, Browser hosted console, Node.js management clients
PowerShell, Cross Platform CLI tools

Traditional RDBMS vs. MapReduce, compared on: Data Size, Access, Updates, Structure, Integrity, Scaling, DBA Ratio

Demo: Deploying and Interacting With HDInsight Service

Nuget: Hadoop SDK:


© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.