Hadoop on Azure 101: What is the Big Deal? Dennis Mulder, Solution Architect, Microsoft Corporation


1 Hadoop on Azure 101: What is the Big Deal? Dennis Mulder, Solution Architect, Microsoft Corporation

2 Windows Azure Center of Excellence
Spotlight Pilots; Assessment; Architecture and Design Guidance; Modern Apps; Global Scale; Design Sessions
Global Services Team: 10 Senior Cloud Architects; US, EMEA, APAC; 8 Pilots; Cloud Apps Champs; Services
Engage: Design, Assess, Contact, Pilots
Dennis Mulder, Solution Architect, dmulder@microsoft.com

3

4 Four megatrends will dominate the next decade: Social, Mobility, Big data, Cloud
- 85 BILLION mobile apps will be downloaded in 2012
- 91% of organizations expect to spend on mobile devices in 2012
- In 2012, mobile devices will outship PCs by more than 2:1 and generate more revenue than PCs for the first time
- 1/2 of companies expect to use internal social network apps in 2012
- Social networking will follow not just people but also appliances, devices and products
- 2.7 zettabytes in 2012
- 32% of businesses are likely to invest in BI and analytics in 2012
- 49% of CIOs rank BI as the top project priority for 2012
- 2/3 of mobile apps developed in 2012 will integrate with analytics offerings
- >80% of new apps in 2012 will be distributed/deployed on clouds
- The strategic focus in the cloud will shift in 2012 from infrastructure to application platforms
- 34% of CIOs say technology as a service (cloud) will have the most profound effect on the CIO role in the future

5 Microsoft is embracing these megatrends: Social, Mobility, Big data, Cloud (the slide repeats the statistics from slide 4)

6 Rethinking and evolving business strategies: How will the technology megatrends (Social, Mobility, Big data, Cloud) enable you to save money, drive innovation, grow your business, and attract and retain customers?

7 Why Big Data?

8

9

10

11 Volume against Velocity - Variety - Variability
Volume scale: Gigabytes (10^9), Terabytes (10^12), Petabytes (10^15), Exabytes (10^18)
ERP / CRM: Sales Pipeline, Payables, Payroll, Inventory, Contacts, Deal Tracking
WEB 2.0: Mobile, Advertising, Collaboration, eCommerce, Digital Marketing, Search Marketing, Web Logs, Recommendations
Internet of things: Audio / Video, Log Files, Text/Image, Social Sentiment, Data Market Feeds, eGov Feeds, Weather, Wikis / Blogs, Click Stream, Sensors / RFID / Devices, Spatial & GPS Coordinates
Storage/GB: 1980 $190,000; 1990 $9,000; 2000 $15; 2010 $0.07
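To put the Storage/GB figures from this slide in proportion, the short sketch below works out what one terabyte would have cost at each listed price and how far the price per gigabyte fell. The prices are the slide's own; the function names and the choice of a decimal terabyte (1,000 GB) are assumptions made for illustration.

```python
# Storage cost per gigabyte in US dollars, as listed on the slide.
COST_PER_GB = {1980: 190_000.00, 1990: 9_000.00, 2000: 15.00, 2010: 0.07}


def cost_to_store(gigabytes: float, year: int) -> float:
    """Return the dollar cost of storing `gigabytes` at that year's price."""
    return gigabytes * COST_PER_GB[year]


def price_drop_factor(start_year: int, end_year: int) -> float:
    """Return how many times cheaper one gigabyte became between two years."""
    return COST_PER_GB[start_year] / COST_PER_GB[end_year]


if __name__ == "__main__":
    terabyte_in_gb = 1_000  # assumption: decimal terabyte (10^12 bytes)
    for year in sorted(COST_PER_GB):
        print(year, f"${cost_to_store(terabyte_in_gb, year):,.2f}")
    print(f"1980 to 2010: about {price_drop_factor(1980, 2010):,.0f}x cheaper per GB")
```

At these prices a terabyte goes from $190 million in 1980 to roughly $70 in 2010, which is the economic background to keeping raw data rather than discarding it.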

12 Example Scenarios

13

14 Excess Data Logs ETL Some Data Data Warehouse

15 Logs, Raw Data, “Store it All” Cluster, Data Warehouse

16 Understanding the Basics: Move the Compute to the Data
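One way to picture "move the compute to the data" is a scheduler that only ever places a task on a machine that already stores the block the task will read. The sketch below is a toy illustration of that idea, not Hadoop's actual scheduler; the node names, block names and the `schedule_local` function are all hypothetical.

```python
from typing import Dict, List


def schedule_local(block_locations: Dict[str, List[str]]) -> Dict[str, str]:
    """Assign each block's task to the least-loaded node that stores the block."""
    load: Dict[str, int] = {}
    assignment: Dict[str, str] = {}
    for block in sorted(block_locations):
        replicas = block_locations[block]
        # Only nodes holding a replica are candidates, so no block has to be
        # copied over the network. Ties go to the alphabetically first node.
        node = min(sorted(replicas), key=lambda n: load.get(n, 0))
        assignment[block] = node
        load[node] = load.get(node, 0) + 1
    return assignment


if __name__ == "__main__":
    # Hypothetical cluster: each block is stored on two of three nodes.
    locations = {
        "block-0": ["node-a", "node-b"],
        "block-1": ["node-a", "node-c"],
        "block-2": ["node-b", "node-c"],
    }
    for block, node in schedule_local(locations).items():
        print(block, "runs on", node)
```

The design point is that the code shipped to a node is small, while the data block it reads can be large, so sending the code is the cheaper direction.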

17 Hadoop Distributed Architecture

18 Server Files Server

19 RUNTIME Code

20 MapReduce – Workflow

21 Map tasks
Blocks of the Sales file in HDFS, spread across DataNode1, DataNode2 and DataNode3. Each record is (custId, zipCode, amount):
Block: (4, 54235, $75) (7, 10025, $60) (2, 53705, $30) (1, 02115, $15)
Block: (5, 53705, $65) (0, 54235, $22) (5, 53705, $15) (6, 44313, $10)
Block: (3, 10025, $95) (8, 44313, $55) (9, 02115, $15) (6, 44313, $25)
Each Mapper reads a block, groups by zipCode and emits (zipCode, amount) pairs, for example 10025 $60 and 10025 $95.
One output bucket per reduce task
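The map step on this slide can be sketched in a few lines: drop the custId, keep (zipCode, amount), and place each pair in the bucket of the reduce task responsible for that zipCode. The four records are taken from the slide; the CRC32-based `partition` function is an assumed stand-in for a hash partitioner, so the bucket a given zipCode lands in here need not match the slide's drawing.

```python
import zlib
from typing import Dict, List, Tuple

Record = Tuple[int, str, int]  # (custId, zipCode, amount in dollars)
Pair = Tuple[str, int]         # (zipCode, amount)

# One block of the Sales file, using four records shown on the slide.
BLOCK: List[Record] = [
    (4, "54235", 75),
    (7, "10025", 60),
    (2, "53705", 30),
    (1, "02115", 15),
]


def map_record(record: Record) -> Pair:
    """Keep only the grouping key (zipCode) and the value to be summed."""
    _cust_id, zip_code, amount = record
    return (zip_code, amount)


def partition(key: str, num_reducers: int) -> int:
    """Pick a bucket from a stable hash of the key (assumed partitioner)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def run_map_task(block: List[Record], num_reducers: int) -> Dict[int, List[Pair]]:
    """Map every record in the block and fill one bucket per reduce task."""
    buckets: Dict[int, List[Pair]] = {r: [] for r in range(num_reducers)}
    for record in block:
        key, value = map_record(record)
        buckets[partition(key, num_reducers)].append((key, value))
    return buckets


if __name__ == "__main__":
    for bucket, pairs in run_map_task(BLOCK, num_reducers=2).items():
        print("bucket", bucket, pairs)
```

Because the bucket depends only on the key, every Mapper sends a given zipCode to the same reduce task, which is what lets the later SUM see all amounts for that zipCode.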

22 Reduce tasks
Shuffle: the Mapper output buckets are sent to the Reducers, so that all pairs for a zipCode arrive at the same Reducer.
One Reducer receives: 53705 $65, 53705 $30, 53705 $15, 44313 $10, 44313 $25, 10025 $95, 44313 $55, 10025 $60
The other Reducer receives: 54235 $75, 54235 $22, 02115 $15, 02115 $15
Sort (first Reducer): 53705 $65, 53705 $30, 53705 $15, 44313 $10, 44313 $25, 44313 $55, 10025 $95, 10025 $60
Sort (second Reducer): 54235 $75, 54235 $22, 02115 $15, 02115 $15
SUM (first Reducer): 10025 $155, 44313 $90, 53705 $110
SUM (second Reducer): 54235 $97, 02115 $30
Done!
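The whole map, shuffle, sort and SUM walk-through can be reproduced in memory with the twelve sales records from the slides. This is a single-process sketch of the data flow only, with illustrative function names; it does not use Hadoop or model multiple Reducers.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Tuple[int, str, int]  # (custId, zipCode, amount in dollars)

# The twelve records of the Sales file as shown on the slides.
SALES: List[Record] = [
    (4, "54235", 75), (7, "10025", 60), (2, "53705", 30), (1, "02115", 15),
    (5, "53705", 65), (0, "54235", 22), (5, "53705", 15), (6, "44313", 10),
    (3, "10025", 95), (8, "44313", 55), (9, "02115", 15), (6, "44313", 25),
]


def map_phase(records: List[Record]) -> List[Tuple[str, int]]:
    """Emit one (zipCode, amount) pair per sales record."""
    return [(zip_code, amount) for _cust_id, zip_code, amount in records]


def shuffle_and_sort(pairs: List[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group amounts by zipCode and order the groups by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for zip_code, amount in pairs:
        grouped[zip_code].append(amount)
    return {zip_code: grouped[zip_code] for zip_code in sorted(grouped)}


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the amounts collected for each zipCode."""
    return {zip_code: sum(amounts) for zip_code, amounts in grouped.items()}


def total_sales_by_zip(records: List[Record]) -> Dict[str, int]:
    """Run map, shuffle/sort and reduce in sequence."""
    return reduce_phase(shuffle_and_sort(map_phase(records)))


if __name__ == "__main__":
    for zip_code, total in total_sales_by_zip(SALES).items():
        print(zip_code, f"${total}")
```

The totals agree with the SUM step on the slide: 02115 $30, 10025 $155, 44313 $90, 53705 $110 and 54235 $97.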

23 MapReduce – Workflow

24 HDInsight

25 HDFS API: Name Node, Data Node; DFS (1 Data Node per Worker Role) and Compute Cluster
Azure Storage (ASV) … Azure Blob Storage: Front end, Partition Layer, Stream Layer

26

27 HDInsight / Hadoop Eco-System: Distributed Storage (HDFS), Distributed Processing (MapReduce), Query (Hive)
Legend: Red = Core Hadoop; Blue = Data processing; Purple = Microsoft integration points and value adds; Orange = Data Movement; Green = Packages

28 Hive, Pig, Mahout, Cascading, Scalding, Scoobi, Pegasus…
C#, F# Map/Reduce, LINQ to Hive, .NET management clients
JavaScript Map/Reduce, Browser hosted console, Node.js management clients
PowerShell, Cross Platform CLI tools

29

30 Traditional RDBMS vs. MapReduce, compared on: Data Size, Access, Updates, Structure, Integrity, Scaling, DBA Ratio

31

32 Deploying and Interacting With HDInsight Service (demo)

33

34

35

36 http://www.windowsazure.com/
http://hadoop.apache.org/
Nuget: http://nuget.org/packages?q=hadoop
Hadoop SDK: http://hadoopsdk.codeplex.com

37 Windows Azure Center of Excellence
Spotlight Pilots; Assessment; Architecture and Design Guidance; Modern Apps; Global Scale; Design Sessions
Global Services Team: 10 Senior Cloud Architects; US, EMEA, APAC; 8 Pilots; Cloud Apps Champs; Services
Engage: Design, Assess, Contact, Pilots
Dennis Mulder, Solution Architect, dmulder@microsoft.com

38 © 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

