Tech Evangelist, Microsoft Responsible for Azure Evangelism in Denmark Economics and Statistics background (Aarhus University) Blog:

2 Tech Evangelist, Microsoft. Responsible for Azure evangelism in Denmark. Economics and Statistics background (Aarhus University). Blog: https://sebastianbrandes.com

3 What is Big Data (according to Microsoft)?
Hadoop and the Hadoop ecosystem: some stats
Introduction to Microsoft Azure and HDInsight
Provisioning a Hadoop cluster in Azure
Installing R on the cluster
Running MapReduce jobs using R
Azure Machine Learning + R
Wrapping up

4 $100 gets you 3 million times more storage in 30 years.
Chart labels: MIPS/$, M MIPS/$; >5.5 billion (70+% of global population); >2 billion users; web traffic: exabyte (10^18 bytes), zettabyte (10^21 bytes); >10 billion.
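To put the storage figure in perspective, the short calculation below converts "3 million times in 30 years" into a yearly growth factor and a doubling time. It is a sketch that assumes the growth was constant from year to year, which the slide does not claim; the variable names are illustrative only.

```python
import math

# Figures from the slide: 3 million times more storage per $100 over 30 years.
growth_factor = 3_000_000
years = 30

# Assumption: growth compounds at a constant yearly multiplier.
annual_factor = growth_factor ** (1 / years)

# Time needed to double storage per dollar at that constant rate.
doubling_time_years = math.log(2) / math.log(annual_factor)

print(f"annual factor: {annual_factor:.3f}")
print(f"doubling time: {doubling_time_years:.2f} years")
```

Under that assumption, storage per dollar grows by roughly 1.64x per year, which corresponds to doubling about every 1.4 years.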

5 “Big data is a collection of data sets so large and complex that it becomes awkward to work with using on-hand database management tools. Difficulties include capture, storage, search, sharing, analysis, and visualization.” – Wikipedia

6 VOLUME (Size), VARIETY (Structure), VELOCITY (Speed)

7 Axes: Velocity - Variety, Volume. Scale: Gigabytes (10^9), Terabytes (10^12), Petabytes (10^15), Exabytes (10^18).
ERP / CRM: Sales Pipeline, Payables, Payroll, Inventory, Contacts, Deal Tracking.
Modern Web: Mobile, Advertising, Collaboration, eCommerce, Digital Marketing, Search Marketing, Web Logs, Recommendations.
Internet of Things: Audio / Video, Log Files, Text/Image, Social Sentiment, Data Market Feeds, eGov Feeds, Weather, Wikis / Blogs, Click Stream, Sensors / RFID / Devices, Spatial & GPS Coordinates.

8 How do I optimize my services based on patterns of weather, traffic, etc.? What’s the social sentiment of my product? How do I better predict future outcomes?

10 vs.

17 As of March 2015

20 “The Large Hadron Collider (LHC) is the world's largest and most powerful particle collider, and the largest single machine in the world, built by the European Organization for Nuclear Research (CERN) from 1998 to [...]” – Wikipedia

31 Client(s)
Masters: Job Tracker, Primary Name Node, Secondary Name Node
Slaves: Task Tracker + Data Node, Task Tracker + Data Node, Task Tracker + Data Node, ...
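The roles on this slide can be illustrated with a toy model: a Name Node that records which slaves hold each block, and a Job Tracker that hands each map task to a Task Tracker on a slave already storing that block. This is a simplified sketch, not Hadoop's actual scheduler; the class names, node names, round-robin placement, and replication factor of 2 are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical cluster: three slaves, each running a Data Node and a Task Tracker.
SLAVES = ["slave-1", "slave-2", "slave-3"]


class NameNode:
    """Keeps only metadata: which slaves hold a replica of each block."""

    def __init__(self, slaves, replication=2):
        self.slaves = slaves
        self.replication = replication
        self.block_locations = {}

    def add_block(self, block_id):
        # Round-robin placement, a simplification of real replica placement.
        start = len(self.block_locations) % len(self.slaves)
        replicas = [self.slaves[(start + i) % len(self.slaves)]
                    for i in range(self.replication)]
        self.block_locations[block_id] = replicas
        return replicas


class JobTracker:
    """Assigns one map task per block to a slave that stores the block."""

    def __init__(self, name_node):
        self.name_node = name_node

    def schedule(self, block_ids):
        load = defaultdict(int)
        assignment = {}
        for block_id in block_ids:
            replicas = self.name_node.block_locations[block_id]
            # Pick the least-loaded slave among those holding the block locally.
            chosen = min(replicas, key=lambda s: load[s])
            load[chosen] += 1
            assignment[block_id] = chosen
        return assignment


name_node = NameNode(SLAVES)
blocks = [f"block-{i}" for i in range(6)]
for b in blocks:
    name_node.add_block(b)

assignment = JobTracker(name_node).schedule(blocks)
for block_id, slave in assignment.items():
    print(block_id, "->", slave)
```

Every task in this sketch runs on a node that already has the data, so no block needs to cross the network before the map step starts.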

46 5 weeks ago!

49 Integration between the R statistical package and Hadoop’s Distributed File System and MapReduce computation engine:
Moves algorithm execution closer to the data
Provides access to lots of high-quality statistical libraries
Speeds work by processing in parallel
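The map, shuffle, and reduce steps behind "processing in parallel" can be sketched with a small word count. The example is written in Python rather than R and does not use the R integration the talk covers; the input splits and function names are hypothetical, and a thread pool stands in for the cluster's parallel tasks.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical input splits; on a cluster these would be HDFS blocks.
splits = [
    "big data is big",
    "hadoop stores big data",
    "r analyses data",
]


def map_phase(split):
    """Map: emit a (word, 1) pair for every word in one split."""
    return [(word, 1) for word in split.split()]


def shuffle(mapped):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(item):
    """Reduce: sum the values collected for one key."""
    key, values = item
    return key, sum(values)


# Threads stand in for the cluster's parallel map and reduce tasks.
with ThreadPoolExecutor(max_workers=3) as pool:
    mapped = list(pool.map(map_phase, splits))
    counts = dict(pool.map(reduce_phase, shuffle(mapped).items()))

print(counts)
```

Each split is mapped independently, which is what lets a real cluster run the map step on whichever node holds the data.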

58 Sebastian Brandes

