99s_First_Production_Server.jpg CC-BY-2.0 1996: 10x 4Gb Hard Drives 2000: 5000 Linux PCs Today:

Presentation transcript:

1

2

3 http://commons.wikimedia.org/wiki/File:Google%E2%80%99s_First_Production_Server.jpg (CC-BY-2.0). 1996: 10x 4 GB hard drives. 2000: 5000 Linux PCs. Today: > 2 billion servers (estimated). “I don't think the web would exist without open source and Linux. So there would have been no Google.” (Chris DiBona, Google)

4

5 www.revolutionanalytics.com/what-is-r

6 New York Times, June 25 2009 (3 hours after Michael Jackson’s death)

7

8

9 R ≃ Stata across all fields. In economics, Stata dominates (not shown).

10 R Language Popularity. Rexer Data Miner Survey. IEEE Spectrum Top Programming Languages (IEEE Spectrum, July 2014): R is #9.

11 BIG DATA

12 HINDSIGHT / INSIGHT / FORESIGHT. What happened? Why did it happen? What will happen? How can we make it happen? Traditional BI / Advanced Analytics. INFORMATION / OPTIMIZATION.

13 Drew Conway, http://www.dataists.com/2010/09/the-data-science-venn-diagram/. Data Integration Mashups Applications Models Visualization Predictions Uncertainty Problems Data Sources Credibility Effective Data Applications

14

15

16 ETL. Marketing channel data, behavioral variables, promotional data, overlay data. Exploratory data analysis. Time-to-event models. GAM survival models. Scoring for inference. Scoring for prediction. 5 billion scores per day per retailer. Custom data format. Custom variables (PMML).
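To put the "5 billion scores per day per retailer" figure from slide 16 in perspective, the short Python sketch below converts it to a sustained per-second rate. It assumes the load is spread evenly over 24 hours, which the slide does not state; the function name `scores_per_second` is hypothetical.

```python
# Assumption: scoring load is spread evenly across a 24-hour day.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def scores_per_second(scores_per_day: float) -> float:
    """Convert a daily scoring volume into an average per-second rate."""
    return scores_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    daily = 5_000_000_000  # figure quoted on the slide
    print(f"{scores_per_second(daily):,.0f} scores per second on average")
```

Under that even-load assumption the average works out to roughly 58 thousand scores per second; real peak rates would be higher if scoring is concentrated in part of the day.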

17

18 Exposing the expertise of data scientists as APIs. Bringing the utility of data science to applications. Addressing the data science talent gap.

19 Azure: Huge infrastructure scale. 19 regions online: huge datacenter capacity around the world, and we're growing. 100+ datacenters. One of the top 3 networks in the world (coverage, speed, connections). 2x AWS and 6x Google number of offered regions. G Series: largest VM available in the market (32 cores, 448 GB RAM, SSD…). Regions (legend: Operational, Announced): Central US (Iowa), West US (California), North Europe (Ireland), East US (Virginia), East US 2 (Virginia), US Gov Virginia, North Central US (Illinois), US Gov Iowa, South Central US (Texas), Brazil South (Sao Paulo), West Europe (Netherlands), China North* (Beijing), China South* (Shanghai), Japan East (Saitama), Japan West (Osaka), India West (TBD), India East (TBD), East Asia (Hong Kong), SE Asia (Singapore), Australia West (Melbourne), Australia East (Sydney). * Operated by 21Vianet.

20

21 SQL Server 2016: Built-in in-database analytics. Data Scientist: interact directly with data. Built-in to SQL Server. Data Developer/DBA: manage data and analytics together. Example solutions: fraud detection, sales forecasting, warehouse efficiency, predictive maintenance. Diagram labels: Relational Data, Analytic Library, T-SQL Interface, Extensibility, R Integration, Microsoft Azure Machine Learning Marketplace, New R scripts.

22 Chart (axes: rows, minutes): R on a server pulling data via SQL; R on a server invoking RRE ScaleR inside the EDW.
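The contrast on slide 22, between pulling rows out of the database to an analysis process and running the computation where the data lives, can be sketched roughly as follows. This is only an illustration in Python with the standard-library `sqlite3` module standing in for the data warehouse; it is not RRE ScaleR, and the `sales` table and all function names are hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory table (hypothetical data, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 10.0), ("east", 30.0), ("west", 20.0)],
    )
    return conn


def mean_by_pulling_rows(conn: sqlite3.Connection) -> dict:
    """Fetch every row to the client, then aggregate in the client process."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        running_sum, count = totals.get(region, (0.0, 0))
        totals[region] = (running_sum + amount, count + 1)
    return {region: s / n for region, (s, n) in totals.items()}


def mean_in_database(conn: sqlite3.Connection) -> dict:
    """Let the database aggregate; only one row per region crosses the wire."""
    query = "SELECT region, AVG(amount) FROM sales GROUP BY region"
    return dict(conn.execute(query))


if __name__ == "__main__":
    conn = build_demo_db()
    print(mean_by_pulling_rows(conn))
    print(mean_in_database(conn))
```

Both functions return the same per-region means; the difference is how much data leaves the database. With a handful of rows that is irrelevant, but the amount transferred by the first approach grows with the row count, while the second returns one row per group.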

23

