Machine Learning & Data Science Conference

1 Machine Learning & Data Science Conference
9/28/2017 8:26 PM © 2015 Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

2 Data Science doesn’t just happen… It takes a Process
Dr. Jacob Spoelstra, Dr. Hang Zhang, Gopi Kumar. Data Science, Microsoft

3 Three major trends are converging…
…ushering in the Fourth Industrial Revolution: Big Data & IoT, Cloud, Intelligence

4 The opportunity is bigger than you may think
$1.6T data dividend available to businesses that embrace data over the next four years
How? More people, new analytics, diverse data @ speed
Data source: Microsoft and IDC, April 2014

5 Traditional Applications, Intelligent Applications
Microsoft Build 2016
Traditional Applications: Program logic based on static rules. Actions are pre-defined. Limited adaptation to change in patterns of data.
Intelligent Applications: Learns from data, makes predictions and improves with experience. Leverages Cloud-Hosted APIs. Examples: Recommendations, Predictive Maintenance, Churn detection, CRM: Lead scoring.

6 Sobering statistics
“Only 27% of the big data projects are regarded as successful”
“Only 8% of the big data projects are regarded as VERY successful”
Only 13% of organizations have achieved full-scale production for their Big Data implementations
Source: CapGemini 2014
“Only 17% of survey respondents said they had a well-developed Predictive/Prescriptive Analytics program in place, while 80% said they planned on implementing such a program within five years” – Dataversity 2015 Survey

7

8 Challenges

9
Courtesy: Steve Geringer. Source:

10 How to Do Data Science Right
Coordination, collaboration, and knowledge sharing
Data relevancy to problem
Scaling, performance
Be versatile on various technologies and algorithms
Security should be of high priority
Agility in operationalization

11 Addressing the Challenges at Microsoft

12 What Problems are we Solving?
Quality: Consistent, repeatable steps. Required artifacts.
Organization: Keeping track of artifacts in distributed cloud environments.
Collaboration: Across teams, resources.
Knowledge Accumulation: Effective sharing, preventing reinvention of wheels.
Agility: Get going fast, execute efficiently.
Global teams: Five geographic locations.
New teams: Different backgrounds.
Varied clients: Both external and internal.

13 What is a Process?
A process specifies a detailed sequence of activities necessary to perform specific business tasks. It is used to standardize procedures and establish best practices.

14 Evolution of Software Development Process

15 Data Science != Software Engineering
But we can learn a lot, especially on processes

16 Who uses a Process for Data Science?
Source: KDnuggets, October 2014

17 Methodologies
Cross-Industry Standard Process for Data Mining (CRISP-DM)
Knowledge Discovery in Databases (KDD)

18 Pillars of Data Science Process
Standard Project Lifecycle
Standardized Document Templates, Project Structure
Shared, Distributed Resources
Productivity Tools, Shared Utilities

19

20 Data Science Project Lifecycle
Business Understanding: Scoping, Charter
Data Acquisition, Understanding: Provision Resources, Ingest, Exploration
Modeling: Feature Engineering, Model Development, Evaluation
Deployment: Model Production Pipeline, Monitor
Acceptance: Finalize Documents, Free Resources, Handoff
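The lifecycle stages named on this slide can be written down as an ordered sequence. The sketch below is only an illustration: the stage names come from the slide, but the `Stage` enum and the `next_stage` helper are hypothetical, and the strictly linear walk is a simplifying assumption.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    """Lifecycle stages in slide order (the enum itself is illustrative)."""
    BUSINESS_UNDERSTANDING = 1
    DATA_ACQUISITION_UNDERSTANDING = 2
    MODELING = 3
    DEPLOYMENT = 4
    ACCEPTANCE = 5


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None after Acceptance.

    Assumption: stages are visited strictly in order, with no loops back.
    """
    if current is Stage.ACCEPTANCE:
        return None
    return Stage(current.value + 1)
```

Calling `next_stage(Stage.MODELING)` yields `Stage.DEPLOYMENT`; a team that revisits earlier stages would need a richer transition table than this.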

21

22 Standardized Artifacts, Shared Productivity Utilities
Templates, Utilities

23 Shared and Distributed Infrastructure

24 Documented Practices – Measure and Refine
[Chart: efficiency across successive projects, Project 1 through Project 6]

25 Who is this for?
Data Science team, Manager, Individual Contributor
Organization: code, artifacts
Standardization
Versioning
Knowledge accretion
Security: role-based access control
Productivity: utilities, sharing, templates
Collaboration: distributed compute, no contending for resources

26 Team Data Science Process at Microsoft
Data science virtual machines (DSVMs) as the development platform in the cloud
Visual Studio Team Services (VSTS) for work item tracking and scrum planning
Git repositories for source code
Utilities shared in a Git repository to make the different steps of a data science process more efficient
Cloud-based Azure resources, used as needed

27 Data science virtual machines (DSVM) on Azure
A DSVM (Windows or Linux) comes with useful data science tools installed by default, which saves setup time
A DSVM can be a shared resource for the team
A DSVM can be a central hub for other Azure resources

28 Visual Studio Team Services
Visual Studio Team Services (VSTS) is a cloud-based solution for:
Work item management and sprint planning, and
Source code management through Git repositories
Project repository: a standard directory structure and a set of standardized documents
Each data science project has its own project repository to ensure security
Utilities repository: a set of useful data science tools to make data scientists more productive
Facilitates collaboration, security policies and versioning
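A standard directory structure with standardized documents is easy to stamp out with a small script. The sketch below is an illustration only: the slide does not list the actual folders or documents, so every folder and file name here is an assumption ("Charter" and "Data Definition" are borrowed from the templates named on the live demos slide), and `scaffold_project` is a hypothetical helper.

```python
from pathlib import Path
from typing import List

# Hypothetical layout: these folder and file names are assumptions.
STANDARD_DIRS = ["docs", "data", "code", "models"]
STANDARD_DOCS = ["docs/Charter.md", "docs/DataDefinition.md"]


def scaffold_project(root: str) -> List[str]:
    """Create the standard folders and empty document stubs under `root`.

    Returns the sorted relative paths that make up the layout.
    """
    base = Path(root)
    for folder in STANDARD_DIRS:
        (base / folder).mkdir(parents=True, exist_ok=True)
    for doc in STANDARD_DOCS:
        path = base / doc
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)  # does not overwrite existing content
    return sorted(STANDARD_DIRS + STANDARD_DOCS)
```

Running the same function for every new project gives each repository an identical starting point, which is the property the slide is after.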

29 Work item tracking and agile planning
Terminology:
Feature: a project
Story: a stage in the end-to-end (E2E) process of a data science (DS) project
Task: a specific coding, documentation, or other activity needed to complete a story
Iteration: usually a 2-week sprint
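The Feature, Story, and Task terminology forms a simple hierarchy. The sketch below models it with plain dataclasses; it does not use any VSTS API, the class and method names are hypothetical, and the completion rule (a story is complete when it has tasks and all of them are done) is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """A specific coding, documentation, or other activity."""
    title: str
    done: bool = False


@dataclass
class Story:
    """One stage in the end-to-end process of a project."""
    title: str
    tasks: List[Task] = field(default_factory=list)

    def is_complete(self) -> bool:
        # Assumption: a story with no tasks is not yet complete.
        return bool(self.tasks) and all(t.done for t in self.tasks)


@dataclass
class Feature:
    """A whole data science project."""
    title: str
    stories: List[Story] = field(default_factory=list)

    def progress(self) -> float:
        """Fraction of stories that are complete (0.0 when there are none)."""
        if not self.stories:
            return 0.0
        completed = sum(1 for s in self.stories if s.is_complete())
        return completed / len(self.stories)
```

A feature with one finished story and one open story reports a progress of one half, which is the kind of figure a sprint review would look at.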

30 Link work items with Git branches
Work items and versioning are usually managed by two independent systems
VSTS allows a work item to be connected with a Git branch, so work done in the branch can be tracked by the work item
This lets us understand what has been done to resolve or close a work item
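VSTS makes this link inside the service itself. As a complementary illustration that does not touch any VSTS API, the sketch below embeds a work item ID in the branch name so the association is visible from Git alone. The `wi/<id>-<slug>` convention and both helper functions are hypothetical, not something the talk prescribes.

```python
import re
from typing import Optional

# Hypothetical convention: "wi/<id>-<slug>", for example "wi/123-explore-data".
_BRANCH_PATTERN = re.compile(r"^wi/(\d+)-[a-z0-9-]+$")


def branch_name(item_id: int, title: str) -> str:
    """Build a branch name that embeds the work item ID."""
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"wi/{item_id}-{slug}"


def parse_work_item_id(branch: str) -> Optional[int]:
    """Extract the work item ID from a branch name, or None if it does not match."""
    match = _BRANCH_PATTERN.match(branch)
    return int(match.group(1)) if match else None
```

With such a convention, a reviewer looking at a branch can find the work item it resolves without opening the tracking tool.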

31 Code review before merging branches to master

32 Productivity Tools
Interactive Data Exploration, Analysis, and Reporting (IDEAR): Shiny-based R utility

33 Tracking progress
Making managers' lives easier too, with Power BI reporting

34 Live demos
Show templates (using Charter and Data Definition as examples)
How to link a work item to a Git branch in VSTS
IDEAR: get insights into the data and generate data summary reports automatically

35 Next Steps
Find all our assets on GitHub:
Sign up for a free VSTS account:
Contribute templates and utilities!

36 Four roles in TDSP
Illustrates how a team adapts the process and executes on projects.

37 Questions?

38 Challenges at Microsoft
Global teams: five geographic locations, different time zones. How to effectively collaborate?
New teams: different backgrounds. How to ensure productivity from day one and consistent work product?
Varied clients: both external and internal. Process has to be flexible, but consistent.

39
Business and Technology Planning
Plan and Prepare Environment
Ingest and explore data, and feature engineering
Train and Retrain Models
Deploy and Consume Models
Post-deployment Activities

40

