Descriptive: What happened?
Diagnostic: Why did it happen?
Predictive: What will happen?
Prescriptive: What should I do?
Enterprise Data Management | IT Professionals | Data Modeling, ETL, Data Warehousing, Data Marts and Cubes
BI Enablement | Information Worker | Self-Service & Exploration with Power BI
Advanced Analytics | Data Scientists | Advanced Analytics from Microsoft and 3rd parties
Microsoft DDSG - Vision, Mission and Services Offerings
Vision | Build a Culture of Data Driven Decision Making
Mission | Provide advanced analytic expertise to influence strategy and help drive efficiency, grow revenue and improve customer satisfaction
Strategic Analytics Consulting Data Science Community Big Data Analytics Big Data Innovation Predictive and Prescriptive Causality Studies Fraud Detection System Dynamics Forecasting Optimization Big Data Insights & Visualization Social & Sentiment Analysis Web Analytics POC & Pilot Enablement Solution Design Architectural Design Consulting Community Development Data Science Training Data Driven Org Strategy MCS & EPG Partnership Industry Showcase Global Field External Client Consulting Simulation Modeling Services
Telecommunications: Fixed Line & Mobile
Financial Services: Banking, Insurance, Real Estate
Health Care: Pharmaceuticals, Biotechnology
Industry/Utility: Aerospace, Utility, Manufacturing
Industry Stats
Windows Telemetry | SEGMENTATION, CYCLE TIME REDUCTION | Build a utilization-based customer segmentation by analyzing the click stream from the Windows Telemetry panel
MS.COM - Targeting | TARGETING SURFACE TABLET, WINDOWS PHONE 8 | Target visitors that showed an interest in Surface, Windows Phone, Xbox on the basis of their MS.com/MS Store behavior
CRM Online | CHURN PREDICTION, PROACTIVE SUBSCRIBER RETENTION | Building a predictive churn model for the CRM Online customers to help with retention
ISRM - Security | SECURITY INTRUSION DETECTION | Enhance ISRM security monitoring and incident response capabilities. Detect potential threats on the Microsoft corporate network.
OEM – Unlicensed Devices | ROI, INSIGHT WINDOWS 8 DEVICES | Analysis of ROI and development of actionable insight for marketing spend in OEM channels, including manufacturers, retailers and distributors
LCA – Cybercrime Unit | PIRACY DETECTION, REVENUE GROWTH OPPORTUNITY | Analyzing current trends in piracy of MS products and building models to identify instances of pirated software
VIDEO “There’s no one country, business or organization that can tackle cybercrime threats alone. That’s why we invest in bringing partners into our center – law enforcement agencies, partners and customers – to work alongside us.” Brad Smith, Microsoft’s general counsel and executive vice president of Legal and Corporate Affairs.
Problem: Cybercrime has cost governments, corporations and the public billions in recent years, but the techniques and level of proof required to solve enterprise cybercrime problems have been extremely challenging in the past. In particular, lost revenue from software piracy impacts an enterprise’s bottom line.
Findings: Microsoft’s teams combined cyber forensics, big data analysis and machine learning techniques to identify diverse piracy mechanics, stop 3 massive operations in different geographies and recoup over $5M in revenue. Applied analytics led to stopping piracy at the source by ending a daily leak of license keys from a factory. As a result, several legal cases were recently brought to court.
Methodology: Technological advances and data science enabled the Microsoft Cybercrime Center, Legal and Corporate Affairs (LCA) and Microsoft IT’s Data & Decision Sciences teams to effectively stop unlicensed activity and piracy, backed by the US Computer Fraud and Abuse Act. Microsoft IT DDSG mined large volumes of license-related data; predictive models built by the data scientists were implemented to score millions of product keys, which LCA used successfully to identify fraudulent behavior.
Problem: Detect suspicious activity on the network servers early and eliminate the threat.
Methodology: File system to store massive security data. Fully automated workflow to drive the end-to-end data receiving and transformation process. Analysis and visualization of Windows Events to identify pre-defined threat scenarios. Move from descriptive analytics to a mature predictive archetype.
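A pre-defined threat scenario over Windows Events can be illustrated with a simple rule. The sketch below is not the team's actual workflow; it assumes one hypothetical scenario (repeated failed logons for the same account within a short window) and a hypothetical event shape of `(timestamp_seconds, event_id, account)`. The threshold and window values are illustrative only. Event ID 4625 is the Windows security event for a failed logon.

```python
from collections import defaultdict, deque

FAILED_LOGON_EVENT_ID = 4625  # Windows security event: an account failed to log on


def flag_repeated_failed_logons(events, threshold=5, window_seconds=60):
    """Return accounts with at least `threshold` failed logons inside any
    sliding window of `window_seconds`.

    `events` is an iterable of (timestamp_seconds, event_id, account) tuples,
    assumed to be sorted by timestamp (hypothetical record shape).
    """
    recent = defaultdict(deque)  # account -> timestamps of recent failures
    flagged = set()
    for timestamp, event_id, account in events:
        if event_id != FAILED_LOGON_EVENT_ID:
            continue
        window = recent[account]
        window.append(timestamp)
        # Drop failures that fell out of the sliding window.
        while timestamp - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            flagged.add(account)
    return flagged
```

A rule like this is descriptive: it only recognizes scenarios someone has already written down, which is why the slide frames predictive analytics as the next step.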
Problem: A business line is experiencing 36% churn annually.
Findings: Under-utilization (low usage) is a key leading indicator. Each 1% reduction of churn results in ~$342K impact.
Methodology: 40% of data is missing or incomplete. Enumerated key leading indicators and drivers of churn and scored every subscription with a probability of churn. Developed a Random Forest model with ~65% accuracy.
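The churn figures lend themselves to a quick back-of-the-envelope calculation. The sketch below assumes that "1% reduction" means one percentage point of annual churn and that the impact scales linearly. Both are assumptions, not statements from the slide. The target churn rate in the usage note is hypothetical.

```python
BASELINE_CHURN_PCT = 36.0        # annual churn reported on the slide
IMPACT_PER_POINT_USD = 342_000   # approximate impact per 1% reduction ("~$342K")


def churn_reduction_impact(target_churn_pct):
    """Estimated dollar impact of lowering annual churn from the baseline to
    `target_churn_pct`, assuming a linear relationship (an assumption)."""
    if not 0 <= target_churn_pct <= BASELINE_CHURN_PCT:
        raise ValueError("target must be between 0 and the baseline churn rate")
    points_saved = BASELINE_CHURN_PCT - target_churn_pct
    return points_saved * IMPACT_PER_POINT_USD
```

For example, under these assumptions `churn_reduction_impact(33.0)` estimates the impact of a three-point reduction at roughly $1.03M.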
Problem: Leverage the history of a person’s behavior on Microsoft.com to identify their interests and predict future actions. Predict which customers are likely to buy Surface or Windows Phone.
Methodology: Big Data Platform – HDP for Windows/Azure HDInsight and Advanced Analytics support. Develop statistical models to determine the probability of users buying a Surface device.
Path analysis
Geography analysis by Microsoft’s PowerMap
5 months of logs from Microsoft.com
Analysis conducted using Power BI, SQL Server, & Hadoop
Understand the Big Picture of your website’s logs
Text Mining on external and internal queries
Recognize your users quickly before their behavior changes
Big Data Clustering models for user segmentation
Big Data Predictive models for user behavior / targeting
Do this for any sub-site, campaign, user segment, etc.
Leverage big data platform for ongoing model refinement
Queries on Microsoft.com were logged during a specific time range. The engineering team wanted to know the popular “topics” in this collection of queries (documents). A text mining tool pre-processed 3 million queries and constructed 25 thematic topics formed by “key words”. The 5 most popular “topics” are listed below. Table columns: Category, Topic Id, Doc cutoff, Terms cutoff, Topic, Num of terms, Num of queries. Only the Category and Topic values appear here.
Internal (i.e. on direct Microsoft pages)
Category | Topic
Multiple | window, +live, windowsmedia, xp, aspx
Multiple | xp, +window, sp3, xp service pack, +download
Multiple | window, +vista, +installer, +mobile, +phone
Multiple | medium, +player, +window, +download, +window
Multiple | office, +microsoft office, microsoft, +mac, +download
External (i.e. referrals from Google, Yahoo, etc.)
Category | Topic
Multiple | window, +phone, +bit, +theme, +install
Multiple | microsoft, +microsoft office, +microsoft word, +microsoft essential, +microsoft outlook
Multiple | window, +phone, +installer, +vista, +server
Multiple | error, +server, +file, +code, sharepoint
Multiple | download, +free, +window, +explorer, microsoft
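The first step of this kind of query analysis can be approximated with plain term counting. The sketch below is a much simpler stand-in for the text mining tool named on the slide: it only counts how often each term occurs across queries and does not build topics. The function name and sample queries are hypothetical.

```python
from collections import Counter


def top_terms(queries, n=5):
    """Return the `n` most frequent whitespace-separated terms across all
    queries, lowercased. This is term frequency only, not topic modeling."""
    counts = Counter()
    for query in queries:
        counts.update(query.lower().split())
    return counts.most_common(n)
```

A real pipeline would typically add stemming, stop-word removal and a topic model on top of counts like these before grouping terms into themes.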
Better customer targeting
Targeting coverage improved by 5% due to predictive models and other measures!
Increased revenue from display ads
Targeted ads generated up to 19% of revenue
Revenue per 1000 impressions grew by over 8X
Revenue per click grew by 6X!
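The two ad metrics quoted here have standard definitions, sketched below. The function names are hypothetical and the numbers in the usage note are made up for illustration; the slide does not give the underlying revenue, impression or click counts.

```python
def revenue_per_thousand_impressions(revenue, impressions):
    """Revenue earned per 1000 ad impressions."""
    return revenue * 1000 / impressions


def revenue_per_click(revenue, clicks):
    """Revenue earned per ad click."""
    return revenue / clicks


def growth_multiple(before, after):
    """How many times larger `after` is than `before` (for example 8.0 for 8X)."""
    return after / before
```

With illustrative figures, a revenue per 1000 impressions that moves from 0.5 to 4.0 gives `growth_multiple(0.5, 4.0)`, an 8X increase.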
Team Experience:
Our Academic Backgrounds: Applied Mathematics, Computer Science, Econometrics, Statistics, Engineering
Our Professional Expertise: Financial Services, Telecommunications, Information Technology, Industrials/Manufacturing, Utilities, Healthcare, Marketing
Domain Experience: Forecasting/Modeling, Demand Forecasting, Predictive Modeling, Demand-Driven Planning, Credit Modeling, Fraud Detection, Consumer Relations, Sentiment Analysis/Social Media, Inventory Optimization, Customer Acquisition/Segmentation, Membership Portfolio Optimization, Click stream Data Analysis, Data Science, Design of experiments, Predictive Maintenance, Machine Learning, Big Data Analytics/Innovation
…a key resource for delivering value to the enterprise and your business
…key resources, engaged collaboration essential for delivering value to the enterprise
Data Scientist: Scientific Method, Domain Knowledge, Intellectual Curiosity & Critical Thinking, Visualization & Communication, Math & Statistics, Advanced Computing & Data Management
Business Problem, Insights for Decision Making
Ethical Considerations: Objectivity, Hypotheses Validation, Transparency
Dialog With Business: Problem Description, Options Considered, Receptive to Conclusions
Customer, Partner, Stakeholder
Data Science is a team sport
Hire complementary skills to build a rounded team!
We need a hybrid Data Science team structure for best results
A centralized team of Data Scientists to share and promote best practices
And Data Scientists in Line of Business groups for domain knowledge
The Data Science team needs to be a peer of the BI team, not inside it
The analytics team should span descriptive, diagnostic, predictive and prescriptive analytics
BI only covers descriptive and diagnostic
A Data Scientist in a BI team may be under-utilized
Problem: We needed a behavioral customer segmentation for Windows and Office. Very large volumes of telemetry data are collected: over 1.7 billion mouse clicks and 2.4 billion keystrokes.
Findings: Successfully developed 7 user behavioral segments. Prioritize investments around activities people do most.
Methodology: How can we effectively mine and extract meaning from the data? Used clustering techniques to segment data that included hardware, app usage, user data and URLs visited.
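The slide does not say which clustering technique was used. As one common option, the sketch below implements plain k-means on small numeric feature vectors. The choice of k-means, the function name and the toy feature values are all assumptions for illustration; real telemetry would need feature engineering and scaling first.

```python
import random
from math import dist


def kmeans(points, k, iterations=50, seed=0):
    """Cluster `points` (a list of equal-length numeric tuples) into `k` groups.

    Returns (assignments, centroids), where assignments[i] is the cluster
    index of points[i]. Initial centroids are k points sampled at random.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: dist(point, centroids[c]))
            for point in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignments, centroids
```

With the 7 segments reported on the slide, `k` would be 7; in practice the number of clusters is usually chosen by comparing several values of `k` against business interpretability.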