Common Anomaly Detection Platform

Presentation transcript:

Common Anomaly Detection Platform
Tony Xing, Senior Product Manager @ Microsoft

Bio
Senior Product Manager of Shared Data team @ Microsoft
  Data quality and anomaly detection
  NRT datasets
  Data ingestion
Senior Product Manager of Skype Data team @ Microsoft
  Real time analytics
  Anomaly detection
  Cross platform SDKs

Agenda
  Context
  Anomaly detection 101
  Problem statement
  Design principles
  How it works
  Algorithms
  Challenges and future work

Context

Shared Data

Shared Data: Our Vision
Shared Data has a vision of one common SDK, a data bus that allows easy sharing of all of OPG & AI&R's data streams in real time, and a set of common data sets in Cosmos (and Spark).
At a high level we plan to have a pluggable architecture, in which we expect many processors and solutions to share a common data backbone. OPG & AI&R teams use multiple tools to manage and use their data. Given that, the Shared Data architecture is designed to align with OPG & AI&R customer needs and with the quickly evolving landscape of third-party and open-source data tools.

Anomaly Detection 101

What is Anomaly Detection
Anomaly detection is the identification of items, events or observations which do not conform to an expected pattern or to other items in a dataset.
Widely used in:
  System health monitoring
  Business metric monitoring
  Application performance monitoring
“My current value is not what it should be as of right now”

Rule setting vs. automated
Automate the process of finding outliers across streams of data with a time dimension.

Problem Statement
  Manual rule setting is impossible for a large number of time series
  A single AD algorithm cannot fit all signal types
  Precision vs. recall
  Analysis and diagnostics when issues happen
  Near real time detection
  Scalable
  Customers need flexibility in plugging in different sources

What is CAP
One-stop shop for metric monitoring, analysis and diagnostics
Key capabilities:
  Automation: full automation from creating rules to detection, without human intervention
  Extensibility: new data sources and anomaly detection algorithms can be plugged in
  Scalability & real time: Azure service with linear scale-out
  Finer granularity: supports time series AD at hour/minute level
  REST APIs: available for all operations, allowing easy integration into other product experiences
  Algorithm tuning: allows easier tuning of the algorithms

How it works – Automation
Onboarding
  Helps data owners register the incoming streams
Creating rules & detecting
  The rule-creation component creates detection rules, which the detecting component then uses to detect potential anomalies
  Contains machine learning and statistical analysis algorithms
Alerting
  Once anomalies are found, the alerting component sends anomaly info to the data owner
© 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

How it works - Extensibility
Defined a generic interface for training and detection
Each algorithm provider implements the defined interface
For example, for each data point we expect the following from algorithm providers:
  Whether it is an anomaly
  The predicted/expected value according to the algorithm
  The suggested lower bound
  The suggested upper bound
  Confidence level
  …
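The per-point contract listed above could be sketched as follows. This is only an illustration: the names `DetectionResult`, `AnomalyDetector`, `train`, `detect` and the toy `MeanBandDetector` provider are hypothetical and do not come from the CAP code base.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class DetectionResult:
    """Per-point output named on the slide (field names are hypothetical)."""
    is_anomaly: bool
    expected_value: float
    lower_bound: float
    upper_bound: float
    confidence: float


class AnomalyDetector(Protocol):
    """Hypothetical provider contract: train on history, then score new points."""

    def train(self, history: Sequence[float]) -> None: ...

    def detect(self, value: float) -> DetectionResult: ...


class MeanBandDetector:
    """Toy provider: flags points outside mean +/- k standard deviations."""

    def __init__(self, k: float = 3.0) -> None:
        self.k = k
        self.mean = 0.0
        self.std = 0.0

    def train(self, history: Sequence[float]) -> None:
        n = len(history)
        self.mean = sum(history) / n
        self.std = (sum((x - self.mean) ** 2 for x in history) / n) ** 0.5

    def detect(self, value: float) -> DetectionResult:
        lower = self.mean - self.k * self.std
        upper = self.mean + self.k * self.std
        outside = value < lower or value > upper
        # The confidence is a placeholder, not a calibrated probability.
        return DetectionResult(outside, self.mean, lower, upper, 1.0 if outside else 0.0)


detector: AnomalyDetector = MeanBandDetector(k=3.0)
detector.train([10.0, 11.0, 9.0, 10.0, 10.0])
result = detector.detect(25.0)
print(result)
```

Because the platform side would depend only on the shared contract, a different algorithm could be swapped in without changing the calling code.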

How it works – Extensibility

How it works - Scalability

Algorithms Intro
Based on initial customer usage, we started with the following algorithms and designed the generic interface around their characteristics.

Algorithm - Service Insider
Good for time series with a periodic pattern
Holt-Winters algorithm: train the model and predict
  Improvements for robustness:
    Use Median Absolute Deviation (MAD) to get a robust estimate
    Handle missing data and noise (e.g., data smoothing)
    Automatically capture the slow, regular trend and the seasonal pattern
GLR (Generalized Likelihood Ratio): used to detect anomalies
  Improvements:
    Floating Threshold GLR, to dynamically adjust the model using new input data
    Outlier removal for noisy data
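A minimal sketch of the forecasting half of this idea is shown below: additive Holt-Winters produces one-step-ahead forecasts, and the residuals are compared against a MAD-based band. It is not the Service Insider implementation. The GLR stage is replaced here by a plain MAD threshold, and the smoothing constants, the threshold of 5 and the synthetic series are all assumptions chosen for illustration.

```python
import math
from statistics import median


def mad_scale(values):
    """Robust spread estimate: 1.4826 * median absolute deviation."""
    centre = median(values)
    return 1.4826 * median(abs(v - centre) for v in values)


def detect(series, period, alpha=0.3, beta=0.05, gamma=0.2, threshold=5.0):
    """Additive Holt-Winters with a MAD band on one-step-ahead residuals."""
    # Initialise level, trend and seasonal terms from the first two seasons.
    first, second = series[:period], series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    season = [v - level for v in first]
    residuals, anomalies = [], []
    for t in range(period, len(series)):
        slot = t % period
        forecast = level + trend + season[slot]
        residual = series[t] - forecast
        is_outlier = False
        if len(residuals) >= 2 * period:  # wait for enough residuals first
            scale = mad_scale(residuals)
            deviation = abs(residual - median(residuals))
            is_outlier = scale > 0 and deviation > threshold * scale
        if is_outlier:
            anomalies.append(t)
            observed = forecast  # outlier removal: keep the spike out of the model
        else:
            residuals.append(residual)
            observed = series[t]
        previous_level = level
        level = alpha * (observed - season[slot]) + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        season[slot] = gamma * (observed - level) + (1 - gamma) * season[slot]
    return anomalies


# Synthetic data: period-4 pattern, slow trend, small wobble, one injected spike.
pattern = [10.0, 12.0, 15.0, 12.0]
series = [pattern[t % 4] + 0.05 * t + 0.3 * math.sin(1.7 * t) for t in range(60)]
series[30] += 20.0
anomalies = detect(series, period=4)
print(anomalies)  # index 30 carries the injected spike
```

Replacing a flagged observation with its forecast before updating is one simple reading of "outlier removal"; it keeps a single spike from distorting the level and seasonal terms for the points that follow.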

Machine Learning, Analytics & Data Science Conference 5/12/2018 2:25 PM
Other Improvements
Automatic detection of time series types (seasonal/non-seasonal)
Automatic detection of seasonality/trend, instead of manual setting
Add feedback channels for end users to intuitively tune the algorithms
The automatic detection uses an integration of FFT (Fast Fourier Transform) and LOESS (locally weighted scatterplot smoothing)
  FFT directly translates the time series into the frequency domain and helps to capture the strongest frequency signal, but the results can contain a number of noisy signals
  LOESS does not require specifying a function to fit a model to all of the data; it is flexible and can help to eliminate some noisy results
Integrate pure unsupervised statistics with semi-supervised feedback tuning
  Traditional statistical approaches (like Holt-Winters and ARIMA) are commonly used; they are unsupervised and need no labeling, but they do not reflect users' intent (too loose/strict)
  When users explicitly adjust some points, their feedback is used as semi-supervised labels to retrain the parameters to better fit users' intent
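The FFT step can be illustrated with a small, dependency-free sketch that picks the strongest non-zero frequency and converts it to a period. The function names and the synthetic signal are assumptions, and the LOESS step that the slide combines with the FFT is not reproduced here.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_period(series):
    """Period (in samples) of the strongest non-zero frequency bin."""
    n = len(series)
    mean = sum(series) / n
    spectrum = fft([v - mean for v in series])  # remove the constant offset first
    # For a real-valued signal only bins 1 .. n/2 carry distinct information.
    best = max(range(1, n // 2 + 1), key=lambda k: abs(spectrum[k]))
    return n / best


# Period-8 seasonality plus a weaker period-3 component acting as noise.
series = [
    10 + 3 * math.sin(2 * math.pi * t / 8) + 0.5 * math.sin(2 * math.pi * t / 3)
    for t in range(64)
]
period = dominant_period(series)
print(period)
```

Comparing the height of the strongest peak with the rest of the spectrum is one possible way to decide between seasonal and non-seasonal series, though the slides do not say how CAP makes that decision.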

Azure ML - Exchangeability Martingale
Good at detecting slow upward/downward trends, spikes and dips, and changes in dynamic range
General framework for online change detection in time series
  Has the property we are interested in changed in distribution?
  User specifies the meaning of “new value strangeness” given history
At each time t we receive a new value
  Add it to the history.
  For each item i in the history, s[i] = strangeness function of (value[i], history)
  Let p[t] = (#{i: s[i] > s[t]} + r*#{i: s[i]==s[t]})/N, where r is uniform in (0,1) and N is the number of items in the history
  Uniform r makes sure p is uniform
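The p-value step above can be sketched in a few lines. The strangeness function (distance from the median of the history) is a hypothetical choice, and the power martingale used to accumulate the p-values is an assumption, since the transcript does not preserve the betting function used in Azure ML.

```python
import math
import random
from statistics import median


def strangeness(value, history):
    """Hypothetical strangeness: distance from the median of the history."""
    return abs(value - median(history))


def p_value(history, rng):
    """p[t] from the slide, computed for the newest item history[-1]."""
    centre = median(history)
    scores = [abs(v - centre) for v in history]  # same as strangeness(v, history)
    s_t = scores[-1]
    greater = sum(1 for s in scores if s > s_t)
    equal = sum(1 for s in scores if s == s_t)  # includes the newest item itself
    r = rng.random()
    return (greater + r * equal) / len(history)


def log_power_martingale(values, epsilon=0.9, seed=0):
    """Running log of an assumed power martingale: M *= epsilon * p**(epsilon - 1)."""
    rng = random.Random(seed)
    history, log_m, trace = [], 0.0, []
    for v in values:
        history.append(v)
        p = max(p_value(history, rng), 1e-12)  # guard against log(0)
        log_m += math.log(epsilon) + (epsilon - 1.0) * math.log(p)
        trace.append(log_m)
    return trace


# 60 stable points followed by 40 points at a clearly higher level.
values = [10.0 + (t % 5) for t in range(60)] + [30.0 + (t % 5) for t in range(60, 100)]
trace = log_power_martingale(values)
print(trace[59], trace[-1])
```

While the data stay exchangeable the p-values are roughly uniform and the log-martingale hovers without a systematic drift; after the level shift the p-values become small and the value grows, which is the signal a threshold on the martingale would pick up.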

Azure ML - Exchangeability Martingale  

Algorithm – Exponential Smoothing  

Result Evaluation of exponential smoothing
In some cases, a periodic signal with a trend can generate many false positives
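One possible reason for such false positives can be shown with simple exponential smoothing, which is assumed here because the transcript does not say which ES variant was evaluated. On a pure linear trend the one-step-ahead forecast always lags, so the residual never returns to zero even though nothing anomalous happens.

```python
def ses_residuals(series, alpha):
    """One-step-ahead residuals of simple exponential smoothing."""
    level = series[0]
    residuals = []
    for value in series[1:]:
        residuals.append(value - level)  # the forecast is the previous level
        level = alpha * value + (1 - alpha) * level
    return residuals


trend_only = [float(t) for t in range(200)]  # y_t = t, no anomalies at all
residuals = ses_residuals(trend_only, alpha=0.5)
print(residuals[-1])  # settles near slope / alpha
```

The residual obeys e[t+1] = (1 - alpha) * e[t] + slope, so it converges to slope / alpha. Any fixed band narrower than that value would flag every point of a healthy trending signal.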

Result Evaluation - ServiceInsider

Result Evaluation – EM

Result Evaluation – ES based

Result Evaluation – ServiceInsider and Azure ML

Challenges and Future Work
  Real time vs. accuracy
  Automated handling of data pattern change
  Easy tuning or usage of different algorithms

Real time vs. Accuracy
Some data streams are not stable in terms of data point latency

Data Pattern Change

Easy Tuning
Tuning the algorithm parameters to achieve the right detection precision and recall is a pain for users
  Service Insider: 2 parameters
  EM based: 7 parameters
  ES based: 3 parameters
Creative UI to hide those details
Do without human tuning at all!
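To make the precision/recall trade-off concrete, the sketch below sweeps a single threshold over made-up anomaly scores and compares the flagged points with labelled ones. The scores, labels and thresholds are invented for illustration, and returning 1.0 for an empty set is a convention chosen here.

```python
def precision_recall(predicted, actual):
    """Precision and recall for sets of anomalous timestamps or indices."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(actual) if actual else 1.0
    return precision, recall


# Hypothetical anomaly scores per data point, and the points users labelled.
scores = {0: 0.2, 1: 3.1, 2: 0.4, 3: 2.2, 4: 5.0, 5: 0.1}
labelled = {1, 4}

results = {}
for threshold in (1.0, 3.0, 4.0):
    flagged = {i for i, s in scores.items() if s > threshold}
    results[threshold] = precision_recall(flagged, labelled)
    print(threshold, results[threshold])
```

A loose threshold catches every labelled anomaly at the cost of extra alerts, while a strict one stays quiet but misses real issues. Each additional algorithm parameter adds another dimension to this search, which is why hiding or removing the tuning is listed as future work.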

Questions!