ADVANCED ANOMALY DETECTION IN CANARY TESTING Presented by Tamas Cser – Functionize, Inc. © All rights reserved
Advanced Deployment Methods
- Multiple versions in production with split traffic
Modern Code Deployment
- 95% of users: Current App Version
- 5% of users: New App Version
- Users become the QA for your app!
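The 95%/5% split can be sketched as a deterministic bucketing function. This is a minimal illustration, not the routing used in the talk: the function name `assign_version`, the `canary_percent` parameter, and the choice of SHA-256 hashing are all assumptions made for the example.

```python
import hashlib

def assign_version(user_id: str, canary_percent: float = 5.0) -> str:
    """Deterministically route a user to the 'new' or 'current' app version."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a bucket in [0, 10000).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    # canary_percent=5.0 sends buckets 0..499 to the new version.
    return "new" if bucket < canary_percent * 100 else "current"

# Hypothetical user ids, only to inspect the resulting split.
assignments = [assign_version(f"user-{i}") for i in range(20_000)]
canary_share = assignments.count("new") / len(assignments)
print(f"share of users on the new version: {canary_share:.3f}")
```

Hashing the user id rather than drawing a random number on each request keeps a given user on the same version across sessions, which matters when whole sessions are later compared.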
Canary Testing Approach
Figure (flow diagram):
Current App Version -> Measure Behavior -> Build User ML Model
New App Version -> Release To User Subset -> Measure Behavior -> Anomaly Detection -> Good For Release?
Yes -> Release To All Users
No -> Revise App
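The "Good For Release?" decision can be sketched as a comparison of anomaly rates between the current version and the user subset on the new version. The rule below is an assumption for illustration: the slide does not say how the decision is made, and `max_increase` is a hypothetical tolerance.

```python
def good_for_release(baseline_flags, canary_flags, max_increase=0.01):
    """Return True when the canary anomaly rate stays within tolerance of baseline.

    Each argument is a list of booleans, one per session, True if the
    session was flagged as anomalous. max_increase is a hypothetical
    tolerance on the difference in anomaly rates.
    """
    if not baseline_flags or not canary_flags:
        raise ValueError("both groups need at least one session")
    baseline_rate = sum(baseline_flags) / len(baseline_flags)
    canary_rate = sum(canary_flags) / len(canary_flags)
    return canary_rate - baseline_rate <= max_increase

def next_step(baseline_flags, canary_flags):
    """Map the gate result onto the two exits of the flow."""
    if good_for_release(baseline_flags, canary_flags):
        return "release to all users"
    return "revise app"

# Made-up flags: 2% anomalous on the current version, 20% on the canary.
baseline = [False] * 98 + [True] * 2
canary = [False] * 80 + [True] * 20
print(next_step(baseline, canary))
```

With small canary populations a fixed tolerance is noisy; a two-proportion significance test would be one reasonable alternative to the plain difference used here.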
User Journeys
Experimental Method - Clustering
- Recorded user sessions on a sample application
- Collected 10,000 sessions and 150,000 actions
- Clustered session data by URL/element interaction
- Identified 5 unique clusters in the data
- Each session is then analyzed to find how far it is from its cluster mean
- Cluster standard deviation data is transformed to be normally distributed, and a confidence interval is created
- Sessions that fall far enough from their cluster mean to lie outside the confidence interval are deemed anomalous
---
Clustering of User Sessions
Calculate Confidence Intervals for Clusters
Algorithms used:
- Expectation Maximization: an iterative method for finding the maximum likelihood estimate of the parameters
- Akaike Information Criterion: given a collection of models, selects the best one relative to the others
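The clustering steps above can be sketched end to end on toy data. This is a simplified illustration under stated assumptions, not the implementation from the talk: each session is reduced to a single hypothetical numeric feature, the mixture is one-dimensional, the normalizing transform is skipped, and the confidence interval is taken as 1.96 standard deviations (roughly 95%). All function names are invented for the example.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def mixture_log_likelihood(xs, weights, means, variances):
    total = 0.0
    for x in xs:
        density = sum(w * normal_pdf(x, m, v)
                      for w, m, v in zip(weights, means, variances))
        total += math.log(max(density, 1e-300))
    return total

def fit_gmm_1d(xs, k, iters=100, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture with Expectation Maximization."""
    n = len(xs)
    ordered = sorted(xs)
    # Deterministic start: spread the initial means across the sorted data.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, var_floor)
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            s = max(sum(p), 1e-300)
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-300)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, var_floor)
    return weights, means, variances

def aic(xs, model):
    """Akaike Information Criterion: 2 * (number of parameters) - 2 * log-likelihood."""
    k = len(model[0])
    num_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    return 2 * num_params - 2 * mixture_log_likelihood(xs, *model)

def select_model(xs, max_k):
    """Fit 1..max_k components and keep the model with the lowest AIC."""
    candidates = [fit_gmm_1d(xs, k) for k in range(1, max_k + 1)]
    return min(candidates, key=lambda model: aic(xs, model))

def is_anomalous(x, model, z_crit=1.96):
    """Flag x when it lies outside z_crit standard deviations of its nearest cluster."""
    _, means, variances = model
    z = min(abs(x - m) / math.sqrt(v) for m, v in zip(means, variances))
    return z > z_crit

# Toy feature values for 12 sessions forming two obvious groups (made-up data).
sessions = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 5.0, 5.2, 4.8, 5.1, 4.9, 5.0]
model = select_model(sessions, max_k=2)
print(len(model[1]), is_anomalous(2.0, model), is_anomalous(1.05, model))
```

Real session data would have many dimensions (one per URL/element interaction), in which case a library mixture model with full covariance matrices and a Mahalanobis-style distance would replace the scalar z-score used here.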
LSTM Model
- Long Short-Term Memory
- Type of Recurrent Neural Network
- Analyzes data sequences/time series
- Predicts the probability of the next user action
Source: Analytics Vidhya
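For reference, the standard LSTM cell update and a softmax output over actions can be written as follows. This is the common textbook formulation, given as an assumption about the general model family; the slide does not specify the exact architecture. Here $x_t$ is the encoded action at step $t$, $h_t$ the hidden state, $c_t$ the cell state, $\sigma$ the logistic sigmoid, and $\odot$ the element-wise product.

```latex
\begin{align}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t) \\
P(a_{t+1} \mid a_1, \dots, a_t) &= \operatorname{softmax}(W_y h_t + b_y)
\end{align}
```

The last line is what "predicts the probability of the next user action" amounts to: one probability per possible action, summing to 1.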
Hybrid Approach
- LSTM model can work together with the EM clustering
- Build an LSTM for each cluster of user sessions
- Since clusters represent similar user behavior, the LSTM can more effectively predict the next user action
- Anomalous sessions are easier to identify
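One way to picture the hybrid approach is to score each session against the next-action model of its own cluster and flag sessions whose actions the model finds improbable. The sketch below is an assumption-laden stand-in: a hand-written probability table replaces each per-cluster LSTM (so it conditions only on the previous action, not the whole prefix), and the cluster name, action names, probabilities, and threshold are all hypothetical.

```python
import math

# Stand-in for one trained model per cluster: P(next action | previous action).
# A real per-cluster LSTM would supply these probabilities instead.
cluster_models = {
    "login": {
        "open_login": {"type_password": 0.9, "click_help": 0.1},
        "type_password": {"submit": 0.95, "click_help": 0.05},
    },
}

def session_surprisal(actions, next_action_probs, floor=1e-6):
    """Mean negative log-probability of each observed action given the previous one."""
    if len(actions) < 2:
        return 0.0
    total = 0.0
    for prev, nxt in zip(actions, actions[1:]):
        # Transitions the model has never seen get a small floor probability.
        p = next_action_probs.get(prev, {}).get(nxt, floor)
        total += -math.log(max(p, floor))
    return total / (len(actions) - 1)

def is_anomalous_session(cluster, actions, threshold=2.0):
    """Flag a session whose average surprisal exceeds a hypothetical threshold."""
    return session_surprisal(actions, cluster_models[cluster]) > threshold

typical = ["open_login", "type_password", "submit"]
odd = ["open_login", "submit", "open_login"]
print(is_anomalous_session("login", typical), is_anomalous_session("login", odd))
```

Because each model only has to describe one cluster's behavior, an unusual transition stands out more sharply than it would against a single model trained on all sessions.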
Process and Results
- Model trained on user data from live sites
- 10,000 user sessions were analyzed
- 80% of the data was used for training and 20% for testing
- Predicted the next user action 85% of the time
- Since there are hundreds of action/element combinations on a page, a uniform random guess would be right less than 1% of the time, so this is quite accurate
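The evaluation procedure (80/20 split, then measuring how often the predicted next action matches the real one) can be sketched as below. A simple frequency-based predictor is swapped in for the LSTM purely to keep the example self-contained; the session data and every function name are made up for illustration, and nothing here reproduces the 85% figure.

```python
import random
from collections import Counter, defaultdict

def split_sessions(sessions, train_fraction=0.8, seed=0):
    """Shuffle sessions reproducibly and split them into train and test sets."""
    shuffled = list(sessions)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def train_bigram(sessions):
    """Count how often each action follows each previous action."""
    counts = defaultdict(Counter)
    for actions in sessions:
        for prev, nxt in zip(actions, actions[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev):
    """Return the most frequent follower of prev, or None if prev was never seen."""
    if prev not in counts:
        return None
    return counts[prev].most_common(1)[0][0]

def top1_accuracy(counts, sessions):
    """Fraction of next actions in the given sessions that are predicted exactly."""
    hits = total = 0
    for actions in sessions:
        for prev, nxt in zip(actions, actions[1:]):
            total += 1
            hits += predict_next(counts, prev) == nxt
    return hits / total if total else 0.0

# Made-up sessions: eight account journeys and two blog visits.
sessions = [["home", "login", "account"] for _ in range(8)]
sessions += [["home", "blog"] for _ in range(2)]
train, test = split_sessions(sessions)
model = train_bigram(train)
print(f"top-1 accuracy on held-out sessions: {top1_accuracy(model, test):.2f}")
```

Splitting by session rather than by individual action keeps every action of a held-out session unseen during training, so the accuracy reflects behavior on genuinely new sessions.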
Split Traffic and Canary Modeling
Conclusion
- Successfully segmented sessions into 5 clusters based on similar behaviors
  Cluster 1: Single event sessions
  Cluster 2: Login interactions
  Cluster 3: Account interactions
  Cluster 4: Blog interactions
  Cluster 5: Subdomain interactions
- 2% of sessions were identified as outliers
- Anomalous user sessions can be easily identified
- LSTM model can accurately predict probabilities of actions in a user session
- This analysis can be used for gating releases and canary testing